User-agent: *
Disallow: /secret/
Disallow: /error404.php
Disallow: /product-faxform.php5
Disallow: /checkout/
Disallow: /account/
Disallow: /productsheets/
Disallow: /supplierdata/
Disallow: /senddetails.php
Disallow: /technicalrequest.php
Disallow: /validationrequest.php
Disallow: /quoterequest.php
Disallow: /bulkrequest.php
Disallow: /publications/
Disallow: /ajax/
Disallow: /antigen/
Disallow: /research-area/
Disallow: /clone/
Disallow: /host/
Disallow: /application/
Disallow: /reactivity/
Disallow: /special/
Disallow: /abstracts-index/

### bots we know
User-agent: googlebot
User-agent: googlebot-image
User-agent: googlebot-news
User-agent: googlebot-video
User-agent: googlebot-mobile
User-agent: Mediapartners-Google
User-agent: Mediapartners
User-agent: AdsBot-Google
User-agent: AdsBot-Google-Mobile
User-agent: yahoo-slurp
User-agent: yahoo-mmcrawler
User-agent: bingbot
User-agent: msnbot
User-agent: yandex
User-agent: baiduspider
Disallow: /secret/
Disallow: /error404.php
Disallow: /product-faxform.php5
Disallow: /checkout/
Disallow: /account/
Disallow: /productsheets/
Disallow: /supplierdata/
Disallow: /senddetails.php
Disallow: /technicalrequest.php
Disallow: /publications/
Disallow: /ajax/
Disallow: /antigen/
Disallow: /research-area/
Disallow: /clone/
Disallow: /host/
Disallow: /application/
Disallow: /reactivity/
Disallow: /special/
Disallow: /abstracts-index/

### www.integromedb.org/Crawler
User-agent: www.integromedb.org/Crawler
Disallow: /secret/
Disallow: /error404.php
Disallow: /product-faxform.php5
Disallow: /checkout/
Disallow: /account/
Disallow: /productsheets/
Disallow: /supplierdata/
Disallow: /senddetails.php
Disallow: /technicalrequest.php
Disallow: /publications/
Disallow: /ajax/
Disallow: /antigen/
Disallow: /research-area/
Disallow: /clone/
Disallow: /host/
Disallow: /application/
Disallow: /reactivity/
Disallow: /special/
Disallow: /abstracts-index/
Crawl-delay: 5

### 008 - http://www.80legs.com/webcrawler.html
User-agent: 008
Disallow: /secret/
Disallow: /error404.php
Disallow: /product-faxform.php5
Disallow: /checkout/
Disallow: /account/
Disallow: /productsheets/
Disallow: /supplierdata/
Disallow: /senddetails.php
Disallow: /technicalrequest.php
Disallow: /publications/
Disallow: /ajax/
Disallow: /antigen/
Disallow: /research-area/
Disallow: /clone/
Disallow: /host/
Disallow: /application/
Disallow: /reactivity/
Disallow: /special/
Disallow: /abstracts-index/
Crawl-delay: 30

### Huasai/1.0, too fast
User-agent: Huasai
Disallow: /secret/
Disallow: /error404.php
Disallow: /product-faxform.php5
Disallow: /checkout/
Disallow: /account/
Disallow: /productsheets/
Disallow: /supplierdata/
Disallow: /senddetails.php
Disallow: /technicalrequest.php
Disallow: /publications/
Disallow: /ajax/
Disallow: /antigen/
Disallow: /research-area/
Disallow: /clone/
Disallow: /host/
Disallow: /application/
Disallow: /reactivity/
Disallow: /special/
Disallow: /abstracts-index/
Crawl-delay: 10

### LEXI @ 2008-09-03: Allowed faster crawling for any agent below

### sogou.com spider, much too fast
User-agent: Sogou web spider
Disallow: /secret/
Disallow: /error404.php
Disallow: /product-faxform.php5
Disallow: /checkout/
Disallow: /account/
Disallow: /productsheets/
Disallow: /supplierdata/
Disallow: /senddetails.php
Disallow: /technicalrequest.php
Disallow: /publications/
Disallow: /ajax/
Disallow: /antigen/
Disallow: /research-area/
Disallow: /clone/
Disallow: /host/
Disallow: /application/
Disallow: /reactivity/
Disallow: /special/
Disallow: /abstracts-index/
Crawl-delay: 20

### YoudaoBot
User-agent: YoudaoBot
Disallow: /secret/
Disallow: /error404.php
Disallow: /product-faxform.php5
Disallow: /checkout/
Disallow: /account/
Disallow: /productsheets/
Disallow: /supplierdata/
Disallow: /senddetails.php
Disallow: /technicalrequest.php
Disallow: /publications/
Disallow: /ajax/
Disallow: /antigen/
Disallow: /research-area/
Disallow: /clone/
Disallow: /host/
Disallow: /application/
Disallow: /reactivity/
Disallow: /special/
Disallow: /abstracts-index/
Crawl-delay: 20

### Labhoo.com spider - hit the bot trap though they say they respect robots.txt
User-agent: Labhoo
Disallow: /secret/
Disallow: /error404.php
Disallow: /product-faxform.php5
Disallow: /checkout/
Disallow: /account/
Disallow: /productsheets/
Disallow: /supplierdata/
Disallow: /senddetails.php
Disallow: /technicalrequest.php
Disallow: /publications/
Disallow: /ajax/
Disallow: /antigen/
Disallow: /research-area/
Disallow: /clone/
Disallow: /host/
Disallow: /application/
Disallow: /reactivity/
Disallow: /special/
Disallow: /abstracts-index/
Crawl-delay: 10

### Slow-down Ask Jeeves/Teoma to one call every minute
User-agent: Jeeves/Teoma
Disallow: /secret/
Disallow: /error404.php
Disallow: /product-faxform.php5
Disallow: /checkout/
Disallow: /account/
Disallow: /productsheets/
Disallow: /supplierdata/
Disallow: /senddetails.php
Disallow: /technicalrequest.php
Disallow: /publications/
Disallow: /ajax/
Disallow: /antigen/
Disallow: /research-area/
Disallow: /clone/
Disallow: /host/
Disallow: /application/
Disallow: /reactivity/
Disallow: /special/
Disallow: /abstracts-index/
Crawl-delay: 10

### Slow-down Exabot to one call every minute
### Never visited recently - just to make sure
User-agent: Exabot
Disallow: /secret/
Disallow: /error404.php
Disallow: /product-faxform.php5
Disallow: /checkout/
Disallow: /account/
Disallow: /productsheets/
Disallow: /supplierdata/
Disallow: /senddetails.php
Disallow: /technicalrequest.php
Disallow: /publications/
Disallow: /ajax/
Disallow: /antigen/
Disallow: /research-area/
Disallow: /clone/
Disallow: /host/
Disallow: /application/
Disallow: /reactivity/
Disallow: /special/
Disallow: /abstracts-index/
Crawl-delay: 10

### Netluchs.de Crawler - too fast - one call every minute
User-agent: Netluchs/Nutch-0.9-dev
Disallow: /secret/
Disallow: /error404.php
Disallow: /product-faxform.php5
Disallow: /checkout/
Disallow: /account/
Disallow: /productsheets/
Disallow: /supplierdata/
Disallow: /senddetails.php
Disallow: /technicalrequest.php
Disallow: /publications/
Disallow: /ajax/
Disallow: /antigen/
Disallow: /research-area/
Disallow: /clone/
Disallow: /host/
Disallow: /application/
Disallow: /reactivity/
Disallow: /special/
Disallow: /abstracts-index/
Crawl-delay: 10

### Slow-down Seokicks to one call every minute
### 2012-10-31: ~1000 visits a day
User-agent: SEOkicks
Disallow: /secret/
Disallow: /error404.php
Disallow: /product-faxform.php5
Disallow: /checkout/
Disallow: /account/
Disallow: /productsheets/
Disallow: /supplierdata/
Disallow: /senddetails.php
Disallow: /technicalrequest.php
Disallow: /publications/
Disallow: /ajax/
Disallow: /antigen/
Disallow: /research-area/
Disallow: /clone/
Disallow: /host/
Disallow: /application/
Disallow: /reactivity/
Disallow: /special/
Disallow: /abstracts-index/
Crawl-delay: 60

### Try to deny
User-agent: InetURL
Disallow: /

### Potential email collectors
User-agent: email
Disallow: /

### A couple of Crawlers we have noticed
#User-agent: West Wind Internet Protocols 4.55
# Requests homepage ~ once an hour
#User-agent: Java/1.5.0_11
# Java/1.4.1_04
# Java/1.4.2_04
# 1.5 Appears to have crawled the entire alpha nav and then left
#User-agent: libwww-perl/5.65
#User-agent: Google-Sitemaps/1.0
# Google Sitemap / Webmaster Tools verification
#User-agent: Snoopy v1.2
# Now 403'd through .htaccess
# A PHP class that emulates a web browser
# http://sourceforge.net/projects/snoopy/
#User-agent: Feedfetcher-Google
# http://www.google.com/feedfetcher.html; 1 subscribers; feed-id=2589828680658507079
#User-agent: AdsBot-Google
# http://www.google.com/adsbot.html
#User-agent: w00tw00t.at.ISC.SANS.DFind
# Evil
#User-agent: Netluchs/Nutch-0.9-dev
# Way too fast
#User-agent: Microsoft URL Control - 6.00.8862
#User-agent: Ask Jeeves/Teoma
#User-agent: MagpieRSS/0.72
# OpenSource RSS Client
#User-agent: ia_archiver
# The Internet Archive / Wayback Machine
#User-agent: Yahoo-MMCrawler/3.x
#User-agent: Snapbot/1.0
# Reads robots.txt
#User-agent: Semager/1.0
#User-agent: SeznamBot/1.0
#User-agent: PHP version tracker (http://www.nexen.net/phpversion/bot.php)
#User-agent: NASA Search 1.0
#User-agent: sogou spider
# Respects robots.txt
#User-agent: VadixBot
# Reads robots.txt
#User-agent: Seekbot/1.0 (http://www.seekbot.net/bot.html) RobotsTxtFetcher/1.2
#User-agent: ApacheBench/2.0.41-dev

Sitemap: http://www.antikoerper-online.de/sitemaps/1/mapindex.xml