robot-id: abcdatos
robot-name: ABCdatos BotLink
robot-cover-url: http://www.abcdatos.com/
robot-details-url: http://www.abcdatos.com/botlink/
robot-owner-name: ABCdatos
robot-owner-url: http://www.abcdatos.com/
robot-owner-email: botlink@abcdatos.com
robot-status: active
robot-purpose: maintenance
robot-type: standalone
robot-platform: windows
robot-availability: none
robot-exclusion: no
robot-exclusion-useragent: BotLink
robot-noindex: no
robot-host: 217.126.39.167
robot-from: no
robot-useragent: ABCdatos BotLink/1.0.2 (test links)
robot-language: basic
robot-description: This robot verifies the availability of the ABCdatos directory entries (http://www.abcdatos.com) using HTTP HEAD requests. It runs twice a week. On an HTTP 5xx error response or a failed connection, it repeats the check some hours later to verify whether the failure was temporary.
robot-history: This robot was developed by the ABCdatos team to help with directory maintenance.
robot-environment: commercial
modified-date: Thu, 29 May 2003 01:00:00 GMT
modified-by: ABCdatos

robot-id: Acme.Spider
robot-name: Acme.Spider
robot-cover-url: http://www.acme.com/java/software/Acme.Spider.html
robot-details-url: http://www.acme.com/java/software/Acme.Spider.html
robot-owner-name: Jef Poskanzer - ACME Laboratories
robot-owner-url: http://www.acme.com/
robot-owner-email: jef@acme.com
robot-status: active
robot-purpose: indexing maintenance statistics
robot-type: standalone
robot-platform: java
robot-availability: source
robot-exclusion: yes
robot-exclusion-useragent: Due to a deficiency in Java it's not currently possible to set the User-Agent.
robot-noindex: no
robot-host: *
robot-from: no
robot-useragent: Due to a deficiency in Java it's not currently possible to set the User-Agent.
robot-language: java
robot-description: A Java utility class for writing your own robots.
robot-history:
robot-environment:
modified-date: Wed, 04 Dec 1996 21:30:11 GMT
modified-by: Jef Poskanzer

robot-id: ahoythehomepagefinder
robot-name: Ahoy! The Homepage Finder
robot-cover-url: http://www.cs.washington.edu/research/ahoy/
robot-details-url: http://www.cs.washington.edu/research/ahoy/doc/home.html
robot-owner-name: Marc Langheinrich
robot-owner-url: http://www.cs.washington.edu/homes/marclang
robot-owner-email: marclang@cs.washington.edu
robot-status: active
robot-purpose: maintenance
robot-type: standalone
robot-platform: UNIX
robot-availability: none
robot-exclusion: yes
robot-exclusion-useragent: ahoy
robot-noindex: no
robot-host: cs.washington.edu
robot-from: no
robot-useragent: 'Ahoy! The Homepage Finder'
robot-language: Perl 5
robot-description: Ahoy! is an ongoing research project at the University of Washington for finding personal Homepages.
robot-history: Research project at the University of Washington in 1995/1996
robot-environment: research
modified-date: Fri June 28 14:00:00 1996
modified-by: Marc Langheinrich

robot-id: Alkaline
robot-name: Alkaline
robot-cover-url: http://www.vestris.com/alkaline
robot-details-url: http://www.vestris.com/alkaline
robot-owner-name: Daniel Doubrovkine
robot-owner-url: http://cuiwww.unige.ch/~doubrov5
robot-owner-email: dblock@vestris.com
robot-status: development active
robot-purpose: indexing
robot-type: standalone
robot-platform: unix windows95 windowsNT
robot-availability: binary
robot-exclusion: yes
robot-exclusion-useragent: AlkalineBOT
robot-noindex: yes
robot-host: *
robot-from: no
robot-useragent: AlkalineBOT
robot-language: c++
robot-description: Unix/NT internet/intranet search engine
robot-history: Vestris Inc. search engine designed at the University of Geneva
robot-environment: commercial research
modified-date: Thu Dec 10 14:01:13 MET 1998
modified-by: Daniel Doubrovkine

robot-id:anthill
robot-name:Anthill
robot-cover-url:http://www.anthill.org/index.html
robot-details-url:http://www.anthill.org/index.html
robot-owner-name:Torsten Kaubisch
robot-owner-url:http://www.anthill.org/index.html
robot-owner-email:info@anthill.org
robot-status:development
robot-purpose:indexing
robot-type:standalone
robot-platform:independent
robot-availability:not yet
robot-exclusion:no (soon in V1.2)
robot-exclusion-useragent:anthill
robot-noindex:no
robot-host:anywhere
robot-from:no
robot-useragent:AnthillV1.1
robot-language:java
robot-description:Anthill is used to gather price information automatically from online stores. Support for international versions.
robot-history:This is a research project at the University of Mannheim in Germany, professorship Prof. Martin Schader, assistant Dr. Stefan Kuhlins
robot-environment:research
modified-date:Thu, 6 Dec 2001 01:55:00 GMT
modified-by:Torsten Kaubisch

robot-id: appie
robot-name: Walhello appie
robot-cover-url: www.walhello.com
robot-details-url: www.walhello.com/aboutgl.html
robot-owner-name: Aimo Pieterse
robot-owner-url: www.walhello.com
robot-owner-email: aimo@walhello.com
robot-status: active
robot-purpose: indexing
robot-type: standalone
robot-platform: windows98
robot-availability: none
robot-exclusion: yes
robot-exclusion-useragent: appie
robot-noindex: yes
robot-host: 213.10.10.116, 213.10.10.117, 213.10.10.118
robot-from: yes
robot-useragent: appie/1.1
robot-language: Visual C++
robot-description: The appie-spider is used to collect and index web pages for the Walhello search engine
robot-history: The spider was built in March/April 2000
robot-environment: commercial
modified-date: Thu, 20 Jul 2000 22:38:00 GMT
modified-by: Aimo Pieterse

robot-id: arachnophilia
robot-name: Arachnophilia
robot-cover-url:
robot-details-url:
robot-owner-name: Vince Taluskie
robot-owner-url: http://www.ph.utexas.edu/people/vince.html
robot-owner-email: taluskie@utpapa.ph.utexas.edu
robot-status:
robot-purpose:
robot-type:
robot-platform:
robot-availability:
robot-exclusion: yes
robot-exclusion-useragent:
robot-noindex: no
robot-host: halsoft.com
robot-from:
robot-useragent: Arachnophilia
robot-language:
robot-description: The purpose (undertaken by HaL Software) of this run was to collect approximately 10k html documents for testing automatic abstract generation
robot-history:
robot-environment:
modified-date:
modified-by:

robot-id: arale
robot-name: Arale
robot-cover-url: http://web.tiscali.it/_flat
robot-details-url: http://web.tiscali.it/_flat
robot-owner-name: Flavio Tordini
robot-owner-url: http://web.tiscali.it/_flat
robot-owner-email: flaviotordini@tiscali.it
robot-status: active
robot-purpose: maintenance
robot-type: standalone
robot-platform: unix, windows, windows95, windowsNT, os2, mac, linux
robot-availability: source, binary
robot-exclusion: no
robot-exclusion-useragent: arale
robot-noindex: no
robot-host: *
robot-from: no
robot-useragent: no
robot-language: java
robot-description: A java multithreaded web spider. Download entire web sites or specific resources from the web. Render dynamic sites to static pages.
robot-history: This is brand new.
robot-environment: hobby
modified-date: Thu, 09 Jan 2001 17:28:52 GMT
modified-by: Flavio Tordini

robot-id: araneo
robot-name: Araneo
robot-cover-url: http://esperantisto.net
robot-details-url: http://esperantisto.net/araneo/
robot-owner-name: Arto Sarle
robot-owner-url: http://esperantisto.net
robot-owner-email: araneo@esperantisto.net
robot-status: development
robot-purpose: indexing, statistics
robot-type: standalone
robot-platform: Linux
robot-availability: none
robot-exclusion: yes
robot-exclusion-useragent: araneo
robot-noindex: yes
robot-nofollow: yes
robot-host: *.esperantisto.net
robot-from: yes
robot-useragent: Araneo/0.7 (araneo@esperantisto.net; http://esperantisto.net)
robot-language: Python, Java
robot-description: Araneo is a web robot developed for crawling and indexing web pages written in the international language Esperanto. The database will be used to build a web search engine and auxiliary services to be published at esperantisto.net.
robot-history: (The name Araneo means "spider" in Esperanto.)
robot-environment: hobby, research
modified-date: Fri, 16 Nov 2001 08:30:00 GMT
modified-by: Arto Sarle

robot-id: araybot
robot-name: AraybOt
robot-cover-url: http://www.araykoo.com/
robot-details-url: http://www.araykoo.com/araybot.html
robot-owner-name: Guti
robot-owner-url: http://www.araykoo.com/
robot-owner-email: robot@araykoo.com
robot-status: active
robot-purpose: indexing maintenance
robot-type: standalone
robot-platform: Linux
robot-availability: none
robot-exclusion: yes
robot-exclusion-useragent: AraybOt
robot-noindex: yes
robot-host: *
robot-from: no
robot-useragent: AraybOt/1.0 (+http://www.araykoo.com/araybot.html)
robot-language: perl5
robot-description: AraybOt is the agent software of AraykOO!, which crawls web sites listed in http://dmoz.org/Adult/ in order to build an adult search engine.
robot-history:
robot-environment: service
modified-date: Sat, 19 Jun 2004 20:25:00 GMT+1
modified-by: Guti

robot-id: architext
robot-name: ArchitextSpider
robot-cover-url: http://www.excite.com/
robot-details-url:
robot-owner-name: Architext Software
robot-owner-url: http://www.atext.com/spider.html
robot-owner-email: spider@atext.com
robot-status:
robot-purpose: indexing, statistics
robot-type: standalone
robot-platform:
robot-availability:
robot-exclusion: yes
robot-exclusion-useragent:
robot-noindex: no
robot-host: *.atext.com
robot-from: yes
robot-useragent: ArchitextSpider
robot-language: perl 5 and c
robot-description: Its purpose is to generate a Resource Discovery database, and to generate statistics. The ArchitextSpider collects information for the Excite and WebCrawler search engines.
robot-history:
robot-environment:
modified-date: Tue Oct 3 01:10:26 1995
modified-by:

robot-id: aretha
robot-name: Aretha
robot-cover-url:
robot-details-url:
robot-owner-name: Dave Weiner
robot-owner-url: http://www.hotwired.com/Staff/userland/
robot-owner-email: davew@well.com
robot-status:
robot-purpose:
robot-type:
robot-platform: Macintosh
robot-availability:
robot-exclusion:
robot-exclusion-useragent:
robot-noindex:
robot-host:
robot-from:
robot-useragent:
robot-language:
robot-description: A crude robot built on top of Netscape and Userland Frontier, a scripting system for Macs
robot-history:
robot-environment:
modified-date:
modified-by:

robot-id: ariadne
robot-name: ARIADNE
robot-cover-url: (forthcoming)
robot-details-url: (forthcoming)
robot-owner-name: Mr. Matthias H. Gross
robot-owner-url: http://www.lrz-muenchen.de/~gross/
robot-owner-email: Gross@dbs.informatik.uni-muenchen.de
robot-status: development
robot-purpose: statistics, development of focused crawling strategies
robot-type: standalone
robot-platform: java
robot-availability: none
robot-exclusion: yes
robot-exclusion-useragent: ariadne
robot-noindex: no
robot-host: dbs.informatik.uni-muenchen.de
robot-from: no
robot-useragent: Due to a deficiency in Java it's not currently possible to set the User-Agent.
robot-language: java
robot-description: The ARIADNE robot is a prototype of an environment for testing focused crawling strategies.
robot-history: This robot is part of a research project at the University of Munich (LMU), started in 2000.
robot-environment: research
modified-date: Mo, 13 Mar 2000 14:00:00 GMT
modified-by: Mr. Matthias H. Gross

robot-id:arks
robot-name:arks
robot-cover-url:http://www.dpsindia.com
robot-details-url:http://www.dpsindia.com
robot-owner-name:Aniruddha Choudhury
robot-owner-url:
robot-owner-email:aniruddha.c@usa.net
robot-status:development
robot-purpose:indexing
robot-type:standalone
robot-platform:PLATFORM INDEPENDENT
robot-availability:data
robot-exclusion:yes
robot-exclusion-useragent:arks
robot-noindex:no
robot-host:dpsindia.com
robot-from:no
robot-useragent:arks/1.0
robot-language:Java 1.2
robot-description:The Arks robot is used to build the database for the dpsindia/lawvistas.com search service. The robot runs weekly, and visits sites in a random order
robot-history:finds its root from s/w development project for a portal
robot-environment:commercial
modified-date:6 th November 2000
modified-by:Aniruddha Choudhury

robot-id: aspider
robot-name: ASpider (Associative Spider)
robot-cover-url:
robot-details-url:
robot-owner-name: Fred Johansen
robot-owner-url: http://www.pvv.ntnu.no/~fredj/
robot-owner-email: fredj@pvv.ntnu.no
robot-status: retired
robot-purpose: indexing
robot-type:
robot-platform: unix
robot-availability:
robot-exclusion:
robot-exclusion-useragent:
robot-noindex: no
robot-host: nova.pvv.unit.no
robot-from: yes
robot-useragent: ASpider/0.09
robot-language: perl4
robot-description: ASpider is a CGI script that searches the web for keywords given by the user through a form.
robot-history:
robot-environment: hobby
modified-date:
modified-by:

robot-id: atn.txt
robot-name: ATN Worldwide
robot-details-url:
robot-cover-url:
robot-owner-name: All That Net
robot-owner-url: http://www.allthatnet.com
robot-owner-email: info@allthatnet.com
robot-status: active
robot-purpose: indexing
robot-type:
robot-platform:
robot-availability:
robot-exclusion: yes
robot-exclusion-useragent: ATN_Worldwide
robot-noindex:
robot-nofollow:
robot-host: www.allthatnet.com
robot-from:
robot-useragent: ATN_Worldwide
robot-language:
robot-description: The ATN robot is used to build the database for the AllThatNet search service operated by All That Net. The robot runs weekly, and visits sites in a random order.
robot-history:
robot-environment:
modified-date: July 09, 2000 17:43 GMT

robot-id: atomz
robot-name: Atomz.com Search Robot
robot-cover-url: http://www.atomz.com/help/
robot-details-url: http://www.atomz.com/
robot-owner-name: Mike Thompson
robot-owner-url: http://www.atomz.com/
robot-owner-email: mike@atomz.com
robot-status: active
robot-purpose: indexing
robot-type: standalone
robot-platform: unix
robot-availability: service
robot-exclusion: yes
robot-exclusion-useragent: Atomz
robot-noindex: yes
robot-host: www.atomz.com
robot-from: no
robot-useragent: Atomz/1.0
robot-language: c
robot-description: Robot used for web site search service.
robot-history: Developed for Atomz.com, launched in 1999.
robot-environment: service
modified-date: Tue Jul 13 03:50:06 GMT 1999
modified-by: Mike Thompson

robot-id: auresys