O'Reilly Answers is a community site for sharing knowledge, asking questions, and providing answers that brings together our customers, authors, editors, conference speakers, and Foo (Friends of O'Reilly).
If you're creating a search engine, you'll need a way to collect documents. In this excerpt from Toby Segaran's Programming Collective Intelligence, the author shows you how to set up a simple web crawl...