https://www.bananas-playground.net/projekt/aranea

A small web crawler named aranea (Latin for spider).
The aim is to gather unique domains to show what is out there.

It starts with a given set of URLs and parses them for more
URLs. It stores the new URLs and fetches them too.
-> fetch.pl
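
The fetch step can be pictured roughly as below. This is a minimal
sketch in Python for illustration only, not the code in fetch.pl
(which is Perl). The names fetch_pending and http_fetcher, the
in-memory dict standing in for the fetch table, and the 10 second
timeout are all assumptions.

```python
import urllib.request
from collections import deque
from typing import Callable, Dict, Iterable


def http_fetcher(url: str) -> str:
    """Download one URL and return its body as text (assumed 10 s timeout)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_pending(seeds: Iterable[str],
                  fetcher: Callable[[str], str] = http_fetcher) -> Dict[str, str]:
    """Fetch every queued URL once and store the raw result keyed by URL.

    The dict is a hypothetical stand-in for the fetch table.
    """
    queue = deque(seeds)
    stored: Dict[str, str] = {}
    while queue:
        url = queue.popleft()
        if url in stored:
            continue  # already fetched in this run
        try:
            stored[url] = fetcher(url)
        except OSError:
            # urllib's URLError and socket timeouts are OSError subclasses;
            # an unreachable URL is skipped instead of stopping the run.
            continue
    return stored
```

Passing the fetcher in as a parameter keeps the loop easy to try out
without touching the network.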

Each URL result (the stored result of the call) is parsed
for other URLs to follow.
-> parse.pl
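
One way to pull further URLs out of a stored result is sketched
below, again in Python for illustration rather than as a copy of
parse.pl. It assumes the stored result is HTML and that only http
and https links in <a href="..."> attributes are of interest. The
names LinkCollector and extract_links are hypothetical.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute http(s) links from the <a href> attributes of a page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they came from.
                absolute = urljoin(self.base_url, value)
                if urlparse(absolute).scheme in ("http", "https"):
                    self.links.append(absolute)


def extract_links(base_url: str, html: str) -> List[str]:
    """Return the links found in one stored page, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

The returned list would then be queued for the next fetch run.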

After a run, cleanup gathers all the unique domains into
a table. It also removes URLs from the fetch table of which
there are already enough.
-> cleanup.pl
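
Gathering the unique domains could look like the following sketch.
It is written in Python for illustration and is not taken from
cleanup.pl. The function name unique_domains is hypothetical, and a
plain sorted list stands in for the domain table. The rule for
pruning the fetch table is not spelled out here, so the sketch
leaves it out.

```python
from typing import Iterable, List
from urllib.parse import urlparse


def unique_domains(urls: Iterable[str]) -> List[str]:
    """Reduce a list of URLs to the sorted set of host names they point at."""
    domains = set()
    for url in urls:
        # .hostname is lowercased and has no port; it is None for
        # strings that contain no network location.
        host = urlparse(url).hostname
        if host:
            domains.add(host)
    return sorted(domains)
```

Comparing lowercased host names means that URLs differing only in
case, port, or path count as the same domain.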