| | Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|---|
* | make a _urlclean() function to always store a proper URL ... | Arthur de Jong | 2005-07-30 | 1 | -2/+12 | |
* | import time as we need it for sleep | Arthur de Jong | 2005-07-29 | 1 | -0/+1 | |
* | do an extra breadth first traversal of the site to combin... | Arthur de Jong | 2005-07-29 | 1 | -5/+61 | |
* | remove references to email addresses where they are not u... | Arthur de Jong | 2005-07-29 | 1 | -3/+3 | |
* | turn tocheck list into fifo queue | Arthur de Jong | 2005-07-27 | 1 | -1/+1 | |
* | only add links to crawl list if they are not in there all... | Arthur de Jong | 2005-07-24 | 1 | -2/+2 | |
* | fix regular expression matching | Arthur de Jong | 2005-07-23 | 1 | -2/+3 | |
* | Mike Meyer -> Mike W. Meyer | Arthur de Jong | 2005-07-23 | 1 | -1/+1 | |
* | add support for sleep between requests | Arthur de Jong | 2005-07-22 | 1 | -0/+4 | |
* | almost complete rewrite of crawling and site state code m... | Arthur de Jong | 2005-07-22 | 1 | -0/+330 | |