| Commit message | Author | Age | Files | Lines |
* | also pass mimetypes to scheme modules to only fetch conte... | Arthur de Jong | 2005-08-12 | 1 | -6/+18 |
* | put compiled regular expression on module level so that i... | Arthur de Jong | 2005-08-12 | 1 | -2/+4 |
* | make parsing handle errors a little more gracefully, than... | Arthur de Jong | 2005-08-01 | 1 | -3/+6 |
* | also catch AttributeError for problem in HTMLParser not f... | Arthur de Jong | 2005-07-31 | 1 | -1/+1 |
* | replace numeric entity refs with their proper values base... | Arthur de Jong | 2005-07-31 | 1 | -2/+11 |
* | put new html parser in place | Arthur de Jong | 2005-07-31 | 1 | -88/+113 |
* | remove references to email addresses where they are not u... | Arthur de Jong | 2005-07-29 | 3 | -5/+5 |
* | empty module as place holder to parse CSS (referenced fro... | Arthur de Jong | 2005-07-25 | 1 | -0/+20 |
* | don't replace an already set title | Arthur de Jong | 2005-07-25 | 1 | -1/+2 |
* | Mike Meyer -> Mike W. Meyer | Arthur de Jong | 2005-07-23 | 1 | -1/+1 |
* | almost complete rewrite of crawling and site state code m... | Arthur de Jong | 2005-07-22 | 2 | -20/+65 |
* | move htmlparse to a more generic parsers package, cleanin... | Arthur de Jong | 2005-07-09 | 2 | -0/+128 |