Arthur de Jong

Open Source / Free Software developer

path: root/parsers
Commit message | Author | Age | Files | Lines
* also feed style tag content to the CSS parser to parse in... | Arthur de Jong | 2005-08-20 | 1 | -0/+7
* remove some debugging functions from CSS parser | Arthur de Jong | 2005-08-20 | 1 | -3/+0
* first attempt at a very simple CSS parser that just summa... | Arthur de Jong | 2005-08-20 | 1 | -1/+28
* add checking of unescaped spaces to the html parser, incl... | Arthur de Jong | 2005-08-20 | 1 | -25/+41
* split problems into page problems (parsing errors, wrong ... | Arthur de Jong | 2005-08-19 | 1 | -1/+1
* also pass mimetypes to scheme modules to only fetch conte... | Arthur de Jong | 2005-08-12 | 1 | -6/+18
* put compiled regular expression on module level so that i... | Arthur de Jong | 2005-08-12 | 1 | -2/+4
* make parsing handle errors a little more gracefully, than... | Arthur de Jong | 2005-08-01 | 1 | -3/+6
* also catch AttributeError for problem in HTMLParser not f... | Arthur de Jong | 2005-07-31 | 1 | -1/+1
* replace numeric entity refs with their proper values base... | Arthur de Jong | 2005-07-31 | 1 | -2/+11
* put new html parser in place | Arthur de Jong | 2005-07-31 | 1 | -88/+113
* remove references to email addresses where they are not u... | Arthur de Jong | 2005-07-29 | 3 | -5/+5
* empty module as place holder to parse CSS (referenced fro... | Arthur de Jong | 2005-07-25 | 1 | -0/+20
* don't replace an already set title | Arthur de Jong | 2005-07-25 | 1 | -1/+2
* Mike Meyer -> Mike W. Meyer | Arthur de Jong | 2005-07-23 | 1 | -1/+1
* almost complete rewrite of crawling and site state code m... | Arthur de Jong | 2005-07-22 | 2 | -20/+65
* move htmlparse to a more generic parsers package, cleanin... | Arthur de Jong | 2005-07-09 | 2 | -0/+128