Arthur de Jong

Open Source / Free Software developer

path: root/parsers
...
* replace numeric entity refs with their proper values based on patch by Eric W.Brown <eric@saugus.net>
  Author: Arthur de Jong; Age: 2005-07-31; Files: 1; Lines: -2/+11
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@117 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* put new html parser in place
  Author: Arthur de Jong; Age: 2005-07-31; Files: 1; Lines: -88/+113
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@116 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* remove references to email addresses where they are not useful, based on a partial patch by Evelyn Mitchell <efm@tummy.com>
  Author: Arthur de Jong; Age: 2005-07-29; Files: 3; Lines: -5/+5
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@99 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* empty module as place holder to parse CSS (referenced from __init__.py already)
  Author: Arthur de Jong; Age: 2005-07-25; Files: 1; Lines: -0/+20
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@91 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* don't replace an already set title
  Author: Arthur de Jong; Age: 2005-07-25; Files: 1; Lines: -1/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@90 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* Mike Meyer -> Mike W. Meyer
  Author: Arthur de Jong; Age: 2005-07-23; Files: 1; Lines: -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@72 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* almost complete rewrite of crawling and site state code making children and parents link objects instead of URLs and giving link member variables better names, change plugins accordingly, make scheme handling more pluggable and only use one function call and have a better pluggable structure for content parsing (currently only html)
  Author: Arthur de Jong; Age: 2005-07-22; Files: 2; Lines: -20/+65
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@66 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* move htmlparse to a more generic parsers package, cleaning up the code and simplifying dependencies
  Author: Arthur de Jong; Age: 2005-07-09; Files: 2; Lines: -0/+128
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@58 86f53f14-5ff3-0310-afe5-9b438ce3f40c