Arthur de Jong

Open Source / Free Software developer

Commit message | Author | Date | Files | Lines (-/+)
* get files ready for 1.10.2 release [1.10.2] | Arthur de Jong | 2007-11-04 | 7 | -7/+86
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@363 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* move Homepage pseudo header to control header and remove XS- prefix for Vcs tags | Arthur de Jong | 2007-11-04 | 1 | -4/+4
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@362 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add comma for readability | Arthur de Jong | 2007-11-04 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@361 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix reference to GPL file in common-licenses | Arthur de Jong | 2007-11-04 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@360 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* updated information about Python versions to use and add section about Distutils | Arthur de Jong | 2007-11-04 | 1 | -4/+11
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@359 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add workaround for Python 2.3 (based on a patch by Claire Connelly <cmc@math.hmc.edu>) | Arthur de Jong | 2007-10-09 | 4 | -1/+19
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@358 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add a warning if the used version of BeautifulSoup contains a bug | Arthur de Jong | 2007-09-17 | 1 | -0/+5
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@357 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* implement undocumented --profile option to write profiling information in output directory | Arthur de Jong | 2007-07-22 | 2 | -7/+25
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@356 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* remove old linbot provides/conflicts/replaces stuff as linbot last shipped in woody | Arthur de Jong | 2007-07-22 | 1 | -3/+0
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@355 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* update recommends to python-beautifulsoup version 3.0.2 or later since that version fixes a problem with find(attr=True) | Arthur de Jong | 2007-07-22 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@354 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* get files ready for 1.10.1 release [1.10.1] | Arthur de Jong | 2007-07-15 | 7 | -25/+147
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@352 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* output which parser module is used in debug mode | Arthur de Jong | 2007-07-15 | 1 | -0/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@351 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix spelling in ChangeLog messages | Arthur de Jong | 2007-07-15 | 1 | -94/+94
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@350 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* also handle http-equiv refresh meta header | Arthur de Jong | 2007-07-15 | 1 | -3/+13
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@349 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* just ignore setting encoding to None | Arthur de Jong | 2007-07-15 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@348 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix printing of None encoding | Arthur de Jong | 2007-07-14 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@347 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* simplify _normalize_escapes() function to improve performance | Arthur de Jong | 2007-07-14 | 1 | -22/+30
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@346 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* replace double slashes in file URL paths with single ones | Arthur de Jong | 2007-07-14 | 1 | -0/+6
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@345 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add note about improving performance more | Arthur de Jong | 2007-07-13 | 1 | -0/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@344 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* use sets instead of sequences for children, embedded, etc to improve deserialization performance with a factor 25 but now require python 2.4 of more recent | Arthur de Jong | 2007-07-13 | 4 | -56/+48
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@343 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* give the matched URL a name to make code more readable | Arthur de Jong | 2007-07-13 | 1 | -1/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@342 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* be a little more verbose when raising parsing exceptions | Arthur de Jong | 2007-07-13 | 1 | -5/+5
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@341 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* get rid of unneeded sort | Arthur de Jong | 2007-07-13 | 1 | -1/+0
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@340 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* split out URL cleaning code into own module | Arthur de Jong | 2007-07-07 | 4 | -65/+105
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@339 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* do not handle control-C and pass it along to the main exception handler and log http exceptions with a higher level | Arthur de Jong | 2007-07-07 | 1 | -1/+4
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@338 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* added XS-Vcs-Svn and XS-Vcs-Browser as specified in #391023 | Arthur de Jong | 2007-07-07 | 1 | -0/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@337 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* improve deserialization and handling of Unicode strings | Arthur de Jong | 2007-07-06 | 2 | -18/+14
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@336 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* some extra precautions for handling Unicode data and correct HTML escaping | Arthur de Jong | 2007-07-06 | 2 | -4/+4
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@335 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* get files ready for 1.10.0 release [1.10.0] | Arthur de Jong | 2007-05-12 | 8 | -18/+154
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@333 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* also lower-case reqanchor | Arthur de Jong | 2007-05-12 | 1 | -0/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@332 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix some copyright dates | Arthur de Jong | 2007-05-12 | 5 | -5/+5
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@331 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* switch robots.txt handling to default on again (broken in 1.9.8) and add new --ignore-robots option to be able to ignore robots retrieval | Arthur de Jong | 2007-05-12 | 3 | -2/+15
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@330 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* present the default number of redirects | Arthur de Jong | 2007-05-09 | 1 | -2/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@329 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* update copyright information | Arthur de Jong | 2007-05-08 | 1 | -2/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@328 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fixes to make output XHTML 1.1 compliant | Arthur de Jong | 2007-04-24 | 3 | -8/+20
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@327 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* handle ID attribute as anchor on any tag | Arthur de Jong | 2007-04-24 | 1 | -5/+5
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@326 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* lower-case anchor and errors to include id as option | Arthur de Jong | 2007-04-24 | 2 | -2/+4
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@325 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* correctly parse author information | Arthur de Jong | 2007-04-20 | 1 | -2/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@324 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* introduce HTML parsing using BeautifulSoup with a fall-back mechanism to the old HTMLParser based solution | Arthur de Jong | 2007-04-20 | 4 | -64/+256
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@323 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* mark encoding problems and output more debugging | Arthur de Jong | 2007-04-20 | 1 | -2/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@322 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix formatting of previous changelog entry | Arthur de Jong | 2007-04-20 | 1 | -3/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@321 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix typo | Arthur de Jong | 2007-04-20 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@320 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add workaround for bug in idna module | Arthur de Jong | 2007-04-06 | 1 | -0/+5
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@319 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add some comments to the follow_link() method | Arthur de Jong | 2007-04-06 | 1 | -0/+4
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@318 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* make parsing of URLs and conversion to Link objects a little more consistent | Arthur de Jong | 2007-04-06 | 1 | -9/+28
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@317 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* use consistent Unicode conversion | Arthur de Jong | 2007-04-06 | 1 | -8/+14
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@316 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* document the fact that --force should be used for non-interactive use | Arthur de Jong | 2007-04-06 | 1 | -1/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@315 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* bail out if reading user input failed | Arthur de Jong | 2007-04-06 | 1 | -1/+6
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@314 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* evaluate archive attribute of <applet> tag instead of code attribute if that is present | Arthur de Jong | 2007-03-31 | 1 | -2/+5
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@313 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* get rid of old base (singular) as bases is now used everywhere | Arthur de Jong | 2007-03-14 | 1 | -3/+0
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@312 86f53f14-5ff3-0310-afe5-9b438ce3f40c