Arthur de Jong

Open Source / Free Software developer

Commit message | Author | Age | Files | Lines
* fix a typo [1.9.4] | Arthur de Jong | 2005-09-03 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck-1.9.4@166 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix NEWS file | Arthur de Jong | 2005-09-03 | 1 | -3/+0
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck-1.9.4@165 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* release 1.9.4 | Arthur de Jong | 2005-09-03 | 0 | -0/+0
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck-1.9.4@164 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* get files ready for 1.9.4 release | Arthur de Jong | 2005-09-03 | 5 | -16/+181
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@163 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* make error handling more robust and have consistent error messages | Arthur de Jong | 2005-09-01 | 2 | -8/+17
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@162 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add Herbert Weinhandl <weinhand@unileoben.ac.at> to contributors | Arthur de Jong | 2005-09-01 | 1 | -0/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@161 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add some design notes for developers | Arthur de Jong | 2005-09-01 | 1 | -0/+28
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@160 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add extra checks not to overwrite our own input file while copying files into place | Arthur de Jong | 2005-09-01 | 1 | -12/+24
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@159 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* typo fix | Arthur de Jong | 2005-09-01 | 1 | -2/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@158 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* highlight current plugin in the navigation, based on a patch by Herbert Weinhandl <weinhand@unileoben.ac.at> | Arthur de Jong | 2005-09-01 | 2 | -21/+33
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@157 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* make specifying of target in links configurable (disabled by default to keep page valid XHTML 1.1) | Arthur de Jong | 2005-08-30 | 2 | -3/+9
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@156 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add note about making instances of Link class | Arthur de Jong | 2005-08-25 | 1 | -0/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@155 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* handle passing file names (instead of URLs) on the command line | Arthur de Jong | 2005-08-23 | 1 | -0/+5
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@154 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add initial support for passing URLs to install_file() function | Arthur de Jong | 2005-08-23 | 1 | -20/+23
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@153 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* include transfer problem in pageproblem description | Arthur de Jong | 2005-08-23 | 1 | -2/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@152 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* make problem lists sorted by URL and problem description | Arthur de Jong | 2005-08-23 | 1 | -0/+4
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@151 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* include short description in plugin overview page | Arthur de Jong | 2005-08-21 | 1 | -2/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@150 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add some other people to the AUTHORS file, mostly based on contents of the Debian bug tracking system | Arthur de Jong | 2005-08-21 | 1 | -0/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@149 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* also feed style tag content to the CSS parser to parse inline CSS | Arthur de Jong | 2005-08-20 | 1 | -0/+7
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@148 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* remove some debugging functions from CSS parser | Arthur de Jong | 2005-08-20 | 1 | -3/+0
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@147 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* first attempt at a very simple CSS parser that just summarises links to images and imported CSS files | Arthur de Jong | 2005-08-20 | 1 | -1/+28
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@146 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* set status to result of fetching the document (not an error indicator) | Arthur de Jong | 2005-08-20 | 3 | -2/+7
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@145 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add checking of unescaped spaces to the html parser, including line and column information | Arthur de Jong | 2005-08-20 | 1 | -25/+41
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@144 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* pass site as parameter to parse_args() instead of declaring it global | Arthur de Jong | 2005-08-19 | 1 | -4/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@143 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix bug with following redirects where otherwise unreferenced links were removed and implement redirect loop detection | Arthur de Jong | 2005-08-19 | 1 | -4/+7
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@142 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* move redirect handling code to crawler module, including redirect loop detection code | Arthur de Jong | 2005-08-19 | 4 | -24/+32
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@141 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix html bug and improve bad link string | Arthur de Jong | 2005-08-19 | 1 | -2/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@140 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* change html display of problems to a nicer list | Arthur de Jong | 2005-08-19 | 6 | -8/+19
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@139 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* split problems into page problems (parsing errors, wrong links, etc) and link problems (errors retreiving the document) | Arthur de Jong | 2005-08-19 | 11 | -47/+65
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@138 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* get files ready for 1.9.3 release [1.9.3] | Arthur de Jong | 2005-08-16 | 5 | -18/+114
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@136 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* pick up configured filenames if present in directories | Arthur de Jong | 2005-08-16 | 3 | -46/+76
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@135 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add extra debugging info | Arthur de Jong | 2005-08-16 | 1 | -8/+15
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@134 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* use a pool of ftp connections to keep ftp connection to a host open to do multiple requests (this greatly speeds up crawling of ftp sites) | Arthur de Jong | 2005-08-13 | 1 | -18/+25
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@133 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* almost complete reimplementation of the ftp scheme, handling errors more gracefully and also crawl normal ftp directories | Arthur de Jong | 2005-08-13 | 1 | -62/+64
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@132 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add missing newline and trim trailing newline of extra link info | Arthur de Jong | 2005-08-13 | 1 | -2/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@131 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* complete reimplementation of file module, reading index.html from directory, otherwise read directory contents | Arthur de Jong | 2005-08-12 | 1 | -21/+49
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@130 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* rename parameter to acceptedtypes to not conflict with mimetypes module | Arthur de Jong | 2005-08-12 | 4 | -6/+6
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@129 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* also pass mimetypes to scheme modules to only fetch content if we can parse the content type | Arthur de Jong | 2005-08-12 | 6 | -15/+29
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@128 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* don't print referenced from if there are no parents | Arthur de Jong | 2005-08-12 | 1 | -0/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@127 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add checkurl method to clean up URLs and report problems (currently only checks for spaces in URLs) | Arthur de Jong | 2005-08-12 | 1 | -2/+14
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@126 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* put compiled regular expression on module level so that it is compiled only once | Arthur de Jong | 2005-08-12 | 1 | -2/+4
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@125 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* small fix to render menu better under MSIE | Arthur de Jong | 2005-08-12 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@124 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add some extra information to every link with a nicely formatted size | Arthur de Jong | 2005-08-11 | 1 | -2/+60
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@123 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* make parsing handle errors a little more gracefully, thanks to Stefan Schröder <stefan@tokonoma.de> for all the testing | Arthur de Jong | 2005-08-01 | 1 | -3/+6
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@122 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* get files ready for 1.9.2 release [1.9.2] | Arthur de Jong | 2005-07-31 | 4 | -248/+391
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@120 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* also catch AttributeError for problem in HTMLParser not fully supporting continuing after errors | Arthur de Jong | 2005-07-31 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@119 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add note about supported versions of python | Arthur de Jong | 2005-07-31 | 1 | -0/+4
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@118 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* replace numeric entity refs with their proper values based on patch by Eric W.Brown <eric@saugus.net> | Arthur de Jong | 2005-07-31 | 1 | -2/+11
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@117 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* put new html parser in place | Arthur de Jong | 2005-07-31 | 1 | -88/+113
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@116 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add https module as a wrapper to the http module | Arthur de Jong | 2005-07-31 | 1 | -0/+26
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@115 86f53f14-5ff3-0310-afe5-9b438ce3f40c