| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@220 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
crash on improperly formatted URLs
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@206 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
content-type header
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@192 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
output files are not covered by our copyright
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@186 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@185 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@182 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
store it in the link object
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@175 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
initial fixes to get proxying HTTPS traffic working
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@171 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@168 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
error indicator)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@145 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
redirect loop detection code
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@141 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
links, etc) and link problems (errors retrieving the document)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@138 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@135 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@134 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
a host open to do multiple requests (this greatly speeds up crawling of ftp sites)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@133 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
handling errors more gracefully and also crawl normal ftp directories
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@132 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
index.html from directory, otherwise read directory contents
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@130 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
mimetypes module
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@129 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
content if we can parse the content type
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@128 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@115 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
clean and handle errors more cleanly and more consistently
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@108 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
useful, based on a partial patch by Evelyn Mitchell <efm@tummy.com>
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@99 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@84 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
be referenced any more
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@83 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
removing unused settings and clean up boolean types
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@76 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@74 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@72 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
making children and parents link objects instead of URLs and giving link member variables better names, change plugins accordingly, make scheme handling more pluggable and only use one function call and have a better pluggable structure for content parsing (currently only html)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@66 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
upper-case URL
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@65 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
more clearly mark internal functions and do some major clean-up of the scheme modules code
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@61 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@60 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@59 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
variables
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@54 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@53 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
and don't use self where it doesn't make sense
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@52 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
change) and update notices
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@51 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
and make import config where it is used instead of accessing it through another module
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@43 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@34 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
debug command line option
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@33 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@23 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@20 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@18 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@17 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
header (it is sent by HTTPConnection already)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@16 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@15 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@14 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@12 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
<sdelafond@gmx.net> (from http://bugs.debian.org/286017) to fix problems with recent versions of python
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@11 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@10 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@2 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|