| Commit message | Author | Age | Files | Lines |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@378 86f53f14-5ff3-0310-afe5-9b438ce3f40c
password information to specific sites based on a patch by Chris Shenton <Chris.Shenton@nasa.gov>
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@370 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@369 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@365 86f53f14-5ff3-0310-afe5-9b438ce3f40c
profiling information in output directory
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@356 86f53f14-5ff3-0310-afe5-9b438ce3f40c
in 1.9.8) and add new --ignore-robots option to be able to ignore robots retrieval
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@330 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@329 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@309 86f53f14-5ff3-0310-afe5-9b438ce3f40c
point where the previous crawl stopped
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@286 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@285 86f53f14-5ff3-0310-afe5-9b438ce3f40c
again after crawl
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@284 86f53f14-5ff3-0310-afe5-9b438ce3f40c
crawler.postprocess() functions
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@279 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@261 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@259 86f53f14-5ff3-0310-afe5-9b438ce3f40c
command line a little more gracefully
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@244 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@242 86f53f14-5ff3-0310-afe5-9b438ce3f40c
plugin itself to be able to write different kinds of files
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@227 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@225 86f53f14-5ff3-0310-afe5-9b438ce3f40c
regular expression
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@204 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@190 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@187 86f53f14-5ff3-0310-afe5-9b438ce3f40c
output files are not covered by our copyright
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@186 86f53f14-5ff3-0310-afe5-9b438ce3f40c
short options
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@181 86f53f14-5ff3-0310-afe5-9b438ce3f40c
Schröder <stefan@tokonoma.de> for spotting this)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@173 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@172 86f53f14-5ff3-0310-afe5-9b438ce3f40c
error messages
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@162 86f53f14-5ff3-0310-afe5-9b438ce3f40c
while copying files into place
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@159 86f53f14-5ff3-0310-afe5-9b438ce3f40c
command line
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@154 86f53f14-5ff3-0310-afe5-9b438ce3f40c
function
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@153 86f53f14-5ff3-0310-afe5-9b438ce3f40c
declaring it global
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@143 86f53f14-5ff3-0310-afe5-9b438ce3f40c
useful, based on a partial patch by Evelyn Mitchell <efm@tummy.com>
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@99 86f53f14-5ff3-0310-afe5-9b438ce3f40c
removing unused settings and clean up boolean types
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@76 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@75 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@72 86f53f14-5ff3-0310-afe5-9b438ce3f40c
command line handling in same order as options
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@70 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@67 86f53f14-5ff3-0310-afe5-9b438ce3f40c
making children and parents link objects instead of URLs and giving link member variables better names, change plugins accordingly, make scheme handling more pluggable and only use one function call and have a better pluggable structure for content parsing (currently only html)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@66 86f53f14-5ff3-0310-afe5-9b438ce3f40c
upper-case URL
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@65 86f53f14-5ff3-0310-afe5-9b438ce3f40c
we can just use the plugins package
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@64 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@59 86f53f14-5ff3-0310-afe5-9b438ce3f40c
and using CSS for styling also getting rid of the images
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@57 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@53 86f53f14-5ff3-0310-afe5-9b438ce3f40c
change) and update notices
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@51 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@49 86f53f14-5ff3-0310-afe5-9b438ce3f40c
mutual imports and reduce the number of stuff gathered in webcheck.py
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@48 86f53f14-5ff3-0310-afe5-9b438ce3f40c
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@45 86f53f14-5ff3-0310-afe5-9b438ce3f40c
and make import config where it is used instead of accessing it through another module
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@43 86f53f14-5ff3-0310-afe5-9b438ce3f40c
nicer URL (at least not now) and do not overwrite it with something silly from webcheck.py
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@42 86f53f14-5ff3-0310-afe5-9b438ce3f40c
through redirection using stdout, split writing of navigation frame and plugin pages plus some minor clean-ups to calling plugins
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@35 86f53f14-5ff3-0310-afe5-9b438ce3f40c
debug command line option
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@33 86f53f14-5ff3-0310-afe5-9b438ce3f40c