| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
including line and column information
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@144 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
declaring it global
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@143 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
unreferenced links were removed and implement redirect loop detection
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@142 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
redirect loop detection code
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@141 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@140 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@139 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
links, etc.) and link problems (errors retrieving the document)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@138 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@136 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@135 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@134 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
a host open to do multiple requests (this greatly speeds up crawling of ftp sites)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@133 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
handling errors more gracefully and also crawl normal ftp directories
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@132 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
link info
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@131 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
index.html from directory, otherwise read directory contents
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@130 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
mimetypes module
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@129 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
content if we can parse the content type
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@128 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@127 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
(currently only checks for spaces in URLs)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@126 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
it is compiled only once
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@125 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@124 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
formatted size
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@123 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
thanks to Stefan Schröder <stefan@tokonoma.de> for all the testing
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@122 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@120 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
fully supporting continuing after errors
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@119 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@118 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
based on patch by Eric W. Brown <eric@saugus.net>
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@117 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@116 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@115 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
also clean added internal URLs
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@114 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@113 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@112 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
redirect whose target is not crawled; also don't add children and embeds when we are an external link
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@111 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
has one (except the plugins themselves)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@110 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@109 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
clean and handle errors more cleanly and more consistently
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@108 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
debug message
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@107 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@106 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
without a fragment and with at least a slash for URLs with path elements
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@105 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@104 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@103 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
combine links into pages, combining page children and determining depth of every page and using all this in the sitemap
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@102 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
arthur@tiefighter.et.tudelft.nl to arthur@ch.tudelft.nl (including URLs etc)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@101 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@100 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
useful, based on a partial patch by Evelyn Mitchell <efm@tummy.com>
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@99 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
<scottakirkwood@gmail.com> for spotting another one
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@98 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@97 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
<scottakirkwood@gmail.com>
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@96 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@94 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@93 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@92 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|