| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@224 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@223 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
counts
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@221 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@213 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
and no longer delete unreferenced followed links
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@212 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@209 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
crash on improperly formatted URLs
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@206 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
regular expression
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@204 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
output files are not covered by our copyright
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@186 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@184 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
that they are output as UTF-8
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@179 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
store it in the link object
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@175 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@155 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
error indicator)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@145 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
unreferenced links were removed and implement redirect loop detection
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@142 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
redirect loop detection code
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@141 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
links, etc.) and link problems (errors retrieving the document)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@138 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
content if we can parse the content type
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@128 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
(currently only checks for spaces in URLs)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@126 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
also clean added internal URLs
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@114 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@113 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@112 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
redirect whose target is not crawled; also don't add children and embeds when we are an external link
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@111 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
debug message
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@107 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@106 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
without a fragment and with at least a slash for URLs with path elements
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@105 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@103 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
combine links into pages, combining page children and determining depth of every page and using all this in the sitemap
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@102 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
useful, based on a partial patch by Evelyn Mitchell <efm@tummy.com>
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@99 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@97 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
|
|
| |
already
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@79 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@77 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@72 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
|
|
| |
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@71 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|
|
making children and parents link objects instead of URLs and giving link member variables better names, change plugins accordingly, make scheme handling more pluggable and only use one function call and have a better pluggable structure for content parsing (currently only html)
git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@66 86f53f14-5ff3-0310-afe5-9b438ce3f40c
|