before next release
-------------------
* go over all FIXMEs in code
* rewrite ftp scheme module
* rewrite file scheme module

probably before 2.0 release
---------------------------
* parse css
* maybe choose a different license for webcheck.css
* make it possible to copy or reference webcheck.css
* make it possible to copy http:.../webcheck.css into place (use scheme system)
* create onmouseover information for links containing useful information for url
* make more things configurable
* make a Debian package
* maybe generate a list of page parents (this is useful to list proper parent links for problem pages)
* figure out if we need parents and pageparents
* make configurable time-out when retrieving a document
* support for multi-threading (maybe)
* divide problems in transfer problems and page problems (transfer problems result in a bad link problem on a page)
* clean up printing of messages, especially needed for multi-threading
* rewrite scheme modules to make proper use of new calling method
* only download complete documents if the mime type is supported
* go over command line options and see if we need long equivalents
* implement a fix for redirecting stdout and stderr to work properly
* put a maximum transfer size for downloading files and things over http
* make error handling of html parser more robust

wishlist
--------
* make code for stripping last part of a url (e.g. foo/index.html -> foo/)
* translate file paths to file:/// urls on the command line
* maybe set referer (configurable)
* support for authenticating proxies
* new config file format (if we want a configfile at all)
* cookies support (maybe)
* integration with weblint
* combine with a logfile checker to also show number of hits per link
* performance and other improvements (we can switch to sets with python 2.4)
* write a guide to writing plugins
* form checking
* spelling checking
* test w3c conformance of pages
* maybe make broken links not clickable
* maybe store crawled site's data in some format for later processing or continuing after interruption
* create output directory if it does not exist
* add support for fetching gzipped content
* write section on internal and external urls in the manual page
* add a favicon to reports
* add a test to see if python supports https and fail elegantly otherwise
* maybe follow redirects of external links
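
sketch for two wishlist items
-----------------------------
A rough sketch of how the wishlist items on stripping the last part of a url
and on translating file paths to file:/// urls might look. This is not existing
webcheck code: the function names strip_last_part and path_to_file_url are
made up for illustration, and the sketch assumes a current Python 3 standard
library (urllib.parse and pathlib) rather than the Python version the rest of
this list targets.

```python
from pathlib import Path
from urllib.parse import urlsplit, urlunsplit


def strip_last_part(url):
    """Drop the final path segment of a url (foo/index.html -> foo/).

    Hypothetical helper; the query string and fragment are dropped as well,
    since they belong to the document that is being stripped off.
    """
    scheme, netloc, path, _query, _fragment = urlsplit(url)
    # keep everything up to and including the last slash in the path
    path = path[:path.rfind('/') + 1]
    return urlunsplit((scheme, netloc, path, '', ''))


def path_to_file_url(path):
    """Turn a (possibly relative) file system path into a file:/// url.

    Hypothetical helper; relative paths are resolved against the current
    working directory before conversion.
    """
    return Path(path).absolute().as_uri()
```

For example, strip_last_part('http://example.com/foo/index.html?x=1') gives
'http://example.com/foo/', and a url that already ends in a slash is returned
with its path unchanged. Whether the query string should really be dropped is
a design choice that would need checking against how webcheck groups pages.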