2005-07-31 20:44  arthur

	* [r119] parsers/html.py: also catch AttributeError for problem in
	  HTMLParser not fully supporting continuing after errors

2005-07-31 10:50  arthur

	* [r118] README: add note about supported versions of python

2005-07-31 09:45  arthur

	* [r117] parsers/html.py: replace numeric entity refs with their
	  proper values based on patch by UNKNOWN

2005-07-31 09:21  arthur

	* [r116] parsers/html.py: put new html parser in place

2005-07-31 09:14  arthur

	* [r115] schemes/https.py: add https module as a wrapper to the
	  http module

2005-07-31 09:02  arthur

	* [r114] crawler.py: while cleaning urls also make host part
	  lowercase and also clean added internal urls

2005-07-30 15:34  arthur

	* [r113] crawler.py: fix a thinko

2005-07-30 15:32  arthur

	* [r112] crawler.py: fix typo

2005-07-30 15:20  arthur

	* [r111] crawler.py: follow_link() now returns None when trying to
	  follow a redirect whose target is not crawled, also don't add
	  children and embeds when we are an external link

2005-07-30 14:05  arthur

	* [r110] plugins/__init__.py: remove version and author from
	  module as no other module has one (except the plugins themselves)

2005-07-30 14:04  arthur

	* [r109] config.py: remove support for extra configurable headers
	* [r108] schemes/http.py: reimplement http module to be a little
	  more generic and clean and handle errors more cleanly and
	  consistently

2005-07-30 14:00  arthur

	* [r107] crawler.py: give second search through website a slightly
	  different debug message

2005-07-30 13:59  arthur

	* [r106] crawler.py: also ignore io errors when retrieving
	  robots.txt files
	* [r105] crawler.py: make a _urlclean() function to always store a
	  proper url without a fragment and with at least a slash for urls
	  with path elements

2005-07-30 13:55  arthur

	* [r104] README: some minor tweaks in the documentation

2005-07-29 14:36  arthur

	* [r103] crawler.py: import time as we need it for sleep

2005-07-29 14:32  arthur

	* [r102] crawler.py, plugins/sitemap.py: do an extra breadth first
	  traversal of the site to combine links into pages, combining
	  page children and determining depth of every page and using all
	  this in the sitemap

2005-07-29 10:20  arthur

	* [r101] AUTHORS, README, config.py, webcheck.1: change email
	  address from arthur@tiefighter.et.tudelft.nl to
	  arthur@ch.tudelft.nl (including urls etc)

2005-07-29 10:18  arthur

	* [r100] webcheck.css: remove another reference to an email address

2005-07-29 10:11  arthur

	* [r99] NEWS, README, config.py, crawler.py, debugio.py,
	  parsers/__init__.py, parsers/css.py, parsers/html.py,
	  plugins/__init__.py, plugins/about.py, plugins/badlinks.py,
	  plugins/external.py, plugins/images.py, plugins/new.py,
	  plugins/notchkd.py, plugins/notitles.py, plugins/old.py,
	  plugins/problems.py, plugins/sitemap.py, plugins/slow.py,
	  plugins/urllist.py, schemes/__init__.py, schemes/file.py,
	  schemes/ftp.py, schemes/http.py, webcheck.py: remove references
	  to email addresses where they are not useful, based on a partial
	  patch by Evelyn Mitchell <efm@tummy.com>

2005-07-27 20:38  arthur

	* [r98] plugins/__init__.py, plugins/badlinks.py,
	  plugins/problems.py, plugins/sitemap.py: fix a couple of typos,
	  also thanks to Scott Kirkwood <scottakirkwood@gmail.com> for
	  spotting another one

2005-07-27 20:32  arthur

	* [r97] crawler.py: turn tocheck list into fifo queue

2005-07-26 20:40  arthur

	* [r96] plugins/new.py, plugins/old.py: fix typo spotted by Scott
	  Kirkwood <scottakirkwood@gmail.com>

2005-07-25 17:29  arthur

	* [r94] ChangeLog, NEWS, config.py: get files ready for 1.9.1
	  release

2005-07-25 17:17  arthur

	* [r93] webcheck.1: fix typo, thanks to Stefan Schröder
	  <stefan@tokonoma.de>

2005-07-25 17:16  arthur

	* [r92] plugins/slow.py: only report on internal links

2005-07-25 17:13  arthur

	* [r91] parsers/css.py: empty module as placeholder to parse css
	  (referenced from __init__.py already)

2005-07-25 17:11  arthur

	* [r90] parsers/html.py: don't replace an already set title

2005-07-24 09:32  arthur

	* [r88] ChangeLog: add ChangeLog for release

2005-07-24 09:30  arthur

	* [r87] NEWS, TODO: get files ready for release

2005-07-24 08:56  arthur

	* [r86] README: clean up README removing sections that should be
	  in the manual page

2005-07-24 08:55  arthur

	* [r85] config.py, plugins/new.py, plugins/old.py,
	  plugins/whatsnew.py, plugins/whatsold.py: rename whatsold and
	  whatsnew plugins to old and new

2005-07-24 08:52  arthur

	* [r84] schemes/http.py: handle socket errors properly
	* [r83] schemes/http.py: fix for incomplete change in r76, now
	  version should not be referenced any more

2005-07-24 08:49  arthur

	* [r82] plugins/__init__.py, plugins/badlinks.py,
	  plugins/external.py, plugins/images.py, plugins/notchkd.py,
	  plugins/notitles.py, plugins/problems.py, plugins/sitemap.py,
	  plugins/slow.py, plugins/urllist.py, plugins/whatsnew.py,
	  plugins/whatsold.py: call make_link() with a link object instead
	  of a url, removing the need for a mySite in plugins

2005-07-24 08:47  arthur

	* [r81] plugins/badlinks.py: remove HTTP status code handling from
	  here as this should be done by the HTTP module
	* [r80] plugins/whatsnew.py, plugins/whatsold.py: only report on
	  internal links

2005-07-24 08:46  arthur

	* [r79] crawler.py: only add links to crawl list if they are not
	  in there already

2005-07-24 08:45  arthur

	* [r78] debugio.py: flush stdout after each message so that
	  redirecting stdout and stderr together to a file works reliably

2005-07-23 14:02  arthur

	* [r77] crawler.py: fix regular expression matching

2005-07-23 12:55  arthur

	* [r76] config.py, plugins/__init__.py, schemes/http.py,
	  version.py, webcheck.1, webcheck.py: integrate version.py into
	  config.py, clean up config.py removing unused settings and clean
	  up boolean types

2005-07-23 11:00  arthur

	* [r75] config.py, webcheck.1, webcheck.py: remove logo option
	  since the current output does not use one

2005-07-23 10:53  arthur

	* [r74] schemes/file.py: most systems already know about .shtml
	  files

2005-07-23 08:34  arthur

	* [r73] BUGS, INSTALL, README, webcheck.1: first step in cleaning
	  up documentation, integrating INSTALL in README and BUGS in
	  manual page and adding section on robots handling in manual

2005-07-23 08:28  arthur

	* [r72] AUTHORS, crawler.py, debugio.py, parsers/html.py,
	  plugins/__init__.py, plugins/about.py, plugins/badlinks.py,
	  plugins/external.py, plugins/images.py, plugins/notchkd.py,
	  plugins/notitles.py, plugins/problems.py, plugins/sitemap.py,
	  plugins/slow.py, plugins/whatsnew.py, plugins/whatsold.py,
	  schemes/file.py, schemes/ftp.py, schemes/http.py, version.py,
	  webcheck.1, webcheck.py: Mike Meyer -> Mike W. Meyer

2005-07-22 21:21  arthur

	* [r71] crawler.py: add support for sleep between requests

2005-07-22 21:11  arthur

	* [r70] webcheck.py: don't add . to python path as it's not needed
	  and put command line handling in same order as options

2005-07-22 21:05  arthur

	* [r69] plugins/__init__.py, webcheck.css: change to a simpler
	  layout that should also work in MSIE

2005-07-22 21:04  arthur

	* [r68] debugio.py: fix docstrings

2005-07-22 21:01  arthur

	* [r67] plugins/__init__.py, webcheck.py: do not use start_time
	  from webcheck saving an import

2005-07-22 19:17  arthur

	* [r66] crawler.py, myUrlLib.py, parsers/__init__.py,
	  parsers/html.py, plugins/__init__.py, plugins/badlinks.py,
	  plugins/external.py, plugins/images.py, plugins/notchkd.py,
	  plugins/notitles.py, plugins/sitemap.py, plugins/slow.py,
	  plugins/urllist.py, plugins/whatsnew.py, plugins/whatsold.py,
	  schemes/__init__.py, schemes/file.py, schemes/ftp.py,
	  schemes/http.py, webcheck.py: almost complete rewrite of
	  crawling and site state code making children and parents link
	  objects instead of urls and giving link member variables better
	  names, change plugins accordingly, make scheme handling more
	  pluggable and only use one function call and have a better
	  pluggable structure for content parsing (currently only html)

2005-07-17 08:46  arthur

	* [r65] myUrlLib.py, plugins/__init__.py, plugins/badlinks.py,
	  plugins/external.py, plugins/images.py, plugins/notitles.py,
	  plugins/problems.py, plugins/sitemap.py, plugins/slow.py,
	  plugins/urllist.py, plugins/whatsnew.py, plugins/whatsold.py,
	  schemes/file.py, schemes/ftp.py, schemes/http.py, webcheck.py:
	  use lowercase url attribute in Link instead of uppercase URL

2005-07-16 15:35  arthur

	* [r64] plugins/__init__.py, plugins/badlinks.py,
	  plugins/external.py, plugins/images.py, plugins/notchkd.py,
	  plugins/notitles.py, plugins/problems.py, plugins/rptlib.py,
	  plugins/sitemap.py, plugins/slow.py, plugins/urllist.py,
	  plugins/whatsnew.py, plugins/whatsold.py, webcheck.py: move
	  functionality of rptlib.py to __init__.py so that we can just
	  use the plugins package

2005-07-16 15:33  arthur

	* [r63] plugins/__init__.py: remove __init__.py to be replaced by
	  contents of rptlib.py

2005-07-16 10:24  arthur

	* [r62] webcheck.1: add note about pattern matching

2005-07-10 14:08  arthur

	* [r61] myUrlLib.py, schemes/__init__.py, schemes/file.py,
	  schemes/ftp.py, schemes/http.py: rework scheme code to use more
	  logical function names, more clearly mark internal functions and
	  do some major cleanup of the scheme modules code

2005-07-10 12:26  arthur

	* [r60] myUrlLib.py, plugins/whatsnew.py, plugins/whatsold.py,
	  schemes/file.py, schemes/http.py: store mtime in link object
	  instead of age in days

2005-07-10 12:00  arthur

	* [r59] schemes/ftp.py, webcheck.py: remove unneeded import and
	  print

2005-07-09 20:22  arthur

	* [r58] htmlparse.py, myUrlLib.py, parsers, parsers/__init__.py,
	  parsers/html.py: move htmlparse to a more generic parsers
	  package, cleaning up the code and simplifying dependencies

2005-07-09 13:54  arthur

	* [r57] plugins/about.py, plugins/badlinks.py,
	  plugins/external.py, plugins/images.py, plugins/notchkd.py,
	  plugins/notitles.py, plugins/problems.py, plugins/rptlib.py,
	  plugins/sitemap.py, plugins/slow.py, plugins/urllist.py,
	  plugins/whatsnew.py, plugins/whatsold.py, webcheck.css,
	  webcheck.py: clean up html output generating xhtml 1.1 without
	  frames and using css for styling also getting rid of the images

2005-07-04 21:25  arthur

	* [r56] config.py: put plugins in a more logical order

2005-07-04 20:39  arthur

	* [r55] plugins/badlinks.py, plugins/external.py,
	  plugins/images.py, plugins/notchkd.py, plugins/notitles.py,
	  plugins/rptlib.py, plugins/sitemap.py, plugins/slow.py,
	  plugins/urllist.py, plugins/whatsnew.py, plugins/whatsold.py:
	  implement consistent sorting of all lists removing sort
	  functions from rptlib and using lambda functions where needed

2005-07-03 07:04  arthur

	* [r54] config.py, plugins/rptlib.py, schemes/http.py, webcheck.1:
	  handle and document proxy settings with environment variables

2005-07-03 06:36  arthur

	* [r53] INSTALL, README, config.py, myUrlLib.py,
	  plugins/rptlib.py, schemes/http.py, webcheck.1, webcheck.py:
	  name webcheck with lower case

2005-06-28 20:32  arthur

	* [r52] schemes/http.py: clean up get_reply() function to use
	  proper recursion and don't use self where it doesn't make sense

2005-06-22 19:24  arthur

	* [r51] COPYING, debugio.py, htmlparse.py, myUrlLib.py,
	  plugins/about.py, plugins/badlinks.py, plugins/external.py,
	  plugins/images.py, plugins/notchkd.py, plugins/notitles.py,
	  plugins/problems.py, plugins/rptlib.py, plugins/sitemap.py,
	  plugins/slow.py, plugins/urllist.py, plugins/whatsnew.py,
	  plugins/whatsold.py, schemes/file.py, schemes/ftp.py,
	  schemes/http.py, version.py, webcheck.1, webcheck.py: change to
	  most recent version of the GPL (FSF address change) and update
	  notices

2005-06-18 19:59  arthur

	* [r50] plugins/external.py: sort external links by url

2005-06-18 13:48  arthur

	* [r49] webcheck.py: split main() part into its own function

2005-06-18 13:32  arthur

	* [r48] plugins/rptlib.py, webcheck.py: restructure a couple of
	  things to reduce the number of mutual imports and reduce the
	  amount of stuff gathered in webcheck.py

2005-06-18 13:31  arthur

	* [r47] config.py, plugins/urllist.py: add simple urllist plugin
	  to list all visited urls

2005-06-18 13:20  arthur

	* [r46] plugins/sitemap.py: only include internal links in sitemap

2005-06-18 12:49  arthur

	* [r45] config.py, webcheck.py: add problems plugin to config
	  instead of hard-coding

2005-06-18 10:25  arthur

	* [r44] plugins/rptlib.py: remove ugly redirection for overwrite
	  file question since we now write all html through a file
	  descriptor

2005-06-15 21:01  arthur

	* [r43] TODO, myUrlLib.py, plugins/about.py, plugins/badlinks.py,
	  plugins/external.py, plugins/images.py, plugins/notchkd.py,
	  plugins/notitles.py, plugins/problems.py, plugins/rptlib.py,
	  plugins/sitemap.py, plugins/slow.py, plugins/whatsnew.py,
	  plugins/whatsold.py, schemes/http.py, webcheck.py: pass
	  reference to Link class to plugins with parameter and import
	  config where it is used instead of accessing it through
	  another module

2005-06-15 20:55  arthur

	* [r42] myUrlLib.py, plugins/rptlib.py, plugins/sitemap.py,
	  webcheck.py: make use of base consistent, do not modify it to
	  make a nicer url (at least not now) and do not overwrite it with
	  something silly from webcheck.py

2005-06-14 19:17  arthur

	* [r41] myUrlLib.py: also set URL attribute on yanked links

2005-06-12 06:21  arthur

	* [r40] plugins/badlinks.py, plugins/images.py,
	  plugins/notchkd.py, plugins/notitles.py: again use the url as
	  link title for some links

2005-06-11 21:52  arthur

	* [r39] httpcodes.py, plugins/about.py, plugins/badlinks.py,
	  plugins/external.py, plugins/images.py, plugins/notchkd.py,
	  plugins/notitles.py, plugins/problems.py, plugins/rptlib.py,
	  plugins/sitemap.py, plugins/slow.py, plugins/whatsnew.py,
	  plugins/whatsold.py: general cleanup of plugins structure and
	  code, moving httpcodes to the only place they were used,
	  cleaning up plugin titles, version numbers and descriptions,
	  adding docstrings and using slightly more logical and consistent
	  names (plus some other cleanups)

2005-06-11 21:39  arthur

	* [r38] plugins/rptlib.py: make_link(): if no title is specified,
	  try to look up the title of the page and fallback to the url as
	  title

2005-06-11 21:24  arthur

	* [r37] plugins/about.py: adapt plugin to using file descriptor etc

2005-06-11 18:52  arthur

	* [r36] contrib, plugins/about.py: move about plugin to plugins
	  directory

2005-06-08 19:29  arthur

	* [r35] plugins/badlinks.py, plugins/external.py,
	  plugins/images.py, plugins/notchkd.py, plugins/notitles.py,
	  plugins/problems.py, plugins/rptlib.py, plugins/sitemap.py,
	  plugins/slow.py, plugins/whatsnew.py, plugins/whatsold.py,
	  webcheck.py: write html files using file descriptors instead of
	  through redirection using stdout, split writing of navigation
	  frame and plugin pages plus some minor cleanups to calling
	  plugins

2005-06-08 19:10  arthur

	* [r34] plugins/__init__.py, schemes/__init__.py: claiming
	  copyright on empty files is silly

2005-06-06 21:22  arthur

	* [r33] debugio.py, htmlparse.py, myUrlLib.py, plugins/rptlib.py,
	  schemes/ftp.py, schemes/http.py, webcheck.1, webcheck.py: redo
	  output writing using a cleaner debugio and change debug command
	  line option

2005-06-06 20:11  arthur

	* [r32] plugins/badlinks.py, plugins/notchkd.py: replace a couple
	  more tabs

2005-06-06 20:05  arthur

	* [r31] webcheck.1: initial version of manual page loosely based
	  on documentation

2005-06-06 19:22  arthur

	* [r30] AUTHORS: added myself as copyright holder and added
	  Bastian Kleineidam (previous debian package maintainer) as
	  contributor

2005-06-06 19:20  arthur

	* [r29] webcheck.py: small text improvement

2005-05-27 20:39  arthur

	* [r28] webcheck.sh: remove unneeded shell script

2005-05-27 20:28  arthur

	* [r27] webcheck.py: also support --force

2005-05-27 20:18  arthur

	* [r26] webcheck.py: redo command-line checking

2005-04-13 19:41  arthur

	* [r25] contrib/plugins/about.py: general cleanup
	* [r24] plugins/sitemap.py: rework recursion to make it simpler
	  plus some general cleanups

2005-04-13 19:20  arthur

	* [r23] contrib/plugins/about.py, myUrlLib.py,
	  plugins/badlinks.py, plugins/external.py, plugins/images.py,
	  plugins/notchkd.py, plugins/notitles.py, plugins/problems.py,
	  plugins/rptlib.py, plugins/sitemap.py, plugins/slow.py,
	  plugins/whatsnew.py, plugins/whatsold.py, schemes/http.py,
	  webcheck.py: rename linkList to linkMap

2005-04-13 19:18  arthur

	* [r22] myUrlLib.py, robotparser.py: remove local copy of
	  robotparser, just use python's

2005-04-09 20:03  arthur

	* [r21] myUrlLib.py: qualify references to types functions

2005-04-09 13:48  arthur

	* [r20] htmlparse.py, myUrlLib.py, plugins/badlinks.py,
	  plugins/external.py, plugins/images.py, plugins/notchkd.py,
	  plugins/notitles.py, plugins/rptlib.py, plugins/slow.py,
	  plugins/whatsnew.py, plugins/whatsold.py, schemes/http.py:
	  indent with spaces instead of tabs (tabs are evil)

2005-04-08 21:31  arthur

	* [r19] myUrlLib.py: move finding of scheme module to separate
	  function

2005-04-08 21:25  arthur

	* [r18] schemes/http.py: rebump loglevel to debug

2005-04-08 16:24  arthur

	* [r17] myUrlLib.py, schemes/file.py, schemes/filelink.py,
	  schemes/ftp.py, schemes/ftplink.py, schemes/http.py,
	  schemes/httplink.py: remove link part from scheme modules

2005-04-07 22:37  arthur

	* [r16] schemes/httplink.py: clean up http request code a little
	  and do not set host header (it is sent by HTTPConnection already)

2005-04-07 20:29  arthur

	* [r15] contrib/plugins/about.py, debugio.py, htmlparse.py,
	  httpcodes.py, myUrlLib.py, plugins/__init__.py,
	  plugins/badlinks.py, plugins/external.py, plugins/images.py,
	  plugins/notchkd.py, plugins/notitles.py, plugins/problems.py,
	  plugins/rptlib.py, plugins/sitemap.py, plugins/slow.py,
	  plugins/whatsnew.py, plugins/whatsold.py, schemes/__init__.py,
	  schemes/filelink.py, schemes/ftplink.py, version.py,
	  webcheck.py: make nicer file (copyright) headers

2005-04-07 20:23  arthur

	* [r14] schemes/httplink.py: fix problem with incorrect indent

2005-04-07 20:06  arthur

	* [r13] config.py, httpcodes.py, plugins/notitles.py: tabs to
	  spaces (tabs are evil)

2005-04-07 20:05  arthur

	* [r12] config.py, contrib/plugins/about.py, httpcodes.py,
	  plugins/badlinks.py, plugins/external.py, plugins/notchkd.py,
	  plugins/notitles.py, plugins/problems.py, plugins/rptlib.py,
	  plugins/sitemap.py, plugins/slow.py, plugins/whatsnew.py,
	  plugins/whatsold.py, schemes/filelink.py, schemes/ftplink.py,
	  schemes/httplink.py: tabs to spaces (tabs are evil)

2005-04-07 20:04  arthur

	* [r11] AUTHORS, schemes/httplink.py: include patch from Sebastien
	  Delafond <sdelafond@gmx.net> (from
	  http://bugs.debian.org/286017) to fix problems with recent
	  versions of python

2005-04-06 19:38  arthur

	* [r10] INSTALL, config.py, htmlparse.py, plugins/images.py,
	  plugins/rptlib.py, schemes/ftplink.py, schemes/httplink.py,
	  webcheck.css, webcheck.py: import Debian package patches

2005-03-31 12:47  arthur

	* [r9] COPYING: install updated file without millennium bug

2005-03-31 12:45  arthur

	* [r8] AUTHORS: reformat file to better match suggested layout

2005-03-31 12:44  arthur

	* [r7] NEWS: put news items in a little more standard format

2005-03-31 12:42  arthur

	* [r6] AUTHORS, CHANGES, CREDITS, ChangeLog-1999, ChangeLog-2002,
	  HISTORY, HISTORY.linbot, NEWS: rename files to more standard
	  names

2005-03-31 12:32  arthur

	* [r5] config.py, plugins/rptlib.py, version.py: remove checks for
	  updates (registry)

2005-03-31 12:28  arthur

	* [r4] ., contrib, contrib/plugins, plugins, schemes: ignore
	  compiled python objects

2005-03-29 12:08  arthur

	* [r2] BUGS, CHANGES, COPYING, CREDITS, HISTORY, HISTORY.linbot,
	  INSTALL, README, TODO, config.py, contrib, contrib/plugins,
	  contrib/plugins/about.py, debugio.py, htmlparse.py,
	  httpcodes.py, myUrlLib.py, plugins, plugins/__init__.py,
	  plugins/badlinks.py, plugins/external.py, plugins/images.py,
	  plugins/notchkd.py, plugins/notitles.py, plugins/problems.py,
	  plugins/rptlib.py, plugins/sitemap.py, plugins/slow.py,
	  plugins/whatsnew.py, plugins/whatsold.py, robotparser.py,
	  schemes, schemes/__init__.py, schemes/filelink.py,
	  schemes/ftplink.py, schemes/httplink.py, version.py,
	  webcheck.css, webcheck.py, webcheck.sh: import of release 1.0

2005-03-28 12:57  arthur

	* [r1] .: create webcheck directory