2008-07-19 11:34  arthur

	* [r389] debian/control: build-depend on newer version of
	  python-support and support any python version>=2.3

2008-07-19 11:28  arthur

	* [r388] debian/control: upgrade to standards-version 3.8.0 (no
	  changes needed)

2008-07-19 11:07  arthur

	* [r387] crawler.py: add docstring

2008-07-13 09:39  arthur

	* [r386] parsers/html/beautifulsoup.py: copy-paste fix (thanks
	  Robert M. Jansen)

2008-07-08 19:43  arthur

	* [r385] schemes/http.py: set correct Host header without port
	  number in it

2008-07-08 19:42  arthur

	* [r384] myurllib.py: remove default port from URLs

2008-07-04 13:00  arthur

	* [r383] AUTHORS, config.py, debian/control,
	  parsers/html/__init__.py, parsers/html/calltidy.py: call tidy (if
	  available) on HTML content (based on a patch by Henning Sielaff)

2008-07-04 12:58  arthur

	* [r382] parsers/html/beautifulsoup.py: fix name of file

2008-06-21 08:36  arthur

	* [r381] parsers/html/beautifulsoup.py,
	  parsers/html/htmlparser.py: also pick up any style attributes and
	  parse as css, based on a patch by Robert M. Jansen

2008-06-15 21:17  arthur

	* [r380] AUTHORS, parsers/html/beautifulsoup.py,
	  parsers/html/htmlparser.py: add parsing of script tag and
	  background attributes, based on a patch by Robert M. Jansen

2008-06-15 21:00  arthur

	* [r379] parsers/html/beautifulsoup.py: do not require src
	  attribute for parsing inline style tags

2008-06-15 20:56  arthur

	* [r378] crawler.py, debian/copyright,
	  parsers/html/beautifulsoup.py, webcheck.py: update copyright year

2008-06-15 20:54  arthur

	* [r377] debian/changelog: include change log from release
	  1.10.2.0

2008-05-25 12:31  arthur

	* [r376] parsers/html/beautifulsoup.py: fix parsing of tag

2008-05-25 12:30  arthur

	* [r375] crawler.py: also log exception information to the output

2008-05-24 21:38  arthur

	* [r374] AUTHORS: add Chris Shenton since he provided a patch for
	  part of the -u functionality

2008-05-24 21:35  arthur

	* [r373] parsers/html/beautifulsoup.py: support