Arthur de Jong

Open Source / Free Software developer

path: root/webcheck
Commit message | Author | Date | Files | Lines (-/+)
* Provide a CSV file plugin [HEAD, master] | Arthur de Jong | 2013-12-15 | 2 | -1/+64
* Remove duplicate column definition | Arthur de Jong | 2013-12-15 | 1 | -1/+0
* Split functionality into Link.get_or_create() | Arthur de Jong | 2013-12-15 | 2 | -17/+21
* Rename some functions | Arthur de Jong | 2013-12-15 | 2 | -16/+16
* Small simplification | Arthur de Jong | 2013-12-15 | 1 | -1/+1
* Move SQLite initialisation to db module | Arthur de Jong | 2013-12-15 | 2 | -14/+13
* Remove annoying debug log message | Arthur de Jong | 2013-12-15 | 1 | -2/+1
* Store link and page problems as unicode | Arthur de Jong | 2013-12-02 | 1 | -4/+10
* Only convert content if link has encoding | Arthur de Jong | 2013-12-02 | 1 | -1/+2
* Move static files to webcheck/static | Arthur de Jong | 2013-12-02 | 7 | -28/+783
* Fix missing import | Arthur de Jong | 2013-12-02 | 1 | -0/+1
* Fix setuptools entry point invocation | Arthur de Jong | 2013-12-02 | 1 | -1/+1
* Support older versions of Jinja | Arthur de Jong | 2013-11-18 | 1 | -2/+5
* Optimise count_parents() | Arthur de Jong | 2013-10-06 | 1 | -11/+4
* Use crawler.base_urls instead of crawler.bases | Arthur de Jong | 2013-09-28 | 2 | -35/+32
* Introduce a site_name in the crawler | Arthur de Jong | 2013-09-28 | 3 | -5/+7
* Fix old and new templates to use datetime objects | Arthur de Jong | 2013-09-28 | 4 | -14/+8
* Fix time formatting | Arthur de Jong | 2013-09-28 | 1 | -1/+1
* Get response size and modified date from request | Arthur de Jong | 2013-09-28 | 1 | -3/+9
* Add missing template changes from Jinja merge | Arthur de Jong | 2013-09-22 | 3 | -13/+40
* Remove unused code | Arthur de Jong | 2013-09-22 | 3 | -259/+0
* Switch plugins to use template | Arthur de Jong | 2013-09-22 | 24 | -458/+671
* Introduce template macros for rendering links | Arthur de Jong | 2013-09-22 | 2 | -0/+86
* Introduce a base template | Arthur de Jong | 2013-09-22 | 2 | -0/+64
* Provide function for template-based report rendering | Arthur de Jong | 2013-09-22 | 3 | -3/+25
* Properly write an UTF-8 encoded output file | Arthur de Jong | 2013-09-22 | 1 | -8/+9
* Explicityly close database sessions | Arthur de Jong | 2013-09-22 | 13 | -11/+27
* Initialise crawler with a configuration | Arthur de Jong | 2013-09-20 | 2 | -74/+57
* Expose configured plugins via crawler.plugins | Arthur de Jong | 2013-09-20 | 3 | -34/+29
* Get default configuration from config module | Arthur de Jong | 2013-09-20 | 2 | -10/+20
* Use the argparse Python module | Arthur de Jong | 2013-09-20 | 1 | -136/+110
* pass a string to RobotFileParser because of problems with... | Arthur de Jong | 2012-08-29 | 1 | -1/+1
* now setup.py gets homepage and version from webcheck/__in... | Devin Bayer | 2011-11-19 | 2 | -4/+3
* cleanup after introduction of entry_point | Devin Bayer | 2011-11-16 | 1 | -11/+3
* move cmd.py to package to support an entry point called w... | Devin Bayer | 2011-11-16 | 1 | -0/+208
* in old html parser, handle more invalid encodings | Devin Bayer | 2011-11-16 | 1 | -4/+3
* support MAX_DEPTH == 0 | Devin Bayer | 2011-11-16 | 1 | -1/+1
* detect self-referencing redirects even with intermediate ... | Devin Bayer | 2011-11-16 | 1 | -7/+10
* fix encoding issues with strings passed to/from tidy | Arthur de Jong | 2011-11-08 | 2 | -2/+4
* implement a MAX_DEPTH configuration option to limit crawl... | Arthur de Jong | 2011-11-04 | 3 | -3/+14
* simplification in size calculation | Arthur de Jong | 2011-10-14 | 1 | -9/+7
* switch to using the logging framework | Arthur de Jong | 2011-10-14 | 7 | -119/+66
* simplify logging of depth | Arthur de Jong | 2011-10-14 | 1 | -2/+1
* fix typo that resulted in bad links not being reported as... | Arthur de Jong | 2011-10-08 | 1 | -1/+1
* fix missing import (broken in r452) | Arthur de Jong | 2011-10-08 | 1 | -1/+1
* also handle exceptions while parsing (e.g. issue when rea... | Arthur de Jong | 2011-10-08 | 1 | -6/+9
* ensure that the database is emptied completely and move t... | Arthur de Jong | 2011-10-08 | 2 | -12/+21
* switch to using MozillaCookieJar because LWPCookieJar has... | Arthur de Jong | 2011-10-08 | 1 | -2/+2
* rename Crawler.add_internal() to Crawler.add_base() and a... | Arthur de Jong | 2011-10-07 | 1 | -9/+25
* rename Site to Crawler | Arthur de Jong | 2011-10-07 | 17 | -38/+38