Arthur de Jong

Open Source / Free Software developer

Commit message | Author | Date | Files | Lines
...
* switch to using the logging framework | Arthur de Jong | 2011-10-14 | 8 | -131/+85
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@457 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* simplify logging of depth | Arthur de Jong | 2011-10-14 | 1 | -2/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@456 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix typo that resulted in bad links not being reported as page problems | Arthur de Jong | 2011-10-08 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@455 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix missing import (broken in r452) | Arthur de Jong | 2011-10-08 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@454 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* also handle exceptions while parsing (e.g. issue when reading the response times out) | Arthur de Jong | 2011-10-08 | 1 | -6/+9
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@453 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* ensure that the database is emptied completely and move the code to webcheck.db | Arthur de Jong | 2011-10-08 | 2 | -12/+21
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@452 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* switch to using MozillaCookieJar because LWPCookieJar has issues with some dates (http://bugs.python.org/issue5537) | Arthur de Jong | 2011-10-08 | 1 | -2/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@451 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* rename Crawler.add_internal() to Crawler.add_base() and automatically initialise database connection when needed | Arthur de Jong | 2011-10-07 | 2 | -15/+26
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@450 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* ignore generated coverage files | Arthur de Jong | 2011-10-07 | 0 | -0/+0
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@449 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* rename Site to Crawler | Arthur de Jong | 2011-10-07 | 18 | -55/+54
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@448 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* move some more initialisation from cmd to crawler and make imports of config and debugio consistent | Arthur de Jong | 2011-10-07 | 14 | -70/+66
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@447 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* move some file-handling functions to webcheck.util | Arthur de Jong | 2011-10-07 | 4 | -104/+114
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@446 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix webcheck.config import | Arthur de Jong | 2011-10-07 | 1 | -4/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@445 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix some remaining (previously relative) imports | Arthur de Jong | 2011-10-07 | 2 | -4/+4
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@444 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* remove unnecessary imports | Arthur de Jong | 2011-10-07 | 5 | -9/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@443 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* move profiling flag from config to cmd | Arthur de Jong | 2011-10-07 | 2 | -7/+8
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@442 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* move version and homepage definition from config to the webcheck package | Arthur de Jong | 2011-10-07 | 5 | -14/+36
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@441 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* remove the unsupported userpass option | Arthur de Jong | 2011-10-07 | 1 | -7/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@440 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* remove unused/unsupported configuration options | Arthur de Jong | 2011-09-16 | 1 | -16/+0
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@439 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* pass the IO timeout to urllib2 | Arthur de Jong | 2011-09-16 | 1 | -3/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@438 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* remove old compatibility code | Arthur de Jong | 2011-09-16 | 1 | -5/+0
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@437 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* use fully qualified plugin names | Arthur de Jong | 2011-09-16 | 4 | -35/+35
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@436 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* move all the code except the command-line handling to the webcheck package and reorganise imports accordingly | Arthur de Jong | 2011-09-16 | 28 | -286/+294
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@435 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* show a better estimate of the number of links remaining | Arthur de Jong | 2011-09-11 | 1 | -2/+9
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@434 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix ### more... links | Arthur de Jong | 2011-09-11 | 1 | -1/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@433 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix Vcs-Browser URL | Arthur de Jong | 2011-09-04 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@432 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* make plugins get their own session and split postprocessing and report generation | Arthur de Jong | 2011-08-20 | 16 | -58/+92
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@431 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* do some performance tuning to ensure that the reports are generated a little faster | Arthur de Jong | 2011-08-19 | 10 | -106/+93
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@430 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* use sqltap to save profiling information from sqlalchemy if the module is available and we're profiling | Arthur de Jong | 2011-08-19 | 1 | -0/+8
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@429 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix a problem with the tooltip coming up in the wrong location | Arthur de Jong | 2011-08-19 | 1 | -16/+9
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@428 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* make source code changes to follow PEP8 more | Arthur de Jong | 2011-08-18 | 27 | -185/+249
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@427 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* re-enable the anchors plugin | Arthur de Jong | 2011-08-10 | 4 | -32/+61
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@426 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* make all relationships into filterable collections and several smaller tweaks to improve database access | Arthur de Jong | 2011-08-10 | 4 | -31/+31
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@425 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* ensure that the cookies file is generated in the output directory | Arthur de Jong | 2011-08-10 | 2 | -18/+22
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@424 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* log "items left to check" when something actually happened and commit changes after crawling all links | Arthur de Jong | 2011-08-10 | 1 | -3/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@423 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* small style updates to SQLAlchemy constructs | Arthur de Jong | 2011-08-04 | 10 | -14/+12
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@422 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* use SQLAlchemy to store crawled website data to improve scalability | Arthur de Jong | 2011-08-04 | 23 | -952/+596
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@421 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* move some things from config to webcheck module | Arthur de Jong | 2011-06-18 | 2 | -9/+11
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@420 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* monkeypatch the robotparser module to improve upon some functionality | Arthur de Jong | 2011-06-18 | 2 | -6/+88
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@419 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* remove Python 2.3 support | Arthur de Jong | 2011-06-18 | 1 | -6/+0
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@418 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* switch to using urllib2 for crawling (this is mostly functional now) | Arthur de Jong | 2011-06-18 | 6 | -528/+86
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@417 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* switch to dh_python2 | Arthur de Jong | 2011-03-06 | 2 | -4/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@416 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* small simplification | Arthur de Jong | 2010-09-23 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@415 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* handle the case where hostname is empty | Arthur de Jong | 2010-09-23 | 1 | -0/+4
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@414 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* spelling fixes | Arthur de Jong | 2010-09-18 | 2 | -3/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@413 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* get files ready for 1.10.4 release (tag: 1.10.4) | Arthur de Jong | 2010-09-11 | 6 | -6/+116
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@411 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* update copyright years | Arthur de Jong | 2010-09-11 | 8 | -9/+9
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@410 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* direct bugreports to mailing list instead of personal address | Arthur de Jong | 2010-09-11 | 2 | -3/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@409 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* upgrade to standards-version 3.9.1 | Arthur de Jong | 2010-09-11 | 1 | -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@408 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* drop removing legacy configuration (/etc/webcheck) as this directory was already removed in etch | Arthur de Jong | 2010-09-11 | 1 | -21/+0
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@407 86f53f14-5ff3-0310-afe5-9b438ce3f40c