Arthur de Jong

Open Source / Free Software developer

path: root/serialize.py
Commit message | Author | Age | Files | Lines
* store internal, external and yanked regular expressions in a map allowing them to be serialized
  Author: Arthur de Jong | Age: 2006-06-24 | Files: 1 | Lines: -3/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@293 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* do not split list of strings on comma's inside the quoted strings
  Author: Arthur de Jong | Age: 2006-06-04 | Files: 1 | Lines: -2/+4
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@288 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* make DeSerializeException a class instead of a function and add FIXME
  Author: Arthur de Jong | Age: 2006-06-04 | Files: 1 | Lines: -1/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@287 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* raise a custom exception instead of IOError
  Author: Arthur de Jong | Age: 2006-06-02 | Files: 1 | Lines: -9/+11
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@283 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* split crawler.crawl() function into crawler.crawl() and crawler.postprocess() functions
  Author: Arthur de Jong | Age: 2006-05-16 | Files: 1 | Lines: -2/+2
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@279 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* flag deserialized links as changed so they will be reserialized again
  Author: Arthur de Jong | Age: 2006-05-16 | Files: 1 | Lines: -0/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@276 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* import crawler late as to simplify dependencies
  Author: Arthur de Jong | Age: 2006-05-15 | Files: 1 | Lines: -1/+1
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@270 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* fix typo in FIXME
  Author: Arthur de Jong | Age: 2006-05-15 | Files: 1 | Lines: -3/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@269 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* only write serialized data if it is different from the constructor's default value
  Author: Arthur de Jong | Age: 2006-05-15 | Files: 1 | Lines: -10/+20
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@267 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* clear anchors, linkproblems and pageproblems from to be deserialized links to avoid duplicates as a link can be deserialized multiple times
  Author: Arthur de Jong | Age: 2006-05-15 | Files: 1 | Lines: -0/+4
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@266 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* remove the call to crawl() from deserialize as this could be a partial deserialize that needs more tweaking to the site before the call to crawl()
  Author: Arthur de Jong | Age: 2006-05-15 | Files: 1 | Lines: -3/+3
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@265 86f53f14-5ff3-0310-afe5-9b438ce3f40c
* add serialize module that allows serializing and deserializing all crawler state (site and links) to and from a file, this module is not called anywhere yet
  Author: Arthur de Jong | Age: 2006-05-07 | Files: 1 | Lines: -0/+313
  git-svn-id: http://arthurdejong.org/svn/webcheck/webcheck@257 86f53f14-5ff3-0310-afe5-9b438ce3f40c