|
2005-07-25 17:17 arthur
* webcheck.1: fix typo, thanks to Stefan Schröder
<stefan@tokonoma.de>
2005-07-25 17:16 arthur
* plugins/slow.py: only report on internal links
2005-07-25 17:13 arthur
* parsers/css.py: empty module as placeholder to parse css
(referenced from __init__.py already)
2005-07-25 17:11 arthur
* parsers/html.py: don't replace an already set title
2005-07-24 09:32 arthur
* ChangeLog: add ChangeLog for release
2005-07-24 09:30 arthur
* NEWS, TODO: get files ready for release
2005-07-24 08:56 arthur
* README: clean up README removing sections that should be in the
manual page
2005-07-24 08:55 arthur
* config.py, plugins/new.py, plugins/old.py, plugins/whatsnew.py,
plugins/whatsold.py: rename whatsold and whatsnew plugins to old
and new
2005-07-24 08:52 arthur
* schemes/http.py: handle socket errors properly
2005-07-24 08:52 arthur
* schemes/http.py: fix for incomplete change in r76, now version
should not be referenced any more
2005-07-24 08:49 arthur
* plugins/__init__.py, plugins/badlinks.py, plugins/external.py,
plugins/images.py, plugins/notchkd.py, plugins/notitles.py,
plugins/problems.py, plugins/sitemap.py, plugins/slow.py,
plugins/urllist.py, plugins/whatsnew.py, plugins/whatsold.py: call
make_link() with a link object instead of a url, removing the need
for a mySite in plugins
2005-07-24 08:47 arthur
* plugins/badlinks.py: remove HTTP status code handling from here as
this should be done by the HTTP module
2005-07-24 08:47 arthur
* plugins/whatsnew.py, plugins/whatsold.py: only report on internal
links
2005-07-24 08:46 arthur
* crawler.py: only add links to crawl list if they are not in there
already
2005-07-24 08:45 arthur
* debugio.py: flush stdout after each message so that redirecting
stdout and stderr together to a file works reliably
2005-07-23 14:02 arthur
* crawler.py: fix regular expression matching
2005-07-23 12:55 arthur
* config.py, plugins/__init__.py, schemes/http.py, version.py,
webcheck.1, webcheck.py: integrate version.py into config.py, clean
up config.py removing unused settings and clean up boolean types
2005-07-23 11:00 arthur
* config.py, webcheck.1, webcheck.py: remove logo option since the
current output does not use one
2005-07-23 10:53 arthur
* schemes/file.py: most systems already know about .shtml files
2005-07-23 08:34 arthur
* BUGS, INSTALL, README, webcheck.1: first step in cleaning up
documentation, integrating INSTALL in README and BUGS in manual
page and adding section on robots handling in manual
2005-07-23 08:28 arthur
* AUTHORS, crawler.py, debugio.py, parsers/html.py,
plugins/__init__.py, plugins/about.py, plugins/badlinks.py,
plugins/external.py, plugins/images.py, plugins/notchkd.py,
plugins/notitles.py, plugins/problems.py, plugins/sitemap.py,
plugins/slow.py, plugins/whatsnew.py, plugins/whatsold.py,
schemes/file.py, schemes/ftp.py, schemes/http.py, version.py,
webcheck.1, webcheck.py: Mike Meyer -> Mike W. Meyer
2005-07-22 21:21 arthur
* crawler.py: add support for sleep between requests
2005-07-22 21:11 arthur
* webcheck.py: don't add . to python path as it's not needed and put
command line handling in same order as options
2005-07-22 21:05 arthur
* plugins/__init__.py, webcheck.css: change to a simpler layout
that should also work in MSIE
2005-07-22 21:04 arthur
* debugio.py: fix docstrings
2005-07-22 21:01 arthur
* plugins/__init__.py, webcheck.py: do not use start_time from
webcheck, saving an import
2005-07-22 19:17 arthur
* crawler.py, myUrlLib.py, parsers/__init__.py, parsers/html.py,
plugins/__init__.py, plugins/badlinks.py, plugins/external.py,
plugins/images.py, plugins/notchkd.py, plugins/notitles.py,
plugins/sitemap.py, plugins/slow.py, plugins/urllist.py,
plugins/whatsnew.py, plugins/whatsold.py, schemes/__init__.py,
schemes/file.py, schemes/ftp.py, schemes/http.py, webcheck.py:
almost complete rewrite of crawling and site state code making
children and parents link objects instead of urls and giving link
member variables better names, change plugins accordingly, make
scheme handling more pluggable and only use one function call and
have a better pluggable structure for content parsing (currently
only html)
2005-07-17 08:46 arthur
* myUrlLib.py, plugins/__init__.py, plugins/badlinks.py,
plugins/external.py, plugins/images.py, plugins/notitles.py,
plugins/problems.py, plugins/sitemap.py, plugins/slow.py,
plugins/urllist.py, plugins/whatsnew.py, plugins/whatsold.py,
schemes/file.py, schemes/ftp.py, schemes/http.py, webcheck.py: use
lowercase url attribute in Link instead of uppercase URL
2005-07-16 15:35 arthur
* plugins/__init__.py, plugins/badlinks.py, plugins/external.py,
plugins/images.py, plugins/notchkd.py, plugins/notitles.py,
plugins/problems.py, plugins/rptlib.py, plugins/sitemap.py,
plugins/slow.py, plugins/urllist.py, plugins/whatsnew.py,
plugins/whatsold.py, webcheck.py: move functionality of rptlib.py
to __init__.py so that we can just use the plugins package
2005-07-16 15:33 arthur
* plugins/__init__.py: remove __init__.py to be replaced by contents
of rptlib.py
2005-07-16 10:24 arthur
* webcheck.1: add note about pattern matching
2005-07-10 14:08 arthur
* myUrlLib.py, schemes/__init__.py, schemes/file.py, schemes/ftp.py,
schemes/http.py: rework scheme code to use more logical function
names, more clearly mark internal functions and do some major
cleanup of the scheme modules code
2005-07-10 12:26 arthur
* myUrlLib.py, plugins/whatsnew.py, plugins/whatsold.py,
schemes/file.py, schemes/http.py: store mtime in link object
instead of age in days
2005-07-10 12:00 arthur
* schemes/ftp.py, webcheck.py: remove unneeded import and print
2005-07-09 20:22 arthur
* htmlparse.py, myUrlLib.py, parsers, parsers/__init__.py,
parsers/html.py: move htmlparse to a more generic parsers package,
cleaning up the code and simplifying dependencies
2005-07-09 13:54 arthur
* plugins/about.py, plugins/badlinks.py, plugins/external.py,
plugins/images.py, plugins/notchkd.py, plugins/notitles.py,
plugins/problems.py, plugins/rptlib.py, plugins/sitemap.py,
plugins/slow.py, plugins/urllist.py, plugins/whatsnew.py,
plugins/whatsold.py, webcheck.css, webcheck.py: clean up html
output generating xhtml 1.1 without frames and using css for
styling also getting rid of the images
2005-07-04 21:25 arthur
* config.py: put plugins in a more logical order
2005-07-04 20:39 arthur
* plugins/badlinks.py, plugins/external.py, plugins/images.py,
plugins/notchkd.py, plugins/notitles.py, plugins/rptlib.py,
plugins/sitemap.py, plugins/slow.py, plugins/urllist.py,
plugins/whatsnew.py, plugins/whatsold.py: implement consistent
sorting of all lists removing sort functions from rptlib and using
lambda functions where needed
2005-07-03 07:04 arthur
* config.py, plugins/rptlib.py, schemes/http.py, webcheck.1: handle
and document proxy settings with environment variables
2005-07-03 06:36 arthur
* INSTALL, README, config.py, myUrlLib.py, plugins/rptlib.py,
schemes/http.py, webcheck.1, webcheck.py: name webcheck with lower
case
2005-06-28 20:32 arthur
* schemes/http.py: clean up get_reply() function to use proper
recursion and not use self where it doesn't make sense
2005-06-22 19:24 arthur
* COPYING, debugio.py, htmlparse.py, myUrlLib.py, plugins/about.py,
plugins/badlinks.py, plugins/external.py, plugins/images.py,
plugins/notchkd.py, plugins/notitles.py, plugins/problems.py,
plugins/rptlib.py, plugins/sitemap.py, plugins/slow.py,
plugins/urllist.py, plugins/whatsnew.py, plugins/whatsold.py,
schemes/file.py, schemes/ftp.py, schemes/http.py, version.py,
webcheck.1, webcheck.py: change to most recent version of the GPL
(FSF address change) and update notices
2005-06-18 19:59 arthur
* plugins/external.py: sort external links by url
2005-06-18 13:48 arthur
* webcheck.py: split main() part into its own function
2005-06-18 13:32 arthur
* plugins/rptlib.py, webcheck.py: restructure a couple of things to
reduce the number of mutual imports and reduce the amount of stuff
gathered in webcheck.py
2005-06-18 13:31 arthur
* config.py, plugins/urllist.py: add simple urllist plugin to list
all visited urls
2005-06-18 13:20 arthur
* plugins/sitemap.py: only include internal links in sitemap
2005-06-18 12:49 arthur
* config.py, webcheck.py: add problems plugin to config instead of
hard-coding
2005-06-18 10:25 arthur
* plugins/rptlib.py: remove ugly redirection for overwrite file
question since we now write all html through a file descriptor
2005-06-15 21:01 arthur
* TODO, myUrlLib.py, plugins/about.py, plugins/badlinks.py,
plugins/external.py, plugins/images.py, plugins/notchkd.py,
plugins/notitles.py, plugins/problems.py, plugins/rptlib.py,
plugins/sitemap.py, plugins/slow.py, plugins/whatsnew.py,
plugins/whatsold.py, schemes/http.py, webcheck.py: pass reference
to Link class to plugins with parameter and make import config
where it is used instead of accessing it through another module
2005-06-15 20:55 arthur
* myUrlLib.py, plugins/rptlib.py, plugins/sitemap.py, webcheck.py:
make use of base consistent, do not modify it to make a nicer url
(at least not now) and do not overwrite it with something silly
from webcheck.py
2005-06-14 19:17 arthur
* myUrlLib.py: also set URL attribute on yanked links
2005-06-12 06:21 arthur
* plugins/badlinks.py, plugins/images.py, plugins/notchkd.py,
plugins/notitles.py: again use the url as link title for some
links
2005-06-11 21:52 arthur
* httpcodes.py, plugins/about.py, plugins/badlinks.py,
plugins/external.py, plugins/images.py, plugins/notchkd.py,
plugins/notitles.py, plugins/problems.py, plugins/rptlib.py,
plugins/sitemap.py, plugins/slow.py, plugins/whatsnew.py,
plugins/whatsold.py: general cleanup of plugins structure and
code, moving httpcodes to the only place they were used, cleaning
up plugin titles, version numbers and descriptions, adding
docstrings and using slightly more logical and consistent names
(plus some other cleanups)
2005-06-11 21:39 arthur
* plugins/rptlib.py: make_link(): if no title is specified, try to
look up the title of the page and fallback to the url as title
2005-06-11 21:24 arthur
* plugins/about.py: adapt plugin to using file descriptor etc
2005-06-11 18:52 arthur
* contrib, plugins/about.py: move about plugin to plugins directory
2005-06-08 19:29 arthur
* plugins/badlinks.py, plugins/external.py, plugins/images.py,
plugins/notchkd.py, plugins/notitles.py, plugins/problems.py,
plugins/rptlib.py, plugins/sitemap.py, plugins/slow.py,
plugins/whatsnew.py, plugins/whatsold.py, webcheck.py: write html
files using file descriptors instead of through redirection using
stdout, split writing of navigation frame and plugin pages plus
some minor cleanups to calling plugins
2005-06-08 19:10 arthur
* plugins/__init__.py, schemes/__init__.py: claiming copyright on
empty files is silly
2005-06-06 21:22 arthur
* debugio.py, htmlparse.py, myUrlLib.py, plugins/rptlib.py,
schemes/ftp.py, schemes/http.py, webcheck.1, webcheck.py: redo
output writing using a cleaner debugio and change debug command
line option
2005-06-06 20:11 arthur
* plugins/badlinks.py, plugins/notchkd.py: replace a couple more
tabs
2005-06-06 20:05 arthur
* webcheck.1: initial version of manual page loosely based on
documentation
2005-06-06 19:22 arthur
* AUTHORS: added myself as copyright holder and added Bastian
Kleineidam (previous debian package maintainer) as contributor
2005-06-06 19:20 arthur
* webcheck.py: small text improvement
2005-05-27 20:39 arthur
* webcheck.sh: remove unneeded shell script
2005-05-27 20:28 arthur
* webcheck.py: also support --force
2005-05-27 20:18 arthur
* webcheck.py: redo command-line checking
2005-04-13 19:41 arthur
* contrib/plugins/about.py: general cleanup
2005-04-13 19:41 arthur
* plugins/sitemap.py: rework recursion to make it simpler plus some
general cleanups
2005-04-13 19:20 arthur
* contrib/plugins/about.py, myUrlLib.py, plugins/badlinks.py,
plugins/external.py, plugins/images.py, plugins/notchkd.py,
plugins/notitles.py, plugins/problems.py, plugins/rptlib.py,
plugins/sitemap.py, plugins/slow.py, plugins/whatsnew.py,
plugins/whatsold.py, schemes/http.py, webcheck.py: rename linkList
to linkMap
2005-04-13 19:18 arthur
* myUrlLib.py, robotparser.py: remove local copy of robotparser,
just use python's
2005-04-09 20:03 arthur
* myUrlLib.py: qualify references to types functions
2005-04-09 13:48 arthur
* htmlparse.py, myUrlLib.py, plugins/badlinks.py,
plugins/external.py, plugins/images.py, plugins/notchkd.py,
plugins/notitles.py, plugins/rptlib.py, plugins/slow.py,
plugins/whatsnew.py, plugins/whatsold.py, schemes/http.py: indent
with spaces instead of tabs (tabs are evil)
2005-04-08 21:31 arthur
* myUrlLib.py: move finding of scheme module to separate function
2005-04-08 21:25 arthur
* schemes/http.py: rebump loglevel to debug
2005-04-08 16:24 arthur
* myUrlLib.py, schemes/file.py, schemes/filelink.py, schemes/ftp.py,
schemes/ftplink.py, schemes/http.py, schemes/httplink.py: remove
link part from scheme modules
2005-04-07 22:37 arthur
* schemes/httplink.py: clean up http request code a little and do
not set host header (it is sent by HTTPConnection already)
2005-04-07 20:29 arthur
* contrib/plugins/about.py, debugio.py, htmlparse.py, httpcodes.py,
myUrlLib.py, plugins/__init__.py, plugins/badlinks.py,
plugins/external.py, plugins/images.py, plugins/notchkd.py,
plugins/notitles.py, plugins/problems.py, plugins/rptlib.py,
plugins/sitemap.py, plugins/slow.py, plugins/whatsnew.py,
plugins/whatsold.py, schemes/__init__.py, schemes/filelink.py,
schemes/ftplink.py, version.py, webcheck.py: make nicer file
(copyrights) headers
2005-04-07 20:23 arthur
* schemes/httplink.py: fix problem with incorrect indent
2005-04-07 20:06 arthur
* config.py, httpcodes.py, plugins/notitles.py: tabs to spaces (tabs
are evil)
2005-04-07 20:05 arthur
* config.py, contrib/plugins/about.py, httpcodes.py,
plugins/badlinks.py, plugins/external.py, plugins/notchkd.py,
plugins/notitles.py, plugins/problems.py, plugins/rptlib.py,
plugins/sitemap.py, plugins/slow.py, plugins/whatsnew.py,
plugins/whatsold.py, schemes/filelink.py, schemes/ftplink.py,
schemes/httplink.py: tabs to spaces (tabs are evil)
2005-04-07 20:04 arthur
* AUTHORS, schemes/httplink.py: include patch from Sebastien
Delafond <sdelafond@gmx.net> (from http://bugs.debian.org/286017)
to fix problems with recent versions of python
2005-04-06 19:38 arthur
* INSTALL, config.py, htmlparse.py, plugins/images.py,
plugins/rptlib.py, schemes/ftplink.py, schemes/httplink.py,
webcheck.css, webcheck.py: import Debian package patches
2005-03-31 12:47 arthur
* COPYING: install updated file without millennium bug
2005-03-31 12:45 arthur
* AUTHORS: reformat file to better match suggested layout
2005-03-31 12:44 arthur
* NEWS: put news items in a little more standard format
2005-03-31 12:42 arthur
* AUTHORS, CHANGES, CREDITS, ChangeLog-1999, ChangeLog-2002,
HISTORY, HISTORY.linbot, NEWS: rename files to more standard names
2005-03-31 12:32 arthur
* config.py, plugins/rptlib.py, version.py: remove checks for
updates (registry)
2005-03-31 12:28 arthur
* ., contrib, contrib/plugins, plugins, schemes: ignore compiled
python objects
2005-03-29 12:08 arthur
* BUGS, CHANGES, COPYING, CREDITS, HISTORY, HISTORY.linbot, INSTALL,
README, TODO, config.py, contrib, contrib/plugins,
contrib/plugins/about.py, debugio.py, htmlparse.py, httpcodes.py,
myUrlLib.py, plugins, plugins/__init__.py, plugins/badlinks.py,
plugins/external.py, plugins/images.py, plugins/notchkd.py,
plugins/notitles.py, plugins/problems.py, plugins/rptlib.py,
plugins/sitemap.py, plugins/slow.py, plugins/whatsnew.py,
plugins/whatsold.py, robotparser.py, schemes, schemes/__init__.py,
schemes/filelink.py, schemes/ftplink.py, schemes/httplink.py,
version.py, webcheck.css, webcheck.py, webcheck.sh: import of
release 1.0
2005-03-28 12:57 arthur
* .: create webcheck directory
|