blob: a56040806d71fa4069778828234dab4bea93d07d (
plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
|
changes from 1.9.0 to 1.9.1
---------------------------
* ship an empty css.py to actually run
* small bugfixes for pages with multiple titles and slow plugin
changes from 1.0 to 1.9.0
-------------------------
* maintainership transferred to Arthur de Jong
* major structural rewrites of crawling code and plugin structure
* the documentation was combined and partially rewritten in the README for
installation instructions and the manual page for usage information
* changed output to no longer use frames and produce valid XHTML 1.1 and use
CSS for layout
* config.py is no longer really a configuration file
changes from 1.0b10 to 1.0
--------------------------
+ Don't send accept headers, as they weren't valid.
+ WARN_OLD_VERSION no longer works, until I decide what to do about it.
+ Named changed to webcheck.
+ Fixed typos in INSTALL.
+ Changes so it works with python 2.0.
changes from 1.0b9 to 1.0b10
----------------------------
b Fixed bug when server redirects to a document in robots.txt (does not show
up as broken (hopefully))
+ Filename mangling in filelink.py to help OS/2 (and Win32) (Patch submitted
by Steffen Siebert <siebert@logware.de>
+ Added WARN_OLD_VERSION config.py option. If this option is set to true
(the default) Linbot will check it's version number and the version
numbers of it's plugins against a global registry on the Net. If it
finds that a version is not the latest, it will print a warning on the
reports along with a link you can follow to download the latest version.
I think it's neat. You might find it annoying.
+ Added preliminary support for authenticating proxies, though it does not
work correctly yet.
+ Added -r (redirect depth) and REDIRECT_DEPTH option in config.py to indicate
the amount of redirects Linbot should follow when following a link. Thanks
to Andrea Glorioso <sama@intercity.it> for the patch.
+ Added debugio module that handles debugging and I/O
+ Added -q (quiet option). Use it to suppress output
+ Added -d (debug) option and DEBUG_LEVEL variable in config.py for debugging
+ added version module and removed __version__ and __author__ from all the
modules (except plugins).
b Fixed bug in Linbot using putrequest() instead of putheader() when requesting
header information. Thanks to Andrea Glorioso <sama@intercity.it> for
fixing this glitch (and Seth Chaiklin <seth@psy.au.dk> for noticing).
changes from 1.0b8 to 1.0b9
---------------------------
+ If you use the -o command-line option or the OUTPUT_DIR config file option
and the directory does not exist, linbot will create it for you (provided
that it has the correct permissions, etc.) Thanks to Andrea Glorioso
<sama@intercity.it> for this feature.
+ Added a CREDITS file and probably left a lot of people out. If you think
you should be in it let me know (marduk@python.net).
b Linbot will now report to the server that it can accept any MIME type (found
in mimetypes.py. This should fix the "406: No acceptable objects found"
error that some servers report.
b Linbot correctly identifies itself as "Linbot <version>" on HEAD requests
as well as GET requests.
changes from 1.0b6 to 1.0b8
---------------------------
b Fixed bug when no images are reported for documents having 0 links
If you don't know what this means it probably wasn't a problem for you.
b Fixed code that was messing with arguments passed via -x and -y and caused
unexpected results and/or errors.
b -b flag should work this time (for real)
b Cosmetic changes (reports didn't look the way I thought they should in IE4.
(and may not still as I havent' had a chance to check it yet)
b Linbot won't follow infinite redirects (currently hardcoded to max of 5
redirects per document)
changes from 1.0b5 to 1.0b6
---------------------------
+ Minor change in ftplink.py should allow better ftp link checking
+ You can now press CTRL-C (or whatever your operating system supports) to break
out of a linbot run. However, the work linbot does is not saved (yet).
b Fixed problem when server redirects a URL to itself. This fix seems to work
for most servers I've tried but there are a few more out there that I need to
take a look at.
b Fixed bug that caused linbot to not check for yanked URLs
+ Added -l command-line option. Usage: -l <url> where <url> is a url pointing
to an image to be used as the report's logo.
b "patched" strings.py so that it can better parse html files created in
Windows/DOS (I think).
+ Made report LOGO a link to the base url
+ httplink does not HEAD a redirected URL if it is already in the link list
(performance improvement)
- Removed LOGO_ALT from config.py
+ Changed my email address to marduk@python.net. The official home page of
Linbot will probaby also change with the next release so stay tuned.
changes from 1.0b4 to 1.0b5
---------------------------
+ Added a contrib directory. Right now it just contains the about plugin. Other
plugins will be included if people contribute them. Also, the man page will
return once I have updated it. Those ugly buttons are obsolete.
+ Linbot now "inlines" stylesheets. This has the benefits of 1) better support
of Netscape browsers (so I hear) and 2) I don't have to document to put
linbot.css in the output directory since it grabs it from starship 8*)
b Handling of error for when robots.txt cannot be retreived.
+ Malformed urls are trapped (sorry, I had that commented out)
b FTP link handling is totally rewritten. Fortunately it shouldn't crash anymore
Unfortunately it doesn't really work reliably and probably never will. See
README.ftp for details.
b Two bugs in HTTP proxy handling made it almost completely unusable, though
conveniently seemed to cancel each other out when I was testing.
b Too many files error on large sites should be fixed. Thanks to Andrew Kuchling
et al for suggestions.
b Bug when some servers erroneously report (or don't report) Content-Length header
fixed.
|