parsing options
David Relson
relson at osagesoftware.com
Thu May 15 17:23:31 CEST 2003
Greetings,
With today's revision to bogofilter-0.12.3.cvs, some of the command line
switches and config file options have been renamed. The changes have been
made to provide greater clarity and uniformity. Hopefully, y'all will
think so, too.
The following command line switches and config file options are understood:
-Pu, -PU upper_case
-Ph, -PH header_line_markup
-Pt, -PT tokenize_html_tags
For the switches, a lower case letter enables the option (turns it on) and
an upper case letter disables it (turns it off).
Bogofilter's default options correspond to "-PUHT", i.e. upper_case,
header_line_markup, and tokenize_html_tags are all disabled. "-PUHT" has
bogofilter operating the same as always.
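For illustration, the switch convention above can be sketched in Python.
This is a hypothetical sketch, not bogofilter's actual option parser
(bogofilter itself is written in C); the name parse_p_switch and the
dictionary layout are assumptions made for the example.

```python
# Hypothetical sketch of the -P switch convention; not bogofilter's real code.
# Letter -> config file option name, as listed in this message.
P_OPTIONS = {
    "u": "upper_case",
    "h": "header_line_markup",
    "t": "tokenize_html_tags",
}


def parse_p_switch(arg, defaults=None):
    """Apply a switch such as '-Pu' on top of the defaults ('-PUHT')."""
    if defaults is None:
        # Defaults: every option disabled, equivalent to "-PUHT".
        options = {name: False for name in P_OPTIONS.values()}
    else:
        options = dict(defaults)
    if not arg.startswith("-P"):
        raise ValueError("not a -P switch: %r" % arg)
    for letter in arg[2:]:
        name = P_OPTIONS.get(letter.lower())
        if name is None:
            raise ValueError("unknown -P letter: %r" % letter)
        # A lower case letter enables the option, an upper case one disables it.
        options[name] = letter.islower()
    return options


print(parse_p_switch("-PUHT"))  # all three options disabled
print(parse_p_switch("-Pu"))    # upper_case enabled, the rest left at default
```

Keeping the letter-to-name table in one place means the switches and the
config file option names cannot drift apart in the sketch.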
Note 1: The names of the first two config file options differ from the
previous ones, so you will need to edit your config file if you have been
using those options.
Note 2: Option '-PU' makes bogofilter case insensitive, which is the way
bogofilter has operated since day 1. With "-PU", words "test", "Test", and
"TEST" are all converted to "test". All the recent tests of case folding
(or not folding) indicate that changing upper case to lower case is
really, really bad.
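The effect of case folding on tokens can be illustrated with a small
Python sketch. It is not bogofilter's lexer (which is generated by flex);
the tokenize function and its upper_case parameter are hypothetical, and
the sketch assumes that folding happens when upper_case is disabled, as
the description of the defaults suggests.

```python
import re


def tokenize(text, upper_case=False):
    """Split text into word tokens; fold to lower case unless upper_case is set."""
    words = re.findall(r"[A-Za-z]+", text)
    if not upper_case:
        # Case folding: "test", "Test", and "TEST" become the same token.
        words = [w.lower() for w in words]
    return words


sample = "test Test TEST"
print(sorted(set(tokenize(sample))))                   # one distinct token
print(sorted(set(tokenize(sample, upper_case=True))))  # three distinct tokens
```

With folding, the three spellings share one entry in the word list;
without it, each spelling is counted separately.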
Note 3: The underlying code for "tokenize_html_tags" is not yet in
cvs. At present it's using flex's unput() function, which will cause flex
to complain "input buffer overflow, can't enlarge buffer because scanner
uses REJECT" and then exit the program. I know (roughly) what needs to
change in the code to make this work, but haven't had time to make the
change yet. When the problem is fixed, the code will be released.
David