parsing options

David Relson relson at osagesoftware.com
Thu May 15 17:23:31 CEST 2003


Greetings,

With today's revision to bogofilter-0.12.3.cvs, some of the command line 
switches and config file options have been renamed.  The changes have been 
made to provide greater clarity and uniformity.  Hopefully, y'all will 
think so, too.

The following command line switches and config file options are understood:

-Pu, -PU upper_case
-Ph, -PH header_line_markup
-Pt, -PT tokenize_html_tags

For the switches, a lower case letter enables the option (turns it on) and 
an upper case letter disables it (turns it off).

Bogofilter's default options correspond to "-PUHT", i.e. upper_case, 
header_line_markup, and tokenize_html_tags are all disabled.  "-PUHT" has 
bogofilter operating the same as always.
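
To make the switch convention concrete, here is a minimal sketch in C of how 
the letters following "-P" could be mapped onto the three options.  It is not 
bogofilter's actual code: the struct, the function name, and the error 
handling are all hypothetical and chosen only for illustration.

```c
#include <stdbool.h>

/* Hypothetical container for the three parsing options named above. */
struct parse_opts {
    bool upper_case;
    bool header_line_markup;
    bool tokenize_html_tags;
};

/* Defaults equivalent to "-PUHT": all three options disabled. */
static const struct parse_opts PARSE_OPTS_DEFAULT = { false, false, false };

/*
 * Apply the letters that follow "-P" (for example "uH") to *opts.
 * A lower case letter enables an option, an upper case letter disables it.
 * Returns 0 on success, -1 if an unknown letter is seen (an assumption;
 * the real program may react differently).
 */
static int apply_parse_flags(const char *letters, struct parse_opts *opts)
{
    for (const char *p = letters; *p != '\0'; p++) {
        switch (*p) {
        case 'u': opts->upper_case = true;          break;
        case 'U': opts->upper_case = false;         break;
        case 'h': opts->header_line_markup = true;  break;
        case 'H': opts->header_line_markup = false; break;
        case 't': opts->tokenize_html_tags = true;  break;
        case 'T': opts->tokenize_html_tags = false; break;
        default:  return -1;
        }
    }
    return 0;
}
```

Starting from PARSE_OPTS_DEFAULT and applying "UHT" leaves everything 
disabled, while applying "h" turns on header_line_markup only.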

Note 1: The names of the first two config file options have changed, so 
you will need to edit your config file if you have been using those 
options.

Note 2:  Option '-Pu' makes bogofilter case insensitive, which is the way 
bogofilter has operated since day 1.  With "-Pu", words "test", "Test", and 
"TEST" are all converted to "test".  All the recent tests of case folding 
(or not folding) indicate that changing upper case to lower case is 
really, really bad.
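
For readers unfamiliar with the term, the case folding described in Note 2 
amounts to something like the following C sketch.  The function name is 
hypothetical and this is not taken from bogofilter's lexer; it only 
illustrates the effect on a single token.

```c
#include <ctype.h>

/*
 * Fold a token to lower case in place, so that "test", "Test", and "TEST"
 * all end up as "test".  The cast to unsigned char avoids undefined
 * behaviour in tolower() for bytes with the high bit set.
 */
static void fold_token_case(char *token)
{
    for (char *p = token; *p != '\0'; p++)
        *p = (char)tolower((unsigned char)*p);
}
```

Skipping this step keeps "Test" and "TEST" as tokens distinct from "test".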

Note 3:  The underlying code for "tokenize_html_tags" is not yet in 
cvs.  At present it's using flex's unput() function, which will cause flex 
to complain "input buffer overflow, can't enlarge buffer because scanner 
uses REJECT" and then exit the program.  I know (roughly) what needs to 
change in the code to make this work, but haven't had time to make the 
change yet.  When the problem is fixed, the code will be released.

David




