What's Coming ...

David Relson relson at osagesoftware.com
Mon Nov 24 01:04:32 CET 2003


As you all know, bogofilter has been evolving over the year of its
existence.  During this time it has changed from a relatively simple
Bayesian filter to a program that knows a fair amount about email.  It
has gone from a simple case-insensitive parser to a case-sensitive
parser that tags header lines and understands multipart MIME messages,
various encodings, plain text vs HTML, etc.  The scoring algorithm has
progressed from the simple Graham algorithm via the Robinson-GM
algorithm to the more sophisticated Robinson-Fisher algorithm.  The
wordlist has evolved from separate ham and spam lists to a single,
combined wordlist that holds all tokens.

At present, bogofilter has options and code that allow it to function
in the older modes as well as the current mode.  Many of these options
were added to ease the transition from an old way of doing things to a
new way.  Examples are the options to select between case
sensitivity/insensitivity, enabling/disabling tagging of header
tokens, separate vs combined wordlists, and so on.  It's time to clean
up the code and remove the old, no longer needed options, algorithms,
modes, etc.

To ease the transition, the old code will first be deprecated, i.e.
disabled using #ifdefs, in the near future and will be removed after
that.  The last release with everything enabled will be
bogofilter-0.15.9.?.  Bogofilter-0.16 will still contain the deprecated
code, disabled by the #ifdefs; the code will be removed for
bogofilter-0.17.
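
For anyone unfamiliar with the technique, here is a rough sketch of what
"disabled using #ifdefs" can look like.  It is an illustration only, not
code from the bogofilter sources: the macro ENABLE_DEPRECATED_CODE, the
enum, and select_algorithm() are all made-up names, and the real guards
may be arranged quite differently.

```c
#include <string.h>

/* Hypothetical build switch (an assumption, not a bogofilter symbol).
 * Building with -DENABLE_DEPRECATED_CODE compiles the old paths back in. */

/* Hypothetical identifiers for the scoring algorithms named in this post. */
typedef enum {
    ALG_FISHER,
    ALG_GRAHAM,
    ALG_ROBINSON_GM,
    ALG_UNKNOWN
} algorithm_t;

/* Map an algorithm name to its identifier.  In a default build only
 * the current algorithm is recognized; the deprecated ones are
 * compiled out and fall through to ALG_UNKNOWN. */
algorithm_t select_algorithm(const char *name)
{
    if (strcmp(name, "fisher") == 0)
        return ALG_FISHER;
#ifdef ENABLE_DEPRECATED_CODE
    /* Deprecated: present in the source, but only built on request. */
    if (strcmp(name, "graham") == 0)
        return ALG_GRAHAM;
    if (strcmp(name, "robinson-gm") == 0)
        return ALG_ROBINSON_GM;
#endif
    return ALG_UNKNOWN;
}
```

With this arrangement the old code stays in the tree for one release
cycle, so anyone who still depends on it can rebuild with the switch
defined, while everyone else gets the trimmed-down behaviour by default.
Deleting the guarded blocks later is then a mechanical step.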

Code targeted for deprecation in 0.16 includes the following:

1 - Graham and Robinson-GM algorithms
2 - separate wordlists
3 - options pertaining to #1 and #2
4 - degeneration code for headers and case sensitivity
5 - command line options -Ph/H, -Pi/I, -Pt/T
6 - config file options: header_line_markup, ignore_case,
    tokenize_html_tags, tokenize_html_script
7 - miscellaneous other items.  Exactly what will become apparent as
    the 0.16 release is prepared.

The expected timetable:

Bogofilter 0.15.9 has been released.  If it's solid, it will be
promoted to "stable" in 7 days; else 0.15.9.1 will be released and the
7 day timer will restart.

The code will be modified to #ifdef the deprecated portions, and CVS
updated, as soon as 0.15.9.x is promoted to stable.  The release of
0.16 will follow shortly after that.

After 0.16.x settles down and is promoted to "stable", the deprecated
code will be removed, CVS updated, and 0.17 will be released.



