bogofilter's default algorithm

Matthias Andree matthias.andree at gmx.de
Tue Jan 21 00:18:15 CET 2003


David Relson <relson at osagesoftware.com> writes:

> One of the big questions amongst the bogofilter developers is:
>
> 	What algorithms are people using with bogofilter?

I'm using the respective default, and I've had to rebuild the databases
often enough in the past because of corruption issues that I believe are
resolved in the current 0.10.0 version -- but I haven't tried
Robinson-Fisher yet. I have had good results with both the Graham and
Robinson methods. Graham's seems to learn faster but is less accurate,
and it tends to become indecisive when strong indicators for either side
(spam vs. ham) are present.
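
For readers who haven't looked at how the two methods differ, here is a
rough sketch of the textbook combining rules, not bogofilter's actual
code. The function names, the clamping range and the 15-token cutoff are
assumptions taken from the published descriptions of the Graham and
Robinson methods; bogofilter's own parameters may differ.

```python
import math


def _clamp(p, lo=0.01, hi=0.99):
    # Keep probabilities away from 0 and 1 so the products never collapse.
    return max(lo, min(hi, p))


def graham_score(probs, max_tokens=15):
    """Graham-style combination: naive Bayes over the most 'interesting'
    tokens, i.e. those whose spam probability is furthest from 0.5."""
    if not probs:
        return 0.5
    picked = sorted((_clamp(p) for p in probs),
                    key=lambda p: abs(p - 0.5), reverse=True)[:max_tokens]
    prod_p = math.prod(picked)
    prod_q = math.prod(1.0 - p for p in picked)
    return prod_p / (prod_p + prod_q)


def robinson_score(probs):
    """Robinson-style combination: geometric means of the spam and
    non-spam probabilities, folded into a single score in [0, 1]."""
    if not probs:
        return 0.5
    ps = [_clamp(p) for p in probs]
    n = len(ps)
    big_p = 1.0 - math.prod(1.0 - p for p in ps) ** (1.0 / n)
    big_q = 1.0 - math.prod(ps) ** (1.0 / n)
    s = (big_p - big_q) / (big_p + big_q)
    return (1.0 + s) / 2.0


if __name__ == "__main__":
    # Made-up per-token spam probabilities, purely for illustration.
    tokens = [0.99, 0.9, 0.8, 0.4, 0.2]
    print("graham  ", graham_score(tokens))
    print("robinson", robinson_score(tokens))
```

With mostly spammy tokens the Graham-style score sits very close to 1,
while the Robinson-style score stays more moderate. When equally strong
indicators for both sides are present, both come out near 0.5.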

I lost a big part of my spam database recently (before the locking
fix was in place), and bogofilter is letting some spam through, but
correcting the scores of those messages is getting bogofilter back on
track quickly.

One thing I observed: with Robinson, bogofilter eventually caught nearly
every piece of spam in mid-flight, leaving next to nothing for
SpamAssassin to look at -- SpamAssassin is the second line of defense
on my machine. With
Graham's algorithm, bogofilter left more work for SpamAssassin -- and I
run SpamAssassin from CVS, so I'd think it's fairly current and improves
steadily.

> I think it's time to promote Robinson-Fisher to bogofilter's standard
> (default) algorithm.  Version 0.10.0 has been released, though it hasn't
> achieved "stable" status as yet.  I'm planning on the algorithm change
> once 0.10.0 is stable.  Note that bogofilter will continue to support
> the older algorithms.  They will still be selectable by command line
> switch or config file option.

I don't have a strong opinion on the Robinson-vs-Robinson-Fisher
algorithm issue. If there are people who say it's better or more useful
or more practical, and no compelling objections, go for RF.

-- 
Matthias Andree



