Bootstrapping the database in a MUA-embedded scenario

Mikhail Zabaluev mhz at altlinux.org
Wed Aug 10 12:02:53 CEST 2005


Hello,

I've used Bogofilter for some time now, and I love it so much that I've
written an Evolution plugin for it.
Now I have a problem with learning when starting from a clean user setup.
Bogofilter needs at least one ham message in the database for the
algorithm to work properly (at least with the default parameters). The
problem is, Evolution will only let you report a message as non-spam if
it has been classified as spam before (the problem is symmetrical for
spam, but untagged spam messages aren't usually in short supply :)).
I'd like to provide the users with a working setup out of the box,
without resorting to manual learning procedures involving the CLI. The
solution I came up with is to feed bogofilter a made-up seed message
once in order to initialize the ham message count. The message has an
empty body and minimal headers to avoid upsetting the word counts too
much.
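
To make the idea concrete, here is a minimal sketch of the seeding step in
Python. It is an illustration only, not the plugin's actual code. The header
values and the function names are made-up placeholders; the only bogofilter
feature it relies on is the -n option, which registers the message read from
standard input as non-spam.

```python
import subprocess


def build_seed_message(sender="seed@localhost.invalid", subject="seed"):
    """Build a made-up message with minimal headers and an empty body.

    The header values are hypothetical placeholders chosen for this
    sketch; any short, neutral values would serve the same purpose.
    """
    headers = [f"From: {sender}", f"Subject: {subject}"]
    # A blank line ends the header block; nothing follows it, so the
    # body is empty and contributes no tokens to the word counts.
    return ("\n".join(headers) + "\n\n").encode("ascii")


def register_seed_as_ham(bogofilter="bogofilter"):
    """Feed the seed message to bogofilter once, registered as ham.

    Assumes a bogofilter executable is on PATH and that the user's
    default database location is the one to initialize.
    """
    subprocess.run([bogofilter, "-n"], input=build_seed_message(), check=True)
```

A plugin would call register_seed_as_ham() once, on first run, before any
message is classified. The seed still adds the few header tokens to the
database, so the word counts are not left entirely untouched.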
I'd appreciate any comments on my approach. Maybe I'm missing
something.


