A Tale of Two Sisters

Pavel Kankovsky peak at argo.troja.mff.cuni.cz
Wed Jul 27 09:10:47 CEST 2005


On Sun, 24 Jul 2005, JoeHill wrote:

> It was kinda hit and miss for both, so I sat down today and did training
> on a whole *whack* of mail from each sister (about 20MB each)

Those 20 megabytes were not a single message but a mailbox containing 
multiple messages, were they?

> joehill@node3:~/mail$ bogofilter -vv < sister1
> joehill@node3:~/mail$ bogofilter -n < sister1

This is not the right way to deal with multiple messages in a mailbox.
It may appear to work at first glance, but many messages can be
interpreted incorrectly (e.g. BASE64 and QP parts not decoded, binary
attachments not ignored, header tokens not tagged).

You should have done bogofilter -n -M < sisterN to train mailboxes.
AFAIK, there is no simple command to get some kind of aggregate scoring
results for multiple messages in a mailbox, but you can do that with
a script.
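
A minimal sketch of such a script, in Python. It is only an illustration
under some assumptions: that running bogofilter in mbox mode with -v
prints one X-Bogosity line per message, and that each line has the usual
"X-Bogosity: <label>, tests=bogofilter, spamicity=<number>, ..." shape.
The labels vary between versions (Yes/No or Spam/Ham, plus Unsure), so
the script just counts whatever label it finds. The names summarize and
bogosum.py are made up for this example.

```python
import re
import sys
from collections import Counter

# Assumed line shape (check the output of your bogofilter version):
#   X-Bogosity: Ham, tests=bogofilter, spamicity=0.000123, version=...
BOGOSITY = re.compile(r"^X-Bogosity:\s*(\w+),.*?spamicity=([0-9.]+)")


def summarize(lines):
    """Count classification labels and average the spamicity values."""
    counts = Counter()
    scores = []
    for line in lines:
        match = BOGOSITY.match(line)
        if match:
            counts[match.group(1)] += 1
            scores.append(float(match.group(2)))
    # No matching lines means there is nothing to average.
    mean = sum(scores) / len(scores) if scores else None
    return counts, mean


if __name__ == "__main__":
    counts, mean = summarize(sys.stdin)
    for label, number in sorted(counts.items()):
        print(f"{label}: {number}")
    if mean is not None:
        print(f"mean spamicity: {mean:.6f}")
```

Assuming the output format described above, it could be used like this:

    bogofilter -M -v < sister1 | python bogosum.py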

--Pavel Kankovsky aka Peak  [ Boycott Microsoft--http://www.vcnet.com/bms ]
"Resistance is futile. Open your source code and prepare for assimilation."



