why is this not spam?

Trevor Smith trevor at haligonian.com
Wed Aug 4 23:28:45 CEST 2004


On August 4, 2004 6:02 pm, David Relson wrote:
> Bogofilter's default parameters were determined using the bogotune
> program and a corpus of several hundred thousand messages from several
> sources (personal email collections, 80-100 person businesses, etc).  To

Right. So there is a default spam_cutoff value. But how high is it?

> lower).  The effect of this is that ham messages are unlikely to be
> scored as spam and that high scoring ("spammish") messages will score as
> non-spam.  This "lets through" more spam (undesirable), but results in
> fewer false positives (desirable).

Perfectly sound behaviour, especially "out of the box", when it has the
potential to do the most mischief. I would have built it no differently. Spam
getting through is mildly annoying; losing 'good' emails is potentially
quite harmful.
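
To make the trade-off concrete, here is a minimal sketch in Python. It is
not bogofilter's code: the scores, the two cutoff values and the helper names
(classify, count_errors) are all made up for illustration, and the ">="
comparison is an assumption about how a cutoff is applied. It only shows why
a higher spam_cutoff trades lost ham for spam that slips through.

```python
# Illustrative only: this is NOT bogofilter's implementation.
# All scores and cutoff values below are invented for the example.

def classify(spamicity, spam_cutoff):
    """Return "spam" if the score reaches the cutoff, else "ham" (assumed rule)."""
    return "spam" if spamicity >= spam_cutoff else "ham"

def count_errors(messages, spam_cutoff):
    """Return (false_positives, false_negatives) for labelled messages."""
    false_positives = 0  # ham wrongly classified as spam
    false_negatives = 0  # spam wrongly classified as ham
    for spamicity, is_spam in messages:
        verdict = classify(spamicity, spam_cutoff)
        if verdict == "spam" and not is_spam:
            false_positives += 1
        elif verdict == "ham" and is_spam:
            false_negatives += 1
    return false_positives, false_negatives

# Hypothetical (spamicity, really_spam) pairs.
MESSAGES = [
    (0.05, False),
    (0.40, False),
    (0.70, False),  # a "spammish" looking ham
    (0.65, True),
    (0.90, True),
    (0.999, True),
]

for cutoff in (0.60, 0.95):
    fp, fn = count_errors(MESSAGES, cutoff)
    print(f"cutoff={cutoff:.2f}: {fp} ham lost, {fn} spam let through")
```

With these invented numbers the low cutoff loses one ham and lets no spam
through, while the high cutoff loses no ham and lets two spam through.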

> P.S.  When you disagree with bogofilter's classification ("yes"/"no") of
> a message, use it to further train bogofilter.  That will help
> bogofilter do better in the future.

Of course! Naturally I did this, and bogofilter gave the spam a higher
spamicity score and caught it the next time it looked at it. :-)



