New header token tagging

David Relson relson at osagesoftware.com
Thu Sep 25 20:15:38 CEST 2003


On Thu, 25 Sep 2003 14:02:28 -0400
Greg Louis <glouis at dynamicro.on.ca> wrote:

> On 20030925 (Thu) at 1205:26 -0400, David Relson wrote:
> 
> > Greg uses train-on-error and has seen his false positive rate
> > skyrocket.  It's so bad for him that he has requested a way to turn
> > off the new tagging.
> 
> The fp rate did go through the roof for exactly one type of mail:
> valid, short messages from mailing lists on which spam is frequently
> posted.  In addition, the fn rate increased sharply; the spam not
> recognized tended to be short and not egregiously spammy-looking. The
> -H patch was tried yesterday, and worked perfectly.  I've just
> installed the degenerator, which I hope will work well -- I think the
> tag concept is good but I can't afford to rebuild the training db at
> work.  As you mentioned, degeneration renders that unnecessary,
> although of course the benefit of head: tagging (I prefer head: to h:)
> isn't realized as quickly.

Greg,

'Tis good to hear that yesterday's '-H' (to turn off "head:" tagging)
worked as expected.  My expectation is that degeneration will work well,
though there will be a "weak" point as additional training happens and
the difference in spamminess between "token" and "head:token" starts to
have an effect.  Hopefully, degeneration will make the transition go
more smoothly.
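
To make the degeneration idea concrete, here is a rough sketch in
Python.  It is not bogofilter's actual code: the function name, the
dictionary-based wordlist and the counts are all made up for
illustration.  The assumption is simply that a header token is looked
up as "head:token" first and falls back to the plain "token" entry when
the tagged form has never been trained.

```python
def lookup_counts(wordlist, token, in_header):
    """Return (spam_count, ham_count) for a token.

    Hypothetical helper: header tokens are tried as "head:<token>"
    first and degenerate to the untagged "<token>" if the tagged
    form is absent from the wordlist.
    """
    if in_header:
        tagged = "head:" + token
        if tagged in wordlist:
            return wordlist[tagged]
    # Body token, or header token with no tagged entry yet
    return wordlist.get(token, (0, 0))


# A wordlist trained before head: tagging existed (made-up counts)
wordlist = {"meeting": (2, 40)}
print(lookup_counts(wordlist, "meeting", in_header=True))  # (2, 40)

# After one new training message, the tagged entry takes over
wordlist["head:meeting"] = (1, 0)
print(lookup_counts(wordlist, "meeting", in_header=True))  # (1, 0)
```

The second lookup shows the "weak" point: as soon as the tagged entry
exists, its few counts replace the well-trained untagged ones.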

The vote on "h:" vs. "head:" presently stands at 2 for the short form
and 3 for the long form.

David



