New header token tagging
David Relson
relson at osagesoftware.com
Thu Sep 25 20:15:38 CEST 2003
On Thu, 25 Sep 2003 14:02:28 -0400
Greg Louis <glouis at dynamicro.on.ca> wrote:
> On 20030925 (Thu) at 1205:26 -0400, David Relson wrote:
>
> > Greg uses train-on-error and has seen his false positive rate
> > skyrocket.
> > It's so bad for him that he has requested a way to turn off the new
> > tagging.
>
> The fp rate did go through the roof for exactly one type of mail:
> valid, short messages from mailing lists on which spam is frequently
> posted. In addition, the fn rate increased sharply; the spam not
> recognized tended to be short and not egregiously spammy-looking. The
> -H patch was tried yesterday, and worked perfectly. I've just
> installed the degenerator, which I hope will work well -- I think the
> tag concept is good but I can't afford to rebuild the training db at
> work. As you mentioned, degeneration renders that unnecessary,
> although of course the benefit of head: tagging (I prefer head: to h:)
> isn't realized as quickly.

Greg,

'Tis good to hear that yesterday's '-H' option (which turns off "head:"
tagging) worked as expected. I expect degeneration to work well too,
though there will be a "weak" point as additional training happens and
the difference in spamminess between "token" and "head:token" starts to
have an effect. Hopefully, degeneration will make the transition go more
smoothly.
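
For readers unfamiliar with the idea, here is a minimal sketch of degeneration as discussed in this thread, assuming it means "if the tagged header token is not in the wordlist, fall back to the untagged token". The function name, the dictionary-based wordlist, and the counts are all hypothetical and only illustrate the lookup order; this is not bogofilter's actual code.

```python
# Hypothetical sketch of token degeneration; not bogofilter's real API.
HEAD_PREFIX = "head:"


def lookup_counts(wordlist, token):
    """Return (spam_count, good_count) for a token.

    Assumption: a tagged header token that is missing from the wordlist
    degenerates to its untagged form, so an existing training database
    still contributes until tagged entries have been learned.
    """
    if token in wordlist:
        return wordlist[token]
    if token.startswith(HEAD_PREFIX):
        bare = token[len(HEAD_PREFIX):]
        if bare in wordlist:
            return wordlist[bare]
    # Unknown in both forms: no evidence either way.
    return (0, 0)


# Illustrative wordlist: made-up counts, mostly untagged (pre-tagging) entries.
wordlist = {
    "viagra": (40, 1),
    "bogofilter": (2, 30),
    "head:bogofilter": (0, 12),
}

print(lookup_counts(wordlist, "head:viagra"))      # falls back to "viagra"
print(lookup_counts(wordlist, "head:bogofilter"))  # tagged entry is used
print(lookup_counts(wordlist, "head:unseen"))      # unknown in both forms
```

In this sketch the fallback stops applying as soon as training stores an entry for the tagged form, so a freshly created entry with small counts replaces the well-established untagged one. That may be one way to picture the "weak" point mentioned above.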

The vote on "h:" vs. "head:" presently stands at 2 for the short form
and 3 for the long form.

David