New header token tagging
Boris 'pi' Piwinger
3.14 at logic.univie.ac.at
Fri Sep 26 09:17:02 CEST 2003
Greg Louis <glouis at dynamicro.on.ca> wrote:
>> Greg uses train-on-error and has seen his false positive rate skyrocket.
>> It's so bad for him that he has requested a way to turn off the new
>> tagging.
>
>The fp rate did go through the roof for exactly one type of mail:
>valid, short messages from mailing lists on which spam is frequently
>posted. In addition, the fn rate increased sharply; the spam not
>recognized tended to be short and not egregiously spammy-looking.
Well, if a message consists almost entirely of headers, and those
are not understood ...
Here is what you could do: Use your current database to
decide whether something is spam or not (without header
tagging). Use those newly classified mails (now with header
tagging) to build a new database. For a while you'll have to
correct errors in both databases, but before too long you can
simply switch to the new database.
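
A minimal sketch of that switch-over in Python, just to make the flow concrete. Everything in it is an assumption for illustration: the `WordList` class, the crude voting rule, and the `subject:`-style tag prefix are hypothetical stand-ins, not bogofilter's actual scoring or tag format. The old database classifies using untagged tokens, the new database is trained on tagged tokens with the resulting label, and a correction is applied to both.

```python
from collections import Counter

def tokens_plain(headers, body):
    """Old-style tokens: header words and body words share one namespace."""
    words = []
    for value in headers.values():
        words += value.lower().split()
    return words + body.lower().split()

def tokens_tagged(headers, body):
    """New-style tokens: header words get a prefix naming their field
    (hypothetical tag format, chosen only for this sketch)."""
    words = []
    for name, value in headers.items():
        words += [f"{name.lower()}:{w}" for w in value.lower().split()]
    return words + body.lower().split()

class WordList:
    """Toy stand-in for a spam/non-spam token database."""
    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()

    def train(self, tokens, is_spam):
        # Count each distinct token once per message.
        (self.spam if is_spam else self.ham).update(set(tokens))

    def looks_spammy(self, tokens):
        # Crude vote, not a real Bayesian score.
        toks = set(tokens)
        spam_votes = sum(1 for t in toks if self.spam[t] > self.ham[t])
        ham_votes = sum(1 for t in toks if self.ham[t] > self.spam[t])
        return spam_votes > ham_votes

def process(old_db, new_db, headers, body, correction=None):
    """Classify with the old database, train the new one on tagged tokens.
    `correction` is the user's label, if the user supplies one."""
    verdict = old_db.looks_spammy(tokens_plain(headers, body))
    truth = verdict if correction is None else correction
    if truth != verdict:
        # Train-on-error: fix the old database too while it is still in use.
        old_db.train(tokens_plain(headers, body), truth)
    new_db.train(tokens_tagged(headers, body), truth)
    return verdict
```

Once the new database makes few enough errors on its own, the old one can be dropped and the tagged tokenizer used for classification as well.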
pi