Idea for improving the learning stage
Andrew
aremo at ngi.it
Sat Sep 8 00:57:49 CEST 2007
On Fri, 07 Sep 2007 23:53:06 +0200, mouss <mlist.only at free.fr> wrote:
> [body only]
> Isn't "Subject" a token, and won't removing it make it no longer
> neutral? I mean, suppose you remove "Subject" from a thousand spam
> messages; then "Subject" may become a ham sign, which it should not be.
Good point, provided that Bogofilter actually treats "Subject:" like any
other word. If that's the case, we should pass a line that says only
"Subject:" ahead of the body, so the token is still counted and stays
neutral.
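
A minimal sketch of the body-only variant, assuming the training input is
built by a small wrapper script before it reaches Bogofilter. The function
name body_only_training_text is hypothetical and not part of Bogofilter; it
only uses Python's standard email package. Whether Bogofilter really counts
"Subject:" as an ordinary token is still the open question above.

```python
from email import message_from_string


def body_only_training_text(raw_message: str) -> str:
    """Return the message body preceded by a bare "Subject:" header line.

    Hypothetical helper: all original headers are dropped, but an empty
    "Subject:" line is kept so that token still appears in the input.
    """
    msg = message_from_string(raw_message)
    if msg.is_multipart():
        # Keep only the textual parts; the multipart container itself has
        # main type "multipart" and is skipped by the filter below.
        parts = [
            part.get_payload()
            for part in msg.walk()
            if part.get_content_maintype() == "text"
        ]
        body = "\n".join(parts)
    else:
        body = msg.get_payload()
    # Bare header, blank separator line, then the body text.
    return "Subject:\n\n" + body


if __name__ == "__main__":
    sample = "From: a@example.com\nSubject: Cheap pills\n\nBuy now\n"
    print(body_only_training_text(sample))
```

The returned text could then be given to the learning step on standard
input in place of the full message.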
> [subject only]
> and if you only train by subject, you will miss the spammy body tokens.
But you'll also ignore possible "polluting" words in the body, while
taking note of those words (the subject) that really prompted the user
to flag the message as spam.
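
For comparison, a minimal sketch of the subject-only variant under the same
assumption (a wrapper script prepares the training input). The name
subject_only_training_text is hypothetical; nothing here is an existing
Bogofilter option.

```python
from email import message_from_string


def subject_only_training_text(raw_message: str) -> str:
    """Return a header-only text containing just the Subject line.

    Hypothetical helper: the body and every other header are discarded, so
    only the subject words would be learned.
    """
    msg = message_from_string(raw_message)
    # Fall back to an empty subject if the header is missing.
    subject = msg.get("Subject", "")
    return f"Subject: {subject}\n\n"


if __name__ == "__main__":
    sample = "From: a@example.com\nSubject: Cheap pills\n\nBuy now\n"
    print(subject_only_training_text(sample))
```

The trade-off is the one discussed here: the body tokens are lost, but so
are any "polluting" words the body may contain.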
Regards,
Andrew
More information about the bogofilter-dev mailing list