Idea for improving the learning stage

Andrew aremo at ngi.it
Sat Sep 8 00:57:49 CEST 2007


On Fri, 07 Sep 2007 23:53:06 +0200, mouss <mlist.only at free.fr> wrote:

> [body only]
> Isn't "Subject" a token, and won't removing it make it no longer 
> neutral? I mean, suppose you remove Subject from a thousand spam messages; 
> then "Subject" may become a ham sign, which it should not be.


Good point, provided that Bogofilter actually treats "Subject:" like any 
other word. If that's the case, we should pass a line that says only 
"Subject:", so the token still appears in every trained message.
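To make the body-only idea concrete, here is a minimal sketch in Python. The 
helper name is hypothetical, and the sketch only prepares the text; how it is 
handed to Bogofilter, and whether Bogofilter counts the header name as a 
token at all, are left open.

```python
def body_only_training_text(raw_message: str) -> str:
    """Return the message body preceded by a bare "Subject:" header line.

    Hypothetical helper: it only builds the text to train on.
    """
    # Normalise line endings so the header/body split works for CRLF mail.
    normalized = raw_message.replace("\r\n", "\n")
    # Headers end at the first blank line; everything after it is the body.
    _headers, _separator, body = normalized.partition("\n\n")
    # Keep an empty Subject header so the "Subject" token, if it is one,
    # still appears in every trained message regardless of its class.
    return "Subject:\n\n" + body
```

A message with no blank line is treated as headers only, so the result is 
just the bare "Subject:" line.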


> [subject only]
> and if you only train by subject, you will miss the spammy body tokens. 


But you'll also ignore possible "polluting" words in the body, while 
still recording the words (those in the subject) that actually prompted 
the user to flag the message as spam.
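The subject-only variant could be sketched like this in Python, using the 
standard email module. Again the helper name is hypothetical, and feeding 
the result to Bogofilter is not shown.

```python
from email import message_from_string


def subject_only_training_text(raw_message: str) -> str:
    """Return a one-header message holding only the original Subject.

    Hypothetical helper: body tokens are dropped on purpose, so "polluting"
    words in the body never reach the training step.
    """
    message = message_from_string(raw_message)
    # A missing Subject header becomes an empty subject.
    subject = message.get("Subject", "")
    # rstrip() avoids a trailing space when the subject is empty.
    return f"Subject: {subject}".rstrip() + "\n\n"
```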

Regards,
Andrew




