Idea for improving the learning stage

Andrew aremo at ngi.it
Sat Sep 8 12:21:42 CEST 2007


On Sat, 8 Sep 2007 09:35:03 +0200,
Matthias Andree <matthias.andree at gmx.de> wrote:

> How does bogofilter, for a newly arriving mail, decide whether to look
> at header or body?


When mail comes in, Bogofilter would still evaluate the full message, 
exactly as it does now.

It's only when the user flags a message as spam or ham that my idea 
comes into play and Bogofilter decides what to look at, based on message 
status.

Message status would tell Bogofilter which words prompted the user to 
recognize the message as spam or ham when he flagged it.

So, in the end, its database would consist mostly of those words that 
*really* were critical for the user when he told spam from ham.
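
To make the idea concrete, here is a rough sketch in Python. It is not 
Bogofilter code. Everything in it is an assumption for illustration: the 
status values "unread" and "read", the function names, the toy tokenizer 
and the Counter-based word list. It assumes that a message flagged while 
still unread was judged on its subject alone, and that a message flagged 
after being opened was judged on subject and body.

```python
import re
from collections import Counter
from email import message_from_string

# Toy tokenizer (assumption): words of three or more characters.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]{3,}")


def tokens(text):
    """Split text into lowercase word tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def training_tokens(raw_message, status):
    """Pick the words to learn from, based on a hypothetical message status.

    "unread": the user flagged the message without opening it,
              so only the subject is used.
    "read":   the user opened the message, so the body is used too.
    """
    msg = message_from_string(raw_message)
    words = tokens(msg.get("Subject", ""))
    if status == "read":
        body = msg.get_payload()
        if isinstance(body, str):  # skip multipart messages in this sketch
            words += tokens(body)
    return words


def train(db, raw_message, status, label):
    """Count each selected word once per message under 'spam' or 'ham'."""
    for word in set(training_tokens(raw_message, status)):
        db[label][word] += 1


# One word list for header and body words alike.
db = {"spam": Counter(), "ham": Counter()}
raw = "Subject: Cheap pills\n\nOrder today from our pharmacy\n"
train(db, raw, "unread", "spam")
```

In this sketch only "cheap" and "pills" are learned from the unread 
message. The body words would be learned only if the same message were 
flagged with status "read". Scoring of incoming mail is not touched.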


> So, does your suggestion imply we'll have to keep header and body 
> databases separate?


I've been thinking about separate databases, but I've come to the 
conclusion that we wouldn't really need them: words that look "spammy" 
in a subject would still look spammy in the body, and vice versa. So, in 
my opinion, a single database would still be the way to go.

Regards,
Andrew





More information about the bogofilter-dev mailing list