headers - example

Jozef Hitzinger hitzinger at phobos.fphil.uniba.sk
Tue Mar 9 08:34:30 CET 2004


On Mon, 8 Mar 2004, David Relson wrote:

> Since you consider sources so important, create whitelists and
> blacklists.

No, the important thing is to _not_ use the sources, to _not_ let the
information about where a message came from damage bogofilter's abilities.
But if you let it train on headers, this information gets in.

> If a message was "pure spam" that's how bogofilter would classify it.
> Your message included messages that you've used in ham training, else
> bogofilter wouldn't classify them as ham.  Sounds like additional
> training is needed.  Bogofilter needs sufficient information to do a
> good job, and that doesn't happen overnight.

Again, I've trained on full messages, i.e. with headers. Additional
training would not do any good, because I constantly get more ham than
spam from there.

I'm trying to tell you all that bogofilter can do _better_ if we drop
the headers (except Subject), in both training and filtering. I find this
much, much better than changing the database on the fly by constant
training and/or ad-hoc removal of tokens, as has been suggested.
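
To make the idea concrete, here is a minimal sketch of such a header
stripper. It is only an illustration: the function name keep_only_subject
and its keep parameter are made up for this example and are not part of
bogofilter. It works on the raw header block and does not parse MIME, so
dropping Content-Type may change how a MIME body gets decoded later.

```python
def keep_only_subject(raw: bytes, keep=(b"subject",)) -> bytes:
    """Return the message with every header dropped except those in `keep`.

    Hypothetical helper for illustration; header names in `keep` must be
    lowercase bytes.
    """
    lines = raw.splitlines(keepends=True)
    kept = []
    body_start = len(lines)  # stays here if the message has no body
    keeping = False
    for i, line in enumerate(lines):
        # The first empty line separates the headers from the body.
        if line.strip(b"\r\n") == b"":
            body_start = i
            break
        # A line starting with whitespace continues the previous header.
        if line[:1] in (b" ", b"\t"):
            if keeping:
                kept.append(line)
            continue
        name = line.split(b":", 1)[0].strip().lower()
        keeping = name in keep
        if keeping:
            kept.append(line)
    # Kept headers, then the blank separator line and the untouched body.
    return b"".join(kept) + b"".join(lines[body_start:])


# Small demonstration on a made-up message.
sample = (
    b"Received: from relay.example.org\r\n"
    b"\tby mail.example.org\r\n"
    b"Subject: hello\r\n"
    b" world\r\n"
    b"From: someone@example.org\r\n"
    b"\r\n"
    b"body text\r\n"
)
print(keep_only_subject(sample).decode("ascii"))
```

The same stripped form would have to be fed to bogofilter both when
registering messages (bogofilter -s for spam, -n for ham) and when
classifying them, otherwise the tokens seen at filtering time would not
match the ones in the wordlist.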

Have a nice day,
-- 
jozef  :-)



