How do I filter out spam that turns up on mailing lists?
Tom Anderson
tanderso at oac-design.com
Tue Jan 8 00:12:43 CET 2008
Nigel Henry wrote:
> Hi Tom. OK, I see where you're going. So would it be helpful if I trained
> bogofilter with a load of genuine Debian ham mails, so as to compare the ham
> from spam?
No, you want to train with the errors only. The point is that the
Debian list headers are already hammy because you've received so many of
them as hams. That's presumably why bogofilter is having a hard time
filtering out the spams on that list. You have to counter this by
training on the spams it misses, so as to neutralize the Debian-list-specific tokens.
Once those tokens are seen as neither hammy nor spammy, then the
content of the email will shine through and bogofilter will classify it
correctly.
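
To make the effect concrete, here is a rough sketch of how a per-token score drifts toward neutral as missed spams are trained. This is not bogofilter's actual code: the function name, the token, the message counts, and the smoothing constants s and x are all made up for illustration, and the formula is only a simplified Robinson-style estimate.

```python
def spamicity(spam_hits, ham_hits, n_spam, n_ham, s=1.0, x=0.5):
    """Smoothed score for one token: near 0 is hammy, near 1 is spammy.

    s and x are illustrative smoothing constants (assumed values),
    not bogofilter's configured defaults.
    """
    n = spam_hits + ham_hits
    if n == 0:
        return x  # token never seen: treat as neutral
    spam_rate = spam_hits / n_spam if n_spam else 0.0
    ham_rate = ham_hits / n_ham if n_ham else 0.0
    p = spam_rate / (spam_rate + ham_rate)
    # Pull the raw estimate toward x when there is little evidence.
    return (s * x + n * p) / (s + n)


# Hypothetical list-header token, e.g. "list-id:debian-user".
# Before: seen in 500 hams and in no spams (500 spams trained overall).
before = spamicity(spam_hits=0, ham_hits=500, n_spam=500, n_ham=500)

# After training on 50 list spams that were wrongly passed as ham.
after = spamicity(spam_hits=50, ham_hits=500, n_spam=550, n_ham=500)

print(f"before: {before:.4f}  after: {after:.4f}")
```

The token starts out almost perfectly hammy and moves toward 0.5 with each missed spam you train, which is why training on the errors alone is enough. Adding more list hams would only push it back the other way.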
Tom