Dealing with wordlist mails

Lars Clausen lc at statsbiblioteket.dk
Wed Jan 28 13:13:18 CET 2004


While going through my bogofiltered mail today, I saw that a huge number
of mails had a bunch of random (but not nonsense) words attached.  Many
of these had a bogosity of 0.50000, which is a bad sign, as some ham
mails score above that.

Thinking back to the original idea behind bogofilter, isn't the point
that only ham mails are likely to contain words that are specific to
you?  When spammers send out wordlist spams, they put in a lot of words
that the filter has never seen, so I'm guessing those words are scored
as neither-ham-nor-spam, thus tilting the mail towards the middle.
Shouldn't unknown words be considered slightly spammish, as they have
never appeared in your ham?  Not a lot, as you'd want your friends to be
able to introduce new words to you, but slightly?  Or is that just one
of those tweaks that give poorer results?
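
To make the dilution effect concrete, here is a minimal sketch of a
Robinson-style scorer. It is an illustration under stated assumptions,
not bogofilter's actual code: the function names, the token counts, and
the parameters unknown_prior and strength are all hypothetical, and the
counts are raw occurrences rather than per-corpus frequencies. It
compares a mail made of a few spammy tokens with the same mail padded by
50 never-seen words, first with unknown words scored at a neutral 0.5
and then at a slightly spammish 0.55.

```python
import math


def token_score(spam_count, ham_count, unknown_prior=0.5, strength=1.0):
    """Smoothed spam probability for one token (Robinson-style).

    unknown_prior is the score given to a token with no history;
    strength says how many observations that prior is worth.
    Both names are assumptions made for this sketch.
    """
    n = spam_count + ham_count
    if n == 0:
        return unknown_prior  # never-seen word: fall back to the prior
    p = spam_count / n
    return (strength * unknown_prior + n * p) / (strength + n)


def message_score(scores):
    """Combine token scores using geometric means (Robinson-style)."""
    n = len(scores)
    gm_ham = math.exp(sum(math.log(1.0 - f) for f in scores) / n)
    gm_spam = math.exp(sum(math.log(f) for f in scores) / n)
    p = 1.0 - gm_ham   # spamminess
    q = 1.0 - gm_spam  # hamminess
    return (1.0 + (p - q) / (p + q)) / 2.0


def score_mail(n_spammy, n_unknown, unknown_prior):
    """Score a mail with n_spammy known-spam tokens and n_unknown new words."""
    # Hypothetical history: each spammy token was seen 9 times, only in spam.
    spammy = [token_score(9, 0, unknown_prior)] * n_spammy
    unknown = [token_score(0, 0, unknown_prior)] * n_unknown
    return message_score(spammy + unknown)


undiluted = score_mail(5, 0, 0.5)   # spam tokens only
diluted = score_mail(5, 50, 0.5)    # padded with 50 neutral unknown words
nudged = score_mail(5, 50, 0.55)    # same padding, unknowns slightly spammish

print(f"undiluted={undiluted:.3f} diluted={diluted:.3f} nudged={nudged:.3f}")
```

In this toy model the padded mail lands much closer to 0.5 than the
unpadded one, and raising the prior for unknown words moves it back
towards spam. Whether that helps on real mail depends on how often your
ham contains new words, which is the trade-off raised above.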

-Lars

P.S. Has anyone tried to make Gnus sort mails by bogosity?




