[bogofilter] misclassified emails in global wordlist

Tom Anderson tanderso at oac-design.com
Mon Apr 19 13:20:28 CEST 2004


On Mon, 2004-04-19 at 05:11, Chris Fortune wrote:
> Maybe the question is this: is a global wordlist with user input always
> cursed with fuzzy results?  To use it wisely, should I tweak the
> configuration, and how?  What other precautions should I take?

The more you move from binary certainty to learned intelligence, the
more "fuzziness" you'll almost certainly get.  When different people
tell it different things, that's likely to increase it.  But I don't
think this necessarily means lots of admin supervision.  If one person
misclassifies an email (or even purposefully registers one which
contradicts most other users' classifications), then the other users
will counteract the effect by their own registrations.  Assuming there
are more correct registrations than incorrect, the system should always
balance itself out in favor of correct classifications.
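
To make the balancing argument concrete, here is a rough Python sketch
of a Robinson-style smoothed token score, which is the general kind of
calculation bogofilter performs.  It is a simplification, not
bogofilter's actual code: the function name, the message counts, and
the robx/robs values are illustrative assumptions, and your version's
defaults may differ.

```python
def token_spam_score(spam_count, ham_count, total_spam_msgs, total_ham_msgs,
                     robx=0.52, robs=0.0178):
    """Robinson-style smoothed spam probability for a single token.

    Hypothetical helper for illustration only. robx is the score assumed
    for a token never seen before; robs is how strongly that assumption
    is weighted against the observed counts. Both message totals are
    assumed to be greater than zero.
    """
    n = spam_count + ham_count
    if n == 0:
        return robx  # no evidence at all: fall back to the prior
    spam_freq = spam_count / total_spam_msgs
    ham_freq = ham_count / total_ham_msgs
    p = spam_freq / (spam_freq + ham_freq)  # raw, unsmoothed estimate
    return (robs * robx + n * p) / (robs + n)


# Nine users register a message containing the token as spam and one
# user registers it (wrongly) as ham, against 100 messages of each kind.
mostly_spam = token_spam_score(9, 1, 100, 100)

# The mirror case: one stray spam registration against nine ham ones.
mostly_ham = token_spam_score(1, 9, 100, 100)
```

In the first case the token still scores well above 0.5 despite the
one bad registration, and in the second it stays well below.  The lone
dissenting registration shifts the score but does not flip it, which is
the sense in which the majority counteracts the minority.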

Of course there are configuration tweaks you can make.  I tend to make
small changes now and then (to spam_cutoff, ham_cutoff, and robx
primarily) based on the results provided.  With a global wordlist,
you'll of course have to take into account every user's results, so
taking a poll of false positives/negatives/unsures might be useful.  It
might be a good idea to move certain users off of the global wordlist
and into their own wordlist if their circumstances tend to hurt overall
classifications (e.g., they are buying or refinancing a house, or they
are a doctor who prescribes or talks about viagra, levitra, etc.).
Sometimes one person's spam is another's ham, so these people shouldn't
share the same wordlist.  If they do, one or more will likely be
frustrated with the classifications.
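
As a sketch of what such a poll could look like, the Python below
tallies false positives, false negatives, and unsures per user for a
given pair of cutoffs.  Everything here is hypothetical: the report
format (user, score, what the user says the message really was), the
function names, the sample data, and the exact boundary handling in
classify() are my assumptions, not anything bogofilter provides.

```python
from collections import Counter


def classify(score, ham_cutoff, spam_cutoff):
    """Three-way verdict from a score (boundary handling is assumed)."""
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "unsure"


def tally(reports, ham_cutoff, spam_cutoff):
    """Count outcomes per user from (user, score, true_label) tuples."""
    counts = {}
    for user, score, truth in reports:
        verdict = classify(score, ham_cutoff, spam_cutoff)
        if verdict == "unsure":
            outcome = "unsure"
        elif verdict == truth:
            outcome = "correct"
        elif verdict == "spam":
            outcome = "false_positive"  # ham that was flagged as spam
        else:
            outcome = "false_negative"  # spam that was passed as ham
        counts.setdefault(user, Counter())[outcome] += 1
    return counts


# Made-up poll data: true_label is what the user says the mail was.
reports = [
    ("alice", 0.995, "spam"),
    ("alice", 0.10, "ham"),
    ("bob", 0.995, "ham"),
    ("bob", 0.60, "spam"),
    ("bob", 0.20, "spam"),
]
results = tally(reports, ham_cutoff=0.45, spam_cutoff=0.99)
```

Re-running the tally with different cutoffs shows how a change would
have affected each user.  A user whose errors stay high no matter what
you try (bob, in this made-up data) is a candidate for a separate
wordlist.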

BTW, do you have an "are you sure?" popup or interstitial?  Asking people
to review their choice of registration might cut down on bad ones.

Tom
