Detecting false-positives

David Relson relson at osagesoftware.com
Sun May 22 22:55:47 CEST 2005


On Sun, 22 May 2005 13:30:31 -0700
David Carmean wrote:

> 
> How do those of you using bogofilter detect false positives?
> 
> Especially those of you with high volumes of individual mail?  
> 
> As you can see in this plot, there are periods where I get well 
> over 1,000 messages each day, and it really defeats the purpose 
> of filtering if I have to check 600-700 messages per day for 
> false positives :)
> 
>     http://www.halibut.com/~dlc/tmp/1year-bogodata.png

A good question!

I've found bogofilter's detection to be accurate enough that I don't
worry about false positives, though they do happen.  Over the years,
I've noticed there are several types of email that are likely false
positives.

One set comes from joining new groups.  For example, xanga.com is a
popular blogging site among kids, and my high school junior signed up.
At one point I noticed a message for him, asked him about it, and
learned it was valid.  I then searched my archive of recent spam, found
the xanga.com messages, and trained bogofilter with them as non-spam.
End of that problem.
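
A minimal sketch of that search step, assuming the spam archive is a
single mbox file.  The function names and file paths are illustrative
only and are not part of bogofilter:

```python
import mailbox
from email.utils import parseaddr


def sender_domain(msg):
    """Return the lowercased domain of the From: header, or '' if absent."""
    _, addr = parseaddr(str(msg.get("From", "")))
    return addr.rpartition("@")[2].lower() if "@" in addr else ""


def extract_by_domain(spam_path, domain, out_path):
    """Copy messages sent from `domain` (or a subdomain) into out_path.

    Returns the number of messages copied.  The spam archive itself is
    only read, never modified.
    """
    domain = domain.lower()
    spam = mailbox.mbox(spam_path, create=False)
    out = mailbox.mbox(out_path)
    copied = 0
    try:
        for msg in spam:
            d = sender_domain(msg)
            if d == domain or d.endswith("." + domain):
                out.add(msg)
                copied += 1
        out.flush()
    finally:
        out.close()
        spam.close()
    return copied
```

The resulting mbox can be reviewed by hand and then fed back to
bogofilter.  Something like "bogofilter -M -Sn < fp.mbox" is the usual
form for correcting messages wrongly registered as spam, but check the
man page for your version before relying on it.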

He did well on the PSAT exam and is also receiving lots of "look at me"
messages from colleges.  Mostly they are "Unsures", but periodically I
search the spam archive for .edu addresses and manually verify them.
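
As a rough illustration of that periodic check, the sketch below counts
messages per .edu sender in an mbox spam archive so the list can be
verified by hand.  The mbox layout, the function name, and the default
suffix are assumptions for the example:

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def senders_with_suffix(spam_path, suffix=".edu"):
    """Count messages per sender address whose domain ends with `suffix`."""
    suffix = suffix.lower()
    counts = Counter()
    box = mailbox.mbox(spam_path, create=False)
    try:
        for msg in box:
            _, addr = parseaddr(str(msg.get("From", "")))
            addr = addr.lower()
            # Require an "@" so a bare word ending in the suffix is skipped.
            if "@" in addr and addr.endswith(suffix):
                counts[addr] += 1
    finally:
        box.close()
    return counts
```

Printing the result of senders_with_suffix("spam.mbox").most_common()
gives a short list of addresses to look over, busiest senders first.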

When I order something online from a new company, I've found it useful
to check the spam folder.

So, I've learned which types of messages are at risk and check for them
periodically.

I've yet to find out about any FPs that I missed and that mattered.

Given that the alternative is handling hundreds of obnoxious messages
each day, I accept the risk :-)

Regards,

David


