Detecting false-positives

Dwayne Hottinger dhottinger at harrisonburg.k12.va.us
Sun May 22 22:59:46 CEST 2005


On a slightly different note, I've been getting plenty of the Nazi spam too.
Some of it appears to have email addresses inside my domain.  I have a setup
where users report email as spam and the reported mail gets sent to a mailbox.
Once a week or more I go through these messages and run a little script that
feeds them into bogofilter's wordlist.  I've been hesitant to let the ones with
email addresses inside my domain go into the wordlist, for fear bogofilter will
start flagging legitimate mail from those addresses as spam too.  Should I go
ahead and feed those messages into bogofilter's wordlist, or is there a risk
that doing so will corrupt it?
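
A minimal sketch of the kind of weekly training script described above, not
the actual script. It assumes the reported mail is stored one message per file,
that the in-domain addresses in question show up in the From: header, and that
bogofilter is on the PATH. LOCAL_DOMAIN, REPORT_DIR and all function names are
hypothetical. The only bogofilter option used is -s, which registers the
message read from stdin as spam. Messages with an in-domain sender are held
back for manual review rather than registered.

```python
import email
import subprocess
from email.utils import parseaddr
from pathlib import Path

LOCAL_DOMAIN = "example.k12.va.us"              # hypothetical: your own domain
REPORT_DIR = Path("/var/spool/reported-spam")   # hypothetical: one message per file


def sender_is_local(raw: bytes, domain: str = LOCAL_DOMAIN) -> bool:
    """Return True if the From: header claims an address inside `domain`."""
    msg = email.message_from_bytes(raw)
    _, addr = parseaddr(str(msg.get("From", "")))
    return addr.lower().endswith("@" + domain.lower())


def register_as_spam(raw: bytes) -> None:
    """Feed one message to bogofilter on stdin; -s registers it as spam."""
    subprocess.run(["bogofilter", "-s"], input=raw, check=True)


def train(report_dir: Path = REPORT_DIR) -> list[Path]:
    """Register every reported message except those with an in-domain sender.

    Returns the held-back files so they can be looked at by hand.
    """
    held_back = []
    for path in sorted(report_dir.iterdir()):
        if not path.is_file():
            continue
        raw = path.read_bytes()
        if sender_is_local(raw):
            held_back.append(path)
        else:
            register_as_spam(raw)
    return held_back
```

Calling train() once a week and printing the returned paths would give a short
list of in-domain messages to decide on separately.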

ddh


Quoting Jef Poskanzer <jef at mail.acme.com>:

> >How do you detect false positives?
> >
> >Especially those of you with high volumes of individual mail?
> >
> >As you can see in this plot, there are periods where I get well
> >over 1,000 messages each day, and it really defeats the purpose
> >of filtering if I have to check 600-700 messages per day for
> >false positives :)
>
> I get about 2,000,000 messages per day, but only about 25,000/day
> make it as far as the Bayesian layer, where I use bogofilter and qsf
> combined in a voting arrangement.  97% of those score the max for
> both filters; those I send directly to the bit bucket.  That leaves
> about 700/day, of which 85% or about 600/day are classified as spam.
>
> These numbers are much higher than usual right now due to all the
> Nazi spam from sober.q.  Before that it was about 1/10th as much.
>
> Anyway, the spam-but-not-max-spam messages get filed in a folder,
> and once or twice a day I scan the subjects.  Maybe once a week
> I find a false positive in there, typically someone I've never
> corresponded with before.
> ---
> Jef
>
>        Jef Poskanzer  jef at mail.acme.com  http://www.acme.com/jef/
> _______________________________________________
> Bogofilter mailing list
> Bogofilter at bogofilter.org
> http://www.bogofilter.org/mailman/listinfo/bogofilter
>
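
The two-filter triage quoted above could be sketched roughly as follows. This
is an illustration under stated assumptions, not Jef's actual setup: the quoted
message does not say how the two votes are combined, so the sketch assumes each
filter yields a spam score between 0.0 and 1.0, that "max" means 1.0, and that
a message counts as spam only when both filters agree. MAX_SCORE, SPAM_CUTOFF,
Verdict and triage are all hypothetical names.

```python
from enum import Enum


class Verdict(Enum):
    DISCARD = "discard"   # both filters at max score: straight to the bit bucket
    REVIEW = "review"     # spam but not max-spam: file in a folder, scan subjects
    DELIVER = "deliver"   # not classified as spam


MAX_SCORE = 1.0     # assumption: the highest score either filter can report
SPAM_CUTOFF = 0.5   # assumption: per-filter threshold for a "spam" vote


def triage(bogofilter_score: float, qsf_score: float) -> Verdict:
    """Combine two spam scores into one of three outcomes."""
    if bogofilter_score >= MAX_SCORE and qsf_score >= MAX_SCORE:
        return Verdict.DISCARD
    # Assumed voting rule: both filters must vote spam.
    if bogofilter_score >= SPAM_CUTOFF and qsf_score >= SPAM_CUTOFF:
        return Verdict.REVIEW
    return Verdict.DELIVER
```

For the figures quoted: 97% of 25,000 is 24,250, which leaves 750 (the "about
700"), and 85% of 700 is 595 (the "about 600"), so only the REVIEW folder needs
a human look.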


--
Dwayne Hottinger
Network Administrator
Harrisonburg City Public Schools


