massive false negatives
Peter Bishop
pgb at adelard.com
Tue May 6 10:58:23 CEST 2003
Unknown wrote:
> Ever since I upgraded to 0.11.1.3, I've been getting a Lot of false
> negatives. In fact, I'm not sure Anything is getting filtered out.
>
> I recently recreated my db from known spam and ham, but that didn't
> appear to help.
>
Looking at your attached listing I saw:
"bedroom" 11 0.000734 0.000000 0.000038 -0.00004 -10.18522 +
"diploma" 11 0.000734 0.000000 0.000038 -0.00004 -10.18522 +
"diplomas" 11 0.000734 0.000000 0.000038 -0.00004 -10.18522 +
"imprisonment" 11 0.000734 0.000000 0.000038 -0.00004 -10.18522 +
"walmart" 11 0.000734 0.000000 0.000038 -0.00004 -10.18522 +
These words look very spammish to me.
But according to the listing above, 11 good emails contained each of these
words, and *zero* spam emails contained any of them.
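
To check a whole listing for this pattern rather than eyeballing it, something like the rough Python sketch below could help. It assumes the columns are: word, good-message count, good frequency, spam frequency, followed by scoring columns that are ignored. That layout is inferred from the lines quoted above, not from the bogofilter documentation, and the function name and threshold are made up for illustration.

```python
import re

# Assumed column layout (inferred from the quoted listing, not from any
# bogofilter documentation): "word", good count, good freq, spam freq, ...
LINE_RE = re.compile(
    r'^"(?P<word>[^"]+)"\s+(?P<good_count>\d+)\s+'
    r'(?P<good_freq>[\d.]+)\s+(?P<spam_freq>[\d.]+)'
)

def ham_only_words(listing, min_good_count=5):
    """Return words seen in at least min_good_count good messages and in no spam."""
    flagged = []
    for line in listing.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed layout
        good_count = int(match.group("good_count"))
        spam_freq = float(match.group("spam_freq"))
        if good_count >= min_good_count and spam_freq == 0.0:
            flagged.append(match.group("word"))
    return flagged

# The five lines quoted from the attached listing.
LISTING = '''\
"bedroom" 11 0.000734 0.000000 0.000038 -0.00004 -10.18522 +
"diploma" 11 0.000734 0.000000 0.000038 -0.00004 -10.18522 +
"diplomas" 11 0.000734 0.000000 0.000038 -0.00004 -10.18522 +
"imprisonment" 11 0.000734 0.000000 0.000038 -0.00004 -10.18522 +
"walmart" 11 0.000734 0.000000 0.000038 -0.00004 -10.18522 +
'''

print(ham_only_words(LISTING))
```

If words you would expect only in spam turn up in the result, the word counts have probably ended up on the wrong list.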
This looks very much like "finger trouble" when you rebuilt the database,
e.g. using -n when training with spam, so the words were put on the wrong
list.
Or maybe you used training files that have been "polluted"
(e.g. the ham training set contains some spam).
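
One way to look for pollution is to scan the ham training messages for the suspicious words. The Python sketch below is only an illustration: the function name and the sample messages are invented, and it works on plain message strings, so reading them from your actual training files is left out.

```python
import re

def messages_containing(messages, suspect_words):
    """Return (index, matched_words) for each message containing a suspect word."""
    suspects = {word.lower() for word in suspect_words}
    hits = []
    for index, text in enumerate(messages):
        # Crude tokenizer: lowercase alphabetic runs only. This is not
        # bogofilter's own tokenizer, just a rough stand-in.
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        matched = sorted(tokens & suspects)
        if matched:
            hits.append((index, matched))
    return hits

# Hypothetical ham training set with one spam message mixed in.
ham_messages = [
    "Lunch on Friday?",
    "Get your diploma today, delivered from Walmart",
]

print(messages_containing(ham_messages, ["diploma", "walmart"]))
```

Any message it reports from the ham set is worth checking by hand. If the training files are in mbox format, Python's standard mailbox module may be a convenient way to get at the individual messages.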
--
Peter Bishop
pgb at adelard.com
pgb at csr.city.ac.uk