When is spam_cutoff too low?

Matej Cepl cepl at surfbest.net
Wed Dec 8 05:37:24 CET 2004


Hi,

I use bogofilter on a remote server (so I would really prefer not to
retrain) with tri-state spam checking, and I am still totally high on the
quality of the spam checking. BTW, the remote server runs BF 0.92.6 with
BerkeleyDB 3.2.9, if it matters. Unless I really have to, I would rather
not ask the admin of that server for an upgrade; moreover, only 0.92.8 is
available for Debian/woody on backports.org (where the package comes
from).

Just for kicks -- this is the output from mailstat on the remote server with
data for the last month:

$ mailstat log-procmail-matej
Folder                                    Number       Size
------                                    ------ ----------
~/.bogofilter/junk-matej                    4409  48987.66k
/dev/null                                   4149  46774.62k
{forwarded to the IMAP server}              3791  33427.04k
------                                    ------ ----------
Total:                                     12349 129189.32k

(The first two lines are what bogofilter classified as certain spam. In
the middle of the period I gave up on browsing through hundreds of emails
every other day with mailx, since there is no better MUA there and I cannot
install anything, and I now send spam straight to /dev/null.) That means
that in the last month I got 8,558 spam messages, 95.7MB of spam in
total!!! Wow! I am really glad (and thankful to you guys) that I do not have
to deal with it!
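
The spam totals quoted above can be re-derived from the first two rows of
the mailstat table. This is only a small Python sketch of that arithmetic;
the variable names are made up for the example, and treating 1000k as 1MB
is an assumption about how the size was rounded.

```python
# Counts and sizes (in k) copied from the first two mailstat rows.
spam_folders = {
    "~/.bogofilter/junk-matej": (4409, 48987.66),
    "/dev/null": (4149, 46774.62),
}

spam_count = sum(count for count, _ in spam_folders.values())
spam_size_k = sum(size for _, size in spam_folders.values())

# Assumption: 1MB is taken as 1000k here.
print(spam_count, "messages,", round(spam_size_k / 1000, 2), "MB")
```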

However, I am still getting some forty spam messages in my _uncertain
mailbox. Following the advice in the bogofilter tuning HOWTO, I have been
lowering spam_cutoff in bogofilter.conf (one percent at a time, then
waiting a couple of days to see what happens). However, I am now at 0.87
and still get plenty of false negatives (no false positives so far), and I
am getting afraid of hitting the magic limit where bogofilter begins to
massively misclassify ham as spam. Is there such a limit? Should I do
something other than lowering spam_cutoff?
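
For illustration only, the tri-state decision in question can be sketched
as below. This is a minimal Python sketch, not bogofilter's actual code:
the function name classify, the ham_cutoff value of 0.45, and the sample
scores are assumptions made up for the example; only the spam_cutoff of
0.87 comes from the setup described above. It assumes that a score at or
above spam_cutoff counts as spam, a score at or below ham_cutoff counts as
ham, and everything in between lands in the uncertain mailbox.

```python
def classify(score, spam_cutoff=0.87, ham_cutoff=0.45):
    """Map a spamicity score in [0, 1] to one of three states.

    Assumed rule: >= spam_cutoff is spam, <= ham_cutoff is ham,
    anything in between is uncertain. ham_cutoff=0.45 is a
    hypothetical value chosen for this sketch.
    """
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "uncertain"


# Hypothetical scores, only to show the effect of moving the cutoff.
scores = [0.05, 0.50, 0.86, 0.88, 0.99]

for cutoff in (0.90, 0.87):
    labels = [classify(s, spam_cutoff=cutoff) for s in scores]
    print(cutoff, labels)
```

In this sketch, lowering the cutoff from 0.90 to 0.87 moves the message
scored 0.88 from uncertain to spam, while the message scored 0.05 stays
ham. Ham would only be misclassified once the cutoff drops to where real
ham scores sit.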

Thanks for any advice,

Matej

-- 
Matej Cepl, http://www.ceplovi.cz/matej
GPG Finger: 89EF 4BC6 288A BF43 1BAB  25C3 E09F EF25 D964 84AC
138 Highland Ave. #10, Somerville, Ma 02143, (617) 623-1488
 
Poor Faulkner. Does he really think big emotions come from big
words?
      -- Ernest Hemingway
         (about William Faulkner)



