confessions of a newbie

David Relson relson at osagesoftware.com
Mon May 12 20:52:27 CEST 2003


At 01:51 PM 5/12/03, Charlie Shub wrote:

>I installed bogofilter about a month ago
>
>then,
>
>per ---- http://www.bgl.nu/bogofilter/training2.html
>
>         cd spam
>         bogofilter -sv < spam-found
>         Created directory /users/ludell/users/cdash/.bogofilter .
>         # 102493 words, 418 messages
>
>         cd ~/mail
>         cat [a-hj-np-rt-w]* ia* ie* op* s[acioty]* | bogofilter -nv
>         # 1373824 words, 4564 messages
>
>and add to .procmailrc
>
>         # filter mail through bogofilter, tagging it as spam and
>         # updating the word lists
>
>         :0fw
>         | bogofilter -u -e -p
>
>         # if bogofilter failed, return the mail to the queue, the MTA will
>         # retry to deliver it later
>         # 75 is the value for EX_TEMPFAIL in /usr/include/sysexits.h
>
>         :0e
>         { EXITCODE=75 HOST }
>
>         # file the mail to spam-bogofilter if it's spam.
>
>         :0:
>         * ^X-Bogosity: Yes, tests=bogofilter
>         spam-bogofilter
>
>
>Then i experimented a while and ended up making a single small change
>in spam_cutoff
>         #
>         #       decreased to .90 4/25/02 by cdash
>         #       decreased to .2550 5/1/02 by cdash
>         #
>         #spam_cutoff = 0.95
>         #
>         spam_cutoff = 0.2550
>
>
>Since may 1, i've had 4 false negatives
>                       0 false positives
>                       164 true positives
>
>I think that's pretty good.
>For the false negatives, I send them through bogofilter -Ns
>
>that is absolutely close enough for me
>I get about 1000 e-mails a month

Charlie,

A success story!  We enjoy hearing those.  Usually a newbie encounters at 
least one confusing detail or spots a flaw in the documents.  Obviously, if 
you ran into anything like that, you figured it out and got past it.

Good job!

As a comment, I'm a bit surprised that .2550 works for your 
spam_cutoff.  The incoming messages here have, over time, provided ham with 
scores as high as 0.90 and spam with scores as low as 0.15.  We've long 
known that appropriate values differ from site to site.  As a guess, you've 
got a clear division between spam and ham, and that allows the low spam_cutoff.
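
To illustrate the point about site-to-site differences, here is a minimal
Python sketch.  It is not part of bogofilter.  All scores below are invented
for illustration, apart from the 0.90 and 0.15 extremes mentioned above, and
count_errors is a hypothetical helper.  It assumes a message is tagged as
spam when its score is at or above spam_cutoff.

```python
# Hypothetical scores for a site with a clear ham/spam division.
separated_ham = [0.01, 0.05, 0.12, 0.20]
separated_spam = [0.40, 0.75, 0.98, 0.999]

# Hypothetical scores for a site where the two classes overlap
# (ham as high as 0.90, spam as low as 0.15).
overlapping_ham = [0.02, 0.30, 0.90]
overlapping_spam = [0.15, 0.80, 0.99]


def count_errors(ham, spam, cutoff):
    """Return (false_positives, false_negatives) for a given cutoff.

    Assumption: a message counts as spam when its score >= cutoff.
    """
    false_positives = sum(1 for score in ham if score >= cutoff)
    false_negatives = sum(1 for score in spam if score < cutoff)
    return false_positives, false_negatives


if __name__ == "__main__":
    sites = {
        "separated": (separated_ham, separated_spam),
        "overlapping": (overlapping_ham, overlapping_spam),
    }
    for name, (ham, spam) in sites.items():
        for cutoff in (0.2550, 0.95):
            fp, fn = count_errors(ham, spam, cutoff)
            print(f"{name:12s} spam_cutoff={cutoff}: "
                  f"{fp} false positives, {fn} false negatives")
```

With the separated scores, a cutoff of 0.2550 misclassifies nothing, while
the same cutoff on the overlapping scores tags two ham messages as spam.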

David

P.S.  Sorry for the double posting of the reply.  Evidently, I accidentally 
hit the "send" button (or something).


>charlie shub   University of Colorado at Colorado Springs
>cdash at cs.uccs.edu               http://cs.uccs.edu/~cdash
>(719) 262-3492                  (fax) 262-3369




