Bogofilter 0.11.2 config questions
David Relson
relson at osagesoftware.com
Wed Jun 4 18:39:06 CEST 2003
Hello Jerry,
Welcome to the list.
At 12:24 PM 6/4/03, jerry wrote:
>Hello to the list
>
>Let me begin by thanking all those involved with bogofilter, I've been using
>it since version 0.8.0, and have found it to be very effective against spam.
>I recently upgraded to 0.11.2 and am experiencing some problems.
>Here's what I am currently running bogofilter on:
>
>Gentoo Base System version 1.4.2.9 mostly stable packages,
>however I do have some unstable packages installed
>kernel version 2.4.19-gentoo-r9
>i686 AMD Duron(tm) Processor AuthenticAMD GNU/Linux
>fetchmail release 6.2.2+RPA+NTLM+SDPS+SSL+NLS
>procmail v3.22 2001/09/10
>Mutt 1.4.1i (2003-03-19)
>
>And finally on to the problem.
>Having just installed this version, I am training bogofilter from the start
>on all emails by hand, no -u in the script, so I expect, at least until I have
>several hundred emails processed through, to receive errors.
>Here's the count so far:
>bogoutil -w ~/.bogofilter .MSG_COUNT
>                spam   good
>MSG_COUNT        106     71
>All the spam messages thus far have been misidentified as false positive,
>I think, they end up in my normal mailbox, not in the spam folder as spam.
A correction to your terminology: since bogofilter's purpose is to detect
spam, a classification as spam is considered a "positive". Thus, a false
positive is when a ham message is classified as spam and a false negative
is when spam is classified as ham.
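
To make the terminology concrete, here is a minimal Python sketch that names
each outcome with "spam" as the positive class. The function name outcome()
and the 'spam'/'ham' labels are illustrative assumptions, not part of
bogofilter.

```python
def outcome(actual: str, predicted: str) -> str:
    """Name the outcome, treating 'spam' as the positive class.

    Both arguments must be either 'spam' or 'ham'.
    """
    if actual not in ("spam", "ham") or predicted not in ("spam", "ham"):
        raise ValueError("labels must be 'spam' or 'ham'")
    # "positive"/"negative" describes what the filter said;
    # "true"/"false" describes whether the filter was right.
    kind = "positive" if predicted == "spam" else "negative"
    truth = "true" if actual == predicted else "false"
    return f"{truth} {kind}"


# Spam that lands in the normal mailbox was classified as ham:
print(outcome(actual="spam", predicted="ham"))
```

By this naming, spam that ends up in the normal mailbox is a false negative,
and a legitimate message filed in the spam folder is a false positive.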
Rather than just tell us that bogofilter's getting the classification
wrong, could you tell us the scores it's giving?
>Some of the spam messages contain the **SPAM** identification tag in the
>subject line, others do not contain this tag. Here's the relevant
>portion of .procmailrc:
>
>#file the mail to spam-bogofilter if it's spam.
>:0
>* ^X-Bogosity: Yes, tests=bogofilter
>$MAILDIR/IN-zztrash
>The above was working in the prior version.
>
>In addition to the above problem, the X-Bogosity line appears
>in the message body area, at random places, sometimes just
>after the first line of the message, other times after the reply.
see below ...
>Here's the relevant portion of my .bogofilter.cf:
>stats_in_header=1
>I've tried the default stats_in_header=Y and that results in the same
>problem. /etc/bogofilter.cf has the default set as stats_in_header=Y
>
>Any suggestion as to what I can tweak to fix this?
The name of the "stats_in_header" option is a bit deceptive. The spam header
line, a.k.a. the X-Bogosity line, belongs in the message header. If you
use the "-vv" or "-vvv" flags, additional statistics will be added to the
message. The config option controls where those statistics are put, not
where the X-Bogosity line goes.
The X-Bogosity line sometimes isn't put in the header because of a
malformed header. The RFC says that a header ends with an empty line,
specifically a line containing only a CR-LF pair (since emails are delivered
with both characters). Sometimes there's a seemingly empty line at the end
of the header which actually contains whitespace. Very recently, bogofilter
was changed to accept these lines as end-of-header lines.
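
The difference between the strict and the lenient end-of-header rule can be
sketched in Python as follows. This is only an illustration of the idea, not
bogofilter's actual parser (which is written in C); the function name
split_header() and the "lenient" parameter are assumptions made for the
example.

```python
def split_header(raw: str, lenient: bool = True):
    """Split a raw message into (header_lines, body_lines).

    Strict mode ends the header only at a truly empty line.
    Lenient mode also accepts a line holding nothing but spaces or tabs.
    """
    lines = raw.splitlines()  # handles both CR-LF and bare LF line endings
    for i, line in enumerate(lines):
        if lenient:
            is_end = line.strip(" \t") == ""
        else:
            is_end = line == ""
        if is_end:
            return lines[:i], lines[i + 1:]
    # No separator found: everything is treated as header.
    return lines, []


# A message whose "blank" separator line actually contains a space.
msg = "Subject: hi\r\n \r\nbody text\r\n"
print(split_header(msg, lenient=False))
print(split_header(msg, lenient=True))
```

With the strict rule the separator is never found, so a line added "at the
end of the header" can land somewhere inside what the reader sees as the
body. With the lenient rule the header ends where it appears to end.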
I encourage you to download and use bogofilter-0.13.5. It's due to be
promoted to "stable" tomorrow (barring major problems). It has improved
parsing that enables it to do a much better job of classifying messages.
Hope this helps.