New header token tagging
David Relson
relson at osagesoftware.com
Thu Sep 25 19:00:11 CEST 2003
On Thu, 25 Sep 2003 09:43:54 -0700
"Greg McCann" <greg at cambria.com> wrote:
> On 9/25/2003 at 12:05 PM David Relson <relson at osagesoftware.com>
> wrote:
>
> >Question 1: Has anybody else noticed an effect from the new header
> >tagging? If so, what have you noticed?
>
> I am only a beginning bogofilter user and I haven't analyzed this
> extensively yet, so my observations may or may not be valid. But for
> what it's worth, I have had problems since upgrading from 0.13.6.2 to
> 0.15.4 last night.
>
> I am processing all of my own incoming email with...
>
> | bogofilter -uepl
>
> Also, I have several spamtraps that I am automatically directing to...
>
> |/usr/local/bin/bogofilter -s
>
> This morning, a user complained that mail that used to be spamicity 0
> was now spamicity .5.
>
> I checked a couple of messages with -vv and found this...
>
> ...
> "head:text" 194 0.000000 0.007838 0.999970 +
> "head:Date" 196 0.000000 0.007919 0.999970 +
> "rcvd:GMT" 196 0.000000 0.007919 0.999970 +
> "rcvd:Received" 196 0.000000 0.007919 0.999970 +
> "rcvd:Sep" 196 0.000000 0.007919 0.999970 +
> "rcvd:Thu" 196 0.000000 0.007919 0.999970 +
> "rcvd:from" 196 0.000000 0.007919 0.999970 +
> ...
>
> Note that these are all in the positive spamicity category. All of
> these fields, which should appear equally in spam and ham, have
> been registered as spam.
>
> When I use bogoutil to check the database entries, I see this...
>
> # bogoutil -d /home/bogofilter/goodlist.db | grep "rcvd:from"
> rcvd:from 0 20030925
> # bogoutil -d /home/bogofilter/spamlist.db | grep "rcvd:from"
> rcvd:from 197 20030925
>
> It seems that the header fields are only being registered in the spam
> database, not the ham database.
>
> If this continues, it will soon start classifying all of my good email
> as spam.
>
>
>
> Greg
Greg,
Your conclusions all sound correct to me. What messages have you been
using to keep bogofilter's training up to date?
From the info you present, it looks like 196 spam have been recently
added and 0 ham.
I'd suggest grabbing 200 ham and registering them (bogofilter -n) to
restore the balance.
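
To see how widespread the imbalance is before retraining, the two dumps can be compared token by token. What follows is only a rough sketch, not part of bogofilter: it assumes each line of "bogoutil -d" output has the "token count date" form shown in the quoted excerpt, and the names parse_dump and one_sided_tokens are made up for illustration.

```python
def parse_dump(text):
    """Parse dump lines of the assumed form 'token count yyyymmdd'.

    Returns a dict mapping each token to its integer count.
    Lines that do not match the assumed layout are skipped.
    """
    counts = {}
    for line in text.splitlines():
        # Split from the right so the last two fields are count and date.
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        token, count, _date = parts
        if count.isdigit():
            counts[token] = int(count)
    return counts


def one_sided_tokens(spam_dump, ham_dump, min_spam=1):
    """List tokens seen at least min_spam times in spam and never in ham."""
    spam = parse_dump(spam_dump)
    ham = parse_dump(ham_dump)
    return sorted(
        token
        for token, n in spam.items()
        if n >= min_spam and ham.get(token, 0) == 0
    )


if __name__ == "__main__":
    # Sample lines copied from the quoted bogoutil output.
    spam_dump = "rcvd:from 197 20030925\n"
    ham_dump = "rcvd:from 0 20030925\n"
    print(one_sided_tokens(spam_dump, ham_dump))
```

Feeding it the saved output of the two bogoutil -d commands quoted above would list every token that has been counted in the spam list but never in the ham list. After registering ham, the header tokens such as rcvd:from should drop off that list.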
David