New header token tagging

David Relson relson at osagesoftware.com
Thu Sep 25 19:00:11 CEST 2003


On Thu, 25 Sep 2003 09:43:54 -0700
"Greg McCann" <greg at cambria.com> wrote:

> On 9/25/2003 at 12:05 PM David Relson <relson at osagesoftware.com>
> wrote:
> 
> >Question 1:  Has anybody else noticed an effect from the new header
> >tagging?  If so, what have you noticed?
> 
> I am only a beginning bogofilter user and I haven't analyzed this
> extensively yet, so my observations may or may not be valid.  But for
> what it's worth, I have had problems since upgrading from 0.13.6.2 to
> 0.15.4 last night.
> 
> I am processing all of my own incoming email with...
> 
> | bogofilter -uepl
> 
> Also, I have several spamtraps that I am automatically directing to...
> 
> |/usr/local/bin/bogofilter -s
> 
> This morning, a user complained that mail that used to be spamicity 0
> was now spamicity .5.
> 
> I checked a couple of messages with -vv and found this...
> 
> ...
> "head:text"                        194  0.000000  0.007838  0.999970 +
> "head:Date"                        196  0.000000  0.007919  0.999970 +
> "rcvd:GMT"                         196  0.000000  0.007919  0.999970 +
> "rcvd:Received"                    196  0.000000  0.007919  0.999970 +
> "rcvd:Sep"                         196  0.000000  0.007919  0.999970 +
> "rcvd:Thu"                         196  0.000000  0.007919  0.999970 +
> "rcvd:from"                        196  0.000000  0.007919  0.999970 +
> ...
> 
> Note that these are all in the positive spamicity category.  These
> fields, which should appear equally in spam and ham, have all been
> registered as spam.
> 
> When I use bogoutil to check the database entries, I see this...
> 
> # bogoutil -d /home/bogofilter/goodlist.db | grep "rcvd:from"
> rcvd:from 0 20030925
> # bogoutil -d /home/bogofilter/spamlist.db | grep "rcvd:from"
> rcvd:from 197 20030925
> 
> It seems that the header fields are only being registered in the spam
> database, not the ham database.
> 
> If this continues, it will soon start classifying all of my good email
> as spam.
> 
> 
> 
> Greg

Greg,

Your conclusions all sound correct to me.  What messages have you been
using to keep bogofilter's training up to date?  

From the info you present, it looks like 196 spam have been recently
added and 0 ham.
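
To make the imbalance concrete, here is a minimal sketch that reads the two bogoutil lines you quoted and puts the counts side by side. It assumes each dump line has the form "token count date", as in your output; parse_dump_line is just an illustrative helper name, not part of bogofilter.

```python
def parse_dump_line(line):
    """Split one 'token count date' dump line into (token, count, date).

    The three-field layout is assumed from the output quoted in this thread.
    """
    token, count, date = line.rsplit(None, 2)
    return token, int(count), date


# The two lines quoted above, one from each word list.
good = parse_dump_line("rcvd:from 0 20030925")
spam = parse_dump_line("rcvd:from 197 20030925")

print(f"{good[0]}: ham={good[1]} spam={spam[1]}")
```

A token that appears in nearly every message, such as rcvd:from, should show counts of similar size in both lists once ham and spam are registered at comparable rates.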

I'd suggest grabbing 200 ham and registering them (bogofilter -n) to
restore the balance.
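
As a rough illustration of why balance matters, the sketch below applies Robinson's f(w) formula to the numbers in your -vv listing. It assumes the columns are token count, ham frequency, spam frequency and token score, and it assumes robs = 0.01 and robx = 0.415; those two values are guesses that happen to reproduce the quoted 0.999970, so check your own configuration. The function name is illustrative only.

```python
ROBS = 0.01    # assumed strength of the prior (not confirmed in this thread)
ROBX = 0.415   # assumed score for a token with no history (also a guess)


def robinson_fw(pgood, pbad, n, s=ROBS, x=ROBX):
    """Robinson's f(w) = (s*x + n*p) / (s + n), with p = pbad / (pgood + pbad)."""
    total = pgood + pbad
    p = pbad / total if total > 0 else x
    return (s * x + n * p) / (s + n)


# "rcvd:from" as quoted: seen 196 times, all of them in spam.
spam_only = robinson_fw(0.0, 0.007919, 196)

# Hypothetical balanced case: the same frequency in ham and in spam.
balanced = robinson_fw(0.007919, 0.007919, 392)

print(f"{spam_only:.6f}")   # 0.999970, matching the listing
print(f"{balanced:.6f}")    # close to 0.5, i.e. neutral
```

Once the header tokens show up in ham about as often as in spam, their scores move toward 0.5 and they stop pushing every message toward spam.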

David



