Ignore lists

David Relson relson at osagesoftware.com
Wed Mar 3 20:50:02 CET 2004


On Wed, 03 Mar 2004 11:41:57 -0800
Greg McCann wrote:

> On 3/3/2004 at 1:49 PM David Relson <relson at osagesoftware.com> wrote:
> 
> >True.  Ignoring tokens (via ignore lists) is different from ignoring
> >lines.  What ideas have you on this?  So far, "ignore 'X-ABC:' lines"
> >and "ignore 1st n ABC:" lines have been suggested.  What else?
> 
> I have a funny problem with my scoring that an ignore wordlist would
> probably help.  Email headers (at least with sendmail) always contain
> the current date.  My ham and spam corpuses (corpi?) are all from
> recent email and my spam corpus, which gets automatically updated from
> spamtrap addresses, is updated much more frequently than my ham, with
> about 1200 new spam every day.
> 
> The unexpected consequence is that every time the month changes, the
> abbreviation for the current month instantly gets a very high spam
> score until I manually throw some more ham at it.  Here's one from
> today.
> 
> "rcvd:Mar"  3790  0.000000  0.029003  0.999998 +
> 
> In this case, an ignore wordlist would probably be more useful than
> ignoring lines, since the "Received:" lines that contain the date also
> contain lots of other useful information, like the sender domain and
> IP address.

Greg,

Sounds like it would indeed help you (and possibly other newbies).
Since I've run bogofilter through a full set of months, it wouldn't
help me.
FYI, this is what I have:

[relson at osage bogofilter]$ bogoutil -p $BOGOFILTER_DIR rcvd:Jan rcvd:Feb \
   rcvd:Mar rcvd:Apr rcvd:May rcvd:Jun rcvd:Jul rcvd:Aug rcvd:Sep rcvd:Oct \
   rcvd:Nov rcvd:Dec

                                 spam    good    Fisher
rcvd:Jan                         6378    8999  0.469933
rcvd:Feb                         2734    4388  0.438005
rcvd:Mar                         3252    5266  0.435817
rcvd:Apr                         3311    4717  0.467527
rcvd:May                         4295    4987  0.518607
rcvd:Jun                         4630    3478  0.624794
rcvd:Jul                         2887    3948  0.477728
rcvd:Aug                         3299    4685  0.468317
rcvd:Sep                         6373    6239  0.560969
rcvd:Oct                         4670    8247  0.414633
rcvd:Nov                         7869    9985  0.496423
rcvd:Dec                        10706   11200  0.544566
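
To make the effect concrete, here is a rough Python sketch of how a token
ignore list could sit in front of a Robinson-style per-token estimate,
f(w) = (s*x + n*p) / (s + n). It is an illustration only, not bogofilter's
actual code: the function names, the ROBS and ROBX values, and the message
totals are all assumptions. Only the token name "rcvd:Mar" and its spam
count of 3790 come from the thread.

```python
# Illustrative sketch only, not bogofilter's implementation.
# ROBS, ROBX and the message totals used below are assumed values.
ROBS = 0.0178  # assumed weight given to the prior
ROBX = 0.52    # assumed score for a token never seen before


def token_score(spam_count, good_count, spam_msgs, good_msgs,
                robs=ROBS, robx=ROBX):
    """Robinson-style estimate f(w) = (s*x + n*p) / (s + n)."""
    n = spam_count + good_count
    if n == 0:
        return robx  # no evidence: fall back to the prior
    spam_freq = spam_count / spam_msgs
    good_freq = good_count / good_msgs
    p = spam_freq / (spam_freq + good_freq)
    return (robs * robx + n * p) / (robs + n)


def score_tokens(tokens, counts, spam_msgs, good_msgs, ignore=frozenset()):
    """Return {token: score}, skipping every token on the ignore list."""
    scores = {}
    for tok in tokens:
        if tok in ignore:
            continue  # ignored tokens never reach the scorer
        spam_count, good_count = counts.get(tok, (0, 0))
        scores[tok] = token_score(spam_count, good_count,
                                  spam_msgs, good_msgs)
    return scores


# Hypothetical wordlist: a month token seen only in spam so far,
# plus an invented token seen equally often in spam and ham.
counts = {"rcvd:Mar": (3790, 0), "rcvd:example.com": (100, 100)}
tokens = ["rcvd:Mar", "rcvd:example.com"]

unfiltered = score_tokens(tokens, counts, 100000, 100000)
filtered = score_tokens(tokens, counts, 100000, 100000,
                        ignore={"rcvd:Mar"})
```

Without the ignore list, the month token scores close to 1.0 because it has
no ham counts to balance it. With the list, it simply drops out and the rest
of the "Received:" tokens are scored as usual.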
