Help! Strange classify behavior in ensim environment

Benji Tittle benji at tittle.net
Wed Aug 20 00:08:30 CEST 2003


Hi all,

Up for a bit of a challenge?

I'm getting some strange bogofilter classification glitches.  I'm using
0.14.5.  The catch:  I'm running bogofilter under an Ensim environment.
Ensim is a software package that hosting companies use to parcel out disk
space to subscribers.

What this means is that I "appear" to have my own /etc directory, and
three /home/... directories, one for each of the three users with POP
mailboxes and shell accounts under my account.  Each of these directories
in fact lives somewhere else on the server I'm renting space on... Ensim
basically gives me my own little sandbox so my hosting company can be
confident that I'm playing nice with the hundreds of other users sharing
my server.

Anyway, I have write access to "my" /etc/procmailrc, and have set it up to 
use bogofilter as seen here:
http://www.tittle.net/spam/procmailrc.txt

Basically I set $USER to myself, so the spam and nonspam for all three
users wind up under my mail directory.  I put spam into mail/spam, and
I pass ham/unsure on to the correct user while saving a copy in mail/ns.
Since I obviously don't have root on my rented server space, I build
bogofilter (static) on a Linux box at work and FTP it to /home/benji/bin.
I explicitly point bogofilter at the config file in /home/benji, which is here:
http://www.tittle.net/spam/bogofiltercf.txt

Here's what's happening.  I'm getting 2-3 spams a day that get
scored "unsure," and are therefore passed on to the correct mailbox.
Thing is, when I strip the X-Bogosity header out of one of these and feed
the message back to bogofilter, it usually scores as spam... even before
doing a bogofilter -N!  Once I run bogofilter -N, these messages are often
nearly maxed out on the spam scale, although bogofilter initially
classified them as unsure.
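
For anyone who wants to reproduce the re-test step, here is a minimal
sketch in Python of how the X-Bogosity header could be stripped from a
saved message before feeding it back to bogofilter.  The function name
strip_header is made up for illustration, and the header value in the
example is only a placeholder for whatever bogofilter actually wrote.
It works line by line so the rest of the message is left byte-for-byte
unchanged.

```python
def strip_header(raw: str, name: str = "X-Bogosity") -> str:
    """Return the message with every `name:` header (and its folded
    continuation lines) removed. The body is not touched."""
    # Headers end at the first blank line; everything after is body.
    head, sep, body = raw.partition("\n\n")
    kept = []
    skipping = False
    for line in head.split("\n"):
        if line[:1] in (" ", "\t"):
            # Continuation of the previous header: drop it only if
            # that header is the one being removed.
            if skipping:
                continue
        else:
            skipping = line.lower().startswith(name.lower() + ":")
            if skipping:
                continue
        kept.append(line)
    return "\n".join(kept) + sep + body


if __name__ == "__main__":
    import sys
    # Read a saved message on stdin, write the stripped copy to stdout.
    sys.stdout.write(strip_header(sys.stdin.read()))
```

Saved as a script, its output can be piped into bogofilter -v to see
the score for the message without the old tag.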

Here's a sample:
http://www.tittle.net/spam/sample.txt

When this message arrived, it scored unsure (0.711831) and wound up in my
inbox.  But immediately after receiving the message, I saved the entire
message (headers and all) to a file, removed the X-Bogosity header and fed
it to bogofilter -N (to undo the -u), then bogofilter -v... and it ranked
as spam! (0.900022)
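
To make the difference between the two results concrete, here is a
small sketch in Python of a three-way classification by cutoff.  The
cutoff values 0.1 and 0.9 are placeholders I picked for illustration
only; the real ones come from the config file linked above and may
differ.  The function name classify is likewise hypothetical.

```python
def classify(score: float, ham_cutoff: float, spam_cutoff: float) -> str:
    """Three-way classification: spam at or above spam_cutoff, ham at
    or below ham_cutoff, unsure in between."""
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "unsure"


# Placeholder cutoffs (assumptions, not taken from my config file).
HAM_CUTOFF = 0.1
SPAM_CUTOFF = 0.9

# The two scores reported for the same message.
for label, score in [("via procmail", 0.711831), ("command line", 0.900022)]:
    print(label, score, classify(score, HAM_CUTOFF, SPAM_CUTOFF))
```

With these placeholder cutoffs the first score lands in the unsure band
and the second one is classified as spam, which matches what I saw.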

Why would this message score differently when processed by procmail, vs.  
when processed on the command line in my user account?

TIA,
Benji




