Problems with these messages. See below

Nigel Henry cave.dnb2m97pp at aliceadsl.fr
Tue Jan 27 19:26:59 CET 2009


On Wednesday 21 January 2009 01:11, David Relson wrote:
> On Tue, 20 Jan 2009 22:30:50 +0100
>
> Nigel Henry wrote:
> > I'm getting messages identified as No Subject, with no To, From, or
> > Date, and bogofilter is having problems with them.

> > Nigel.

> Did you forget the example?  'Twould be best to put it in a zip file
> and attach it.
>
> Tokens in the "Subject", "To", and "From" lines get special treatment,
> i.e. a prefix.  This allows bogofilter to score tokens in these lines
> differently from the same word in the body of the message.  Bogofilter
> can do its job without these lines (and their tokens).  As with all
> things bayesian, extra tokens are helpful but the algorithm will work
> even without the extras.
>
> Summary:  Keep training!  Bogofilter will eventually recognize the
> messages -- even lacking header lines.  I don't think there's anything I
> can do to help.
>
> Regards,
>
> David

Hi David.

Apologies for coming back with this problem. An example of a problem
message is attached as a .txt file below.

The subject line is set as "No Subject", and there are no entries for
"To", "From", or "Date", although the date sent appears within the body
of the message.

What confuses me is why the X-bogosity histogram and X-attachments are
not being removed from these messages when they turn up in the inbox as
spam misidentified as ham.

I freely admit to not being too clued up on the inner workings of
Bogofilter and Bayesian filtering. On the face of it, though, the
X-bogosity histogram and X-attachments appear to be part of the message
body rather than the headers on these spam messages I'm receiving, as you
can see from the attached .txt file.
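
One way to test that guess would be to run a saved copy of a message
through a standard mail parser and see where the X-bogosity line ends up.
The following is only a rough sketch using Python's standard email module.
It shows where Python's parser places the field, which is not necessarily
how Bogofilter or KMail read the same message. The function name
locate_field and both sample messages are made up for illustration.

```python
from email import message_from_string


def locate_field(raw_message, field="X-Bogosity"):
    """Say whether `field` was parsed as a header, sits in the body, or is absent.

    Hypothetical helper for illustration; not part of Bogofilter or KMail.
    """
    msg = message_from_string(raw_message)

    # Header lookup in the email module is case-insensitive.
    if msg[field] is not None:
        return "header"

    # For a non-multipart message the payload is the body as one string.
    body = msg.get_payload()
    if isinstance(body, str) and (field.lower() + ":") in body.lower():
        return "body"

    return "absent"


# Made-up sample: a normal message, headers first, then a blank line.
WITH_HEADERS = (
    "Subject: hello\n"
    "X-Bogosity: Ham, tests=bogofilter\n"
    "\n"
    "Just a normal body.\n"
)

# Made-up sample: the message starts with a blank line, so the header
# block is empty and everything that follows is treated as body text.
NO_HEADERS = (
    "\n"
    "X-Bogosity: Ham, tests=bogofilter\n"
    "Sent: Tue, 20 Jan 2009\n"
    "\n"
    "Buy now.\n"
)

if __name__ == "__main__":
    print(locate_field(WITH_HEADERS))
    print(locate_field(NO_HEADERS))
```

If a real problem message reports "body" here, that would at least be
consistent with the X-bogosity lines not being recognised as headers.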

I've been training bogofilter with these messages, with the X-bogosity and
X-attachments intact, but with no success. I'd have expected that with
continual training they'd start to appear in the unsure box by now, but no,
they all still turn up in the inbox.

I'm using Bogofilter 1.0.2 with KMail, and have no problems with other spam,
apart from the odd 419 that escapes the Debian mailing lists' filters.

Any suggestions, apart from moving to SpamAssassin, are welcome.

Nigel.

Attachment below.