defining empty lines.

Jeremy Blosser jblosser-bogofilter at firinn.org
Tue May 20 00:08:52 CEST 2003


On May 17, David Relson [relson at osagesoftware.com] wrote:
> RFC2822 specifies "The body is simply a sequence of characters that follows 
> the header and is separated from the header by an empty line (i.e., a line 
> with nothing preceding the CRLF)."
> 
> Jeremy Blosser has encountered many spam messages where "\b\r\n" appears in 
> this position.  Bogofilter is looking for the truly empty lines for writing 
> out the "X-Bogosity" line (in passthrough mode) and gets it wrong for these 
> messages.

This isn't exactly correct; sorry if my other reports weren't clear.

I don't yet know exactly what is appearing in the line between the headers
and the body.  I was able to cause one failure case by putting \b\r\n
there: bogofilter just looked for the next blank line and used that as the
separator, while the MTA added a blank line before the \b\r\n and that
became the real separator, so the bogosity header ended up in the message
body.  However, that case (an extra blank line inserted by the MTA) isn't
what I'm actually seeing.  Whatever is in that blank-looking line after the
headers and throwing bogofilter off is getting destroyed before I see the
message (in our setup, bogofilter is the first thing that sees the message,
before it's even written to disk).  I haven't tried to catch one before
anything else sees it yet.  That's doable if necessary, but non-trivial
considering the volume of mail I'd have to look at.
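
For illustration only, the constructed \b\r\n case can be sketched roughly
like this.  This is not bogofilter's code; the function names and the
header value are made up for the example, and it only models "insert the
header before the first truly empty line":

```python
from typing import Optional


def strict_separator_index(raw: bytes) -> Optional[int]:
    """Index of the first truly empty line (nothing before the line ending)."""
    for i, line in enumerate(raw.splitlines(keepends=True)):
        if line in (b"\r\n", b"\n"):
            return i
    return None


def insert_before_separator(raw: bytes, header: bytes) -> bytes:
    """Insert header just before the first truly empty line."""
    lines = raw.splitlines(keepends=True)
    idx = strict_separator_index(raw)
    if idx is None:
        # No empty line at all: append the header at the end.
        return raw + header
    return b"".join(lines[:idx] + [header] + lines[idx:])


# Placeholder header value, not real bogofilter output.
HEADER = b"X-Bogosity: placeholder\r\n"

message = (
    b"From: someone@example.com\r\n"
    b"Subject: test\r\n"
    b"\b\r\n"                            # looks blank, but holds a backspace
    b"First paragraph of the body.\r\n"
    b"\r\n"                              # first truly empty line
    b"Second paragraph.\r\n"
)

result = insert_before_separator(message, HEADER)
```

Here the strict search skips the \b\r\n line, so the header lands after
"First paragraph of the body." instead of at the end of the real headers.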

On May 18, Adrian Ho [aho-sw-bogofilter at 03s.net] wrote:
> On Sat, May 17, 2003 at 03:51:25PM -0400, David Relson wrote:
> > P.S. Alternate solutions are welcomed.
> 
> If you're simply looking for a good place to write the X-Bogosity
> header, just tack it to the start of the message (ie. before every other
> header).  You'd of course want to skip over a leading mbox (From_)
> marker if it exists.

I concur with this, as I mentioned to David.  It doesn't just avoid this
kind of bug, it also makes it clear from the headers which box ran
bogofilter.
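
A minimal sketch of that approach, assuming the message is available as
raw bytes.  The function name and the header value are hypothetical, and
this is not how bogofilter itself is implemented:

```python
def prepend_header(raw: bytes, header: bytes) -> bytes:
    """Put header before all other headers, after a leading mbox From_ line.

    The header argument is expected to carry its own line ending.
    """
    # An mbox From_ line starts with "From " (with a space); a normal
    # "From:" header does not match this prefix.
    if raw.startswith(b"From "):
        end = raw.find(b"\n")
        if end == -1:
            # Only a From_ line with no line ending: add one, then the header.
            return raw + b"\n" + header
        return raw[:end + 1] + header + raw[end + 1:]
    return header + raw


# Placeholder header value, not real bogofilter output.
HEADER = b"X-Bogosity: placeholder\r\n"

mbox_message = (
    b"From sender@example.com Tue May 20 00:08:52 2003\r\n"
    b"Subject: test\r\n"
    b"\b\r\n"
    b"Body text.\r\n"
)
plain_message = b"From: sender@example.com\r\nSubject: test\r\n\r\nBody text.\r\n"

tagged_mbox = prepend_header(mbox_message, HEADER)
tagged_plain = prepend_header(plain_message, HEADER)
```

Because nothing here depends on finding the header/body separator, an odd
line such as \b\r\n after the headers cannot push the header into the body.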

On May 17, elijah [elijah at riseup.net] wrote:
> I don't know if this is exactly the same problem, but I frequently get
> legitimate mail where the x-bogosity header starts after the first
> paragraph of the message. In particular, I think the popular web-mail

That's probably the same thing I'm seeing.  Basically it ignores the first
blank line and uses the next one.  In my case the first thing in the body,
and so what the header shows up after, is usually a MIME header or a bogus
HTML line, but in plain-text mails it'd be the first paragraph.



