defining empty lines.

David Relson relson at osagesoftware.com
Thu May 29 16:05:01 CEST 2003


At 09:05 AM 5/29/03, Matthias Andree wrote:
>David Relson <relson at osagesoftware.com> writes:
> > As I recall, the '\b' was in the message _after_ procmail, etc. had
> > processed it.  Additional monitoring found that some messages had a
> > separator line with a single blank character, 0x20, on it.  I found
> > several of them in my incoming mail (from early April).  All were from
> > the same source.
>
>I wonder if we should "fix" nonconforming mail in passthrough mode,
>i.e. insert the header and the empty line at the first non-header line,
>and consider the first all-whitespace/control line to be the "empty"
>line. It might be that this is just another attempt to subvert filters.
>What does Outlook do with such lines (' ' rather than '\b')?

Matthias,

I don't think we should have bogofilter "fix" nonconforming 
mail.  Bogofilter's job is detection and classification, not fixing.

The message thread that sparked the discussion of empty lines is titled 
"Re: 0.12.1 problem [was: bogofilter and qmail-qfilter]".  Jeremy Blosser 
had reported the problem and found that "\b\r\n" would produce the same 
result - bogofilter putting the X-Bogosity line in the message 
body.  Here's his actual comment:

 > It doesn't look like it's bare CR, but I've sort of reproduced it.
 >
 > If the "blank" separator line between the header and body isn't actually
 > blank, but contains a non-printing character... e.g.:

He wasn't sure exactly what the original message looked like.

I did some scanning of the spam I had and found several messages (from the 
same source) in which the headers "ended" with

" \r\n" (0x20, 0x0D, 0x0A)
"\r\n"  (0x0D, 0x0A)

That finding resulted in my adding is_blank_line() to main.c with a check 
for isspace(c) || (c == '\b').  If the '\b' causes trouble, we can always 
delete it and see if we're better off without it.
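
For anyone following along, here is a minimal sketch of the kind of check 
described above.  It is not the actual code from main.c; the signature 
(a buffer plus a length) and the parameter names are assumptions made for 
illustration.  It treats a line as "blank" when every byte is either 
whitespace according to isspace() or a backspace ('\b').

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative sketch only: the signature is an assumption, not the
 * real bogofilter function.  Returns true when every byte in
 * line[0..len) is whitespace (per isspace) or a backspace. */
bool is_blank_line(const char *line, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* Cast to unsigned char: passing a negative char value to
         * isspace() is undefined behavior. */
        unsigned char c = (unsigned char)line[i];
        if (!(isspace(c) || c == '\b'))
            return false;
    }
    return true;
}
```

With this check, "\r\n", " \r\n" and "\b\r\n" would all count as the 
header/body separator, while a line with any printing character would 
not.  The explicit '\b' test matters because isspace() does not accept 
backspace in the "C" locale.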

David






