Failure to properly parse mbox

Matthias Andree matthias.andree at gmx.de
Wed Jun 4 22:16:01 CEST 2003


David Relson <relson at osagesoftware.com> writes:

> Yes. I want them :-)  I take it personally when bogofilter doesn't treat
> a "^From " line as the beginning of a new message.

Incidentally, I read some pages about the Haskell and Objective Caml
languages, which are functional programming languages. I wonder if we
can have a sort-of functional approach for cascading (nesting) lexers,
such as tokenize(decode(MAIL)), or tokenize(decode(split(MBOX))).

We talked about that approach before: whether we could have bounded
buffer sizes or something, or have the inner lexers lex "on demand" for
an outer lexer. (Obviously, tokenize is the outer lexer, decode the
middle one, and split the innermost.) Of course, these
implementations don't have to be in lex. There are other parsers, and
the splitter in particular would have to be rather fast. Ideally,
this nesting/chaining would copy as little data as possible.
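
To make the "on demand" idea concrete, here is a rough sketch using
Python generators. It is not bogofilter code. The stage names split,
decode and tokenize are taken from the text above, but their bodies are
simplified assumptions: split starts a new message at every line
beginning with "From " (real mbox parsing is stricter), decode only
drops that separator line and undoes ">From " quoting (no MIME
decoding), and tokenize just extracts lowercase words. The counting
helper is hypothetical and exists only to show how much input has been
read.

```python
import re
from typing import Iterable, Iterator, List, Tuple


def split(lines: Iterable[str]) -> Iterator[List[str]]:
    """Innermost stage: yield one message (a list of lines) at a time.

    Assumption: every line starting with "From " begins a new message.
    """
    message: List[str] = []
    for line in lines:
        if line.startswith("From ") and message:
            yield message
            message = []
        message.append(line)
    if message:
        yield message


def decode(messages: Iterable[List[str]]) -> Iterator[List[str]]:
    """Middle stage (toy): drop the separator line, undo ">From " quoting."""
    for message in messages:
        body = message[1:] if message[0].startswith("From ") else message
        yield [line[1:] if line.startswith(">From ") else line for line in body]


def tokenize(messages: Iterable[List[str]]) -> Iterator[Tuple[int, str]]:
    """Outer stage: yield (message index, lowercase word) pairs."""
    for index, message in enumerate(messages):
        for line in message:
            for token in re.findall(r"[A-Za-z0-9']+", line):
                yield index, token.lower()


MBOX = [
    "From alice Wed Jun  4 22:16:01 2003",
    "Subject: hello",
    "",
    ">From here on, plain text",
    "From bob Wed Jun  4 22:17:00 2003",
    "Subject: second message",
]

consumed: List[str] = []


def counting(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical helper: record each input line as it is pulled."""
    for line in lines:
        consumed.append(line)
        yield line


tokens = tokenize(decode(split(counting(MBOX))))
first = next(tokens)          # pulls only as much input as one message needs
print(first, len(consumed))   # (0, 'subject') 5
```

Asking the outer lexer for a single token makes the splitter read only
the first message plus the "From " line that ends it, so the buffer is
bounded by one message rather than by the whole mbox. The sketch still
copies each message into a list; a version that passes offsets or
slices into a shared buffer would copy less.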

-- 
Matthias Andree



