multiple lexers.

Gyepi SAM gyepi at praxis-sw.com
Mon Dec 30 05:56:27 CET 2002


On Sun, Dec 29, 2002 at 11:41:11PM -0500, David Relson wrote:
> At 11:26 PM 12/29/02, Gyepi SAM wrote:
> 
> >On Mon, Dec 30, 2002 at 04:28:19AM +0100, Matthias Andree wrote:
> >
> >Yes. If we are going to do MIME, it needs to be able to handle
> >all valid mime constructs
> >(in addition to many invalid ones whenever possible);
> 
> For classifying spam, I think some details can be glossed over.  For 
> example, 8 bit data in a 7 bit message is incorrect but since we need to be 
> tolerant ...

You're right. We don't need to be too strict, but we do need to handle
the large variety of possible MIME combinations correctly.

> >> BTW: it's not exactly helpful to mix parsing boundary= parameters and
> >> --BOUNDARY treatment in the same function, it makes the API ugly. The
> >> boundary= treatment belongs into the Content-Type: parser,
> >
> >I have actually changed this in my private copy...
> 
> I admit that I implemented the first thing I thought of and released it.  A 
> little more thought suggests two functions, perhaps named set_boundary() 
> and check_boundary().  What did you do?  If it's good, lay it on us :-)

Well, it's really a small change.
I moved the boundary-setting code into a routine called set_mime_boundary()
and changed the call in lexer.l to match. The mime_boundary() routine now just
checks boundaries and pushes new MIME parts onto the stack. It will
change further to also pop parts off the stack when we encounter their closing
delimiter.
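
To make the split concrete, here is a rough sketch of how the two routines
could divide the work. This is not the code in my tree: the struct, the
return values, the signatures and the fixed-size stack are all assumptions
made for illustration, and the pop on a closing delimiter is the planned
behaviour rather than something that exists yet.

```c
#include <assert.h>
#include <string.h>

#define MAX_PARTS    16   /* hypothetical nesting limit */
#define MAX_BOUNDARY 70   /* RFC 2046 caps boundaries at 70 characters */

enum delim { DELIM_NONE, DELIM_OPEN, DELIM_CLOSE };

/* Hypothetical per-part state; the real structure may differ. */
struct part { char boundary[MAX_BOUNDARY + 1]; };

static struct part parts[MAX_PARTS];  /* parts[0] is the top-level message */
static int top = 0;                   /* index of the part being read */

/* Called from the Content-Type parser with the boundary= value.
 * Records it on the current part; returns -1 if the value is unusable. */
int set_mime_boundary(const char *value)
{
    size_t len = strlen(value);

    assert(top >= 0 && top < MAX_PARTS);
    if (len == 0 || len > MAX_BOUNDARY)
        return -1;
    memcpy(parts[top].boundary, value, len + 1);
    return 0;
}

/* True if s holds only trailing whitespace or line terminators. */
static int only_space(const char *s)
{
    while (*s == ' ' || *s == '\t' || *s == '\r' || *s == '\n')
        s++;
    return *s == '\0';
}

/* Called from the lexer for each body line. Checks the line against
 * every open boundary, innermost first, and adjusts the stack. */
enum delim mime_boundary(const char *line)
{
    if (line[0] != '-' || line[1] != '-')
        return DELIM_NONE;

    for (int i = top; i >= 0; i--) {
        size_t len = strlen(parts[i].boundary);
        const char *rest;

        if (len == 0 || strncmp(line + 2, parts[i].boundary, len) != 0)
            continue;
        rest = line + 2 + len;

        if (rest[0] == '-' && rest[1] == '-' && only_space(rest + 2)) {
            /* "--boundary--": pop the children, retire the boundary */
            top = i;
            parts[i].boundary[0] = '\0';
            return DELIM_CLOSE;
        }
        if (only_space(rest)) {
            /* "--boundary": drop unclosed children, push a new part */
            if (i + 1 >= MAX_PARTS)
                return DELIM_NONE;  /* too deep: treat as body text */
            top = i + 1;
            parts[top].boundary[0] = '\0';
            return DELIM_OPEN;
        }
    }
    return DELIM_NONE;
}
```

Searching from the innermost part outwards means that a message which never
closes an inner multipart still recovers when the outer boundary shows up,
which matters given how much broken MIME we see in spam.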

-Gyepi