bogofilter 1.2.2 crashes with "flex scanner push-back overflow"

Matthias Andree matthias.andree at gmx.de
Sun May 8 04:06:31 CEST 2011


On 07.05.2011 23:24, Seth David Schoen wrote:
> Hi,
> 
> (I sent this message yesterday but it didn't make it to the list,
> maybe because the original version had an attachment.)
> 
> I'm using Ubuntu Maverick's bogofilter 1.2.2 package to filter spam.
> Today I found that bogofilter was crashing with the error "flex scanner
> push-back overflow" and not filtering spam.  I identified the two
> particular spam messages that were causing this problem, and I found
> that they would make bogofilter crash every time.  I've also confirmed
> that they make bogofilter 1.2.2 crash on another machine, running FreeBSD,
> with no wordlist.db, so I think there is a real bug here.
> 
> The spam messages seem to consist of a huge number of (long) separately
> koi8-encoded tokens.  Their contents were identical except for the date
> and recipient address.  I've posted one of the original messages at
> 
> http://www.loyalty.org/~schoen/spam.bz2

I've looked at this, and ultimately we're trying to push back some 160
kBytes of input, which exceeds the lexer's static buffer size.
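To make the failure mode concrete, here is a toy model of a fixed-size
push-back area. It is only a sketch: the struct, the function name and
the PUSHBACK_CAP value are made up for illustration and are not flex's
actual data structures or sizes. The point is that a request larger than
the remaining room cannot be honoured, and a generated scanner treats
that as a fatal error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only; NOT flex's real buffer layout.
 * PUSHBACK_CAP is an arbitrary illustrative size (assumption). */
#define PUSHBACK_CAP 16384

struct pushback {
    char   buf[PUSHBACK_CAP];
    size_t used;              /* bytes currently held in the push-back area */
};

/* Try to push len bytes back.
 * Returns 0 on success, -1 if the bytes do not fit in the remaining room.
 * The byte order a real scanner would use is not modelled here. */
static int pushback_put(struct pushback *pb, const char *text, size_t len)
{
    assert(pb != NULL && text != NULL);

    if (len > PUSHBACK_CAP - pb->used)
        return -1;            /* overflow: caller must handle this case */

    memcpy(pb->buf + pb->used, text, len);
    pb->used += len;
    return 0;
}
```

With this model, pushing back a few bytes succeeds, while a single
160 kByte request is rejected and leaves the buffer unchanged.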

To fix this particular problem, we'd need to get rid of {ENCODED_TOKEN}
and thereabouts in the .l file and move the functionality to lexer.c.  I
am almost there, but have a nasty borderline quirk left to fix that
causes the iconv() (character set conversion) to run twice, breaking the
string.
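Why a second conversion pass breaks the string can be shown with a small
sketch. Everything here is hypothetical: the token struct, the
"converted" flag and the function name are not taken from lexer.c, and a
hand-written ISO-8859-1 to UTF-8 step stands in for the real iconv()
call so that the example stays self-contained. One possible guard is to
record on the token that it has already been converted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token record; the 'converted' flag is the point of the sketch. */
struct token {
    unsigned char text[256];
    size_t        len;
    int           converted;  /* nonzero once text already holds UTF-8 */
};

/* Stand-in for the iconv() step: ISO-8859-1 to UTF-8.
 * Returns 0 on success (or if nothing needed doing), -1 if the
 * converted text would not fit into the token's buffer. */
static int token_to_utf8(struct token *t)
{
    unsigned char out[sizeof t->text];
    size_t i, n = 0;

    assert(t != NULL && t->len <= sizeof t->text);

    if (t->converted)
        return 0;             /* guard: never convert the same token twice */

    for (i = 0; i < t->len; i++) {
        unsigned char c = t->text[i];
        if (c < 0x80) {
            if (n + 1 > sizeof out)
                return -1;
            out[n++] = c;     /* ASCII passes through unchanged */
        } else {
            if (n + 2 > sizeof out)
                return -1;
            out[n++] = (unsigned char)(0xC0 | (c >> 6));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }

    memcpy(t->text, out, n);
    t->len = n;
    t->converted = 1;
    return 0;
}
```

With the flag set, a second call leaves the token alone. If the flag is
cleared and the function runs again, the already converted bytes are
treated as ISO-8859-1 and encoded once more, so a single 0xE9 byte ends
up as four bytes instead of two.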

I'll send David Relson what I have, perhaps he can continue -- I have
something else to do on short notice and shouldn't even have attempted
this today.


