Radical lexers

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Wed Dec 10 15:26:31 CET 2003


David Relson wrote:

>> The result for b) hurts. It says (if it can be confirmed)
>> that we are doing much too complicated things when defining
>> a token. I did really not expect that lexer to work. But
>> well, that's how it is.
>> 
>> c) is really mind-blowing. This simply MUST NOT work.
> 
> It probably means that enough of your tokens are strictly alphanumeric
> that the others don't matter.

So that is the surprise here. If most of the mail you get is
in English, then ASCII will do the job. But I get a lot in
German, which usually means 8-bit characters. All those words
still work in b), but are broken up in c). So you would really
expect a failure here. But there is nothing too special about
the mail I get. I am completely taken by surprise.
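
To make the b) versus c) difference concrete, here is a rough sketch in
Python. The actual lexer files are not part of this message, so both
patterns are assumptions: TOKEN_8BIT stands in for a lexer that accepts
8-bit characters inside a token, and TOKEN_ASCII for one that accepts
only ASCII letters and digits. The names and the sample text are
hypothetical and only illustrate how German words get split.

```python
import re

# Hypothetical stand-ins for the two lexers; the real rules are not
# shown in this thread, so these patterns are assumptions.
TOKEN_8BIT = re.compile(rb"[A-Za-z0-9\x80-\xff]+")   # assumed shape of b)
TOKEN_ASCII = re.compile(rb"[A-Za-z0-9]+")           # assumed shape of c)


def tokenize(pattern, text, encoding="latin-1"):
    """Encode text to single-byte form and return every pattern match."""
    raw = text.encode(encoding)
    return [m.group().decode(encoding) for m in pattern.finditer(raw)]


sample = "Grüße aus Köln"
tokens_b = tokenize(TOKEN_8BIT, sample)
tokens_c = tokenize(TOKEN_ASCII, sample)

print(tokens_b)  # ['Grüße', 'aus', 'Köln']
print(tokens_c)  # ['Gr', 'e', 'aus', 'K', 'ln']
```

Under these assumptions the ASCII-only pattern turns each word with an
umlaut or sharp s into several short fragments, which is why a failure
would be the natural expectation for c).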

Anyhow, if someone wants to play, I can offer the lexer files.

pi



