Radical lexers
Boris 'pi' Piwinger
3.14 at logic.univie.ac.at
Wed Dec 10 15:26:31 CET 2003
David Relson wrote:
>> The result for b) hurts. It says (if it can be confirmed)
>> that we are doing much too complicated things when defining
>> a token. I did really not expect that lexer to work. But
>> well, that's how it is.
>>
>> c) is really mind-blowing. This simply MUST NOT work.
>
> It probably means that enough of your tokens are strictly alphanumeric
> that the others don't matter.
So that is the surprise here. If most of the mail you get is
in English, then ASCII will do the job. But I get a lot in
German, which usually means 8-bit characters. All those words
still work in b, but are broken up in c. So you would really
expect a failure here. But there is nothing too special about
the mail I get. I am completely taken by surprise.
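To illustrate what "broken up" means for German words, here is a minimal Python sketch. It is not taken from the actual lexer files: the two patterns are hypothetical stand-ins, assuming that a b-like lexer accepts 8-bit bytes inside a token and a c-like lexer accepts only ASCII alphanumerics, and that the mail is Latin-1 encoded.

```python
import re

# Hypothetical stand-ins, NOT the lexer definitions discussed in this thread.
# "b"-like: a token is a run of ASCII alphanumerics or 8-bit bytes (0x80-0xFF).
TOKEN_8BIT = re.compile(rb"[A-Za-z0-9\x80-\xff]+")
# "c"-like: a token is a run of ASCII alphanumerics only.
TOKEN_ASCII = re.compile(rb"[A-Za-z0-9]+")


def tokenize(pattern, text, encoding="latin-1"):
    """Encode text to bytes (Latin-1 assumed) and return the matched tokens."""
    raw = text.encode(encoding)
    return [tok.decode(encoding) for tok in pattern.findall(raw)]


sample = "Grüße aus München"
print(tokenize(TOKEN_8BIT, sample))   # ['Grüße', 'aus', 'München']
print(tokenize(TOKEN_ASCII, sample))  # ['Gr', 'e', 'aus', 'M', 'nchen']
```

Under these assumptions the ASCII-only pattern turns each word containing an umlaut or ß into short fragments, which is why one would expect the classification to suffer.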
Anyhow, if someone wants to play, I can offer the lexer files.
pi