What is a word (lexertest)
David Relson
relson at osagesoftware.com
Wed Oct 23 23:50:09 CEST 2002
At 05:03 PM 10/23/02, Tom Allison wrote:
>David Relson wrote:
>>At 06:59 AM 10/22/02, Boris 'pi' Piwinger wrote:
>>
>>>Hi!
>>>
>>>Even though I don't code here, I tested something;-)
>>>
>>>[3.14 at pi ~/local/bogolists]$ echo "»cmsg newgroup«"|lexertest
>>>get_token: 1 '»cmsg'
>>>get_token: 1 'newgroup«'
>>>[3.14 at pi ~/local/bogolists]$ echo "bla"|lexertest
>>>
>>>Both results are not really satisfying. There might be a reason why
>>>the second does not return anything, but the first is wrong. Well,
>>>here we have the problem that we cannot tell without looking at the
>>>charset.
>>>
>>>pi
>>
>>pi,
>>There _is_ a problem with the lexer.
>>If a line contains exactly one token (composed only of letters and
>>digits), the lexer will ignore it.
Tom,
This is a side-effect of the rule to ignore BASE64 lines. The rule has
been changed to only match lines with a single token longer than 32
characters. Any word of reasonable length will now be returned by the lexer.
David
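
For illustration only, here is a minimal Python sketch of the rule as described in this message. It is not bogofilter's actual lexer code. The names `tokens` and `looks_like_base64_line` are hypothetical, and treating a token as a run of ASCII letters and digits is an assumption taken from the wording above; the real lexer's character classes may differ.

```python
import re
from typing import List

# Threshold from the message: only single-token lines longer than
# 32 characters are treated as BASE64 and skipped.
BASE64_MIN_LEN = 32

# Assumption: a token is a run of ASCII letters and digits.
TOKEN = re.compile(r"[A-Za-z0-9]+")


def looks_like_base64_line(line: str, min_len: int = BASE64_MIN_LEN) -> bool:
    """Return True if the line is one single token longer than min_len."""
    text = line.strip()
    return len(text) > min_len and TOKEN.fullmatch(text) is not None


def tokens(line: str, min_len: int = BASE64_MIN_LEN) -> List[str]:
    """Return the tokens on a line, skipping lines that look like BASE64."""
    if looks_like_base64_line(line, min_len):
        return []
    return TOKEN.findall(line)
```

With the default threshold, `tokens("bla")` returns the word. Passing `min_len=0` approximates the old behavior, where any line holding exactly one token was dropped.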