What is a word (lexertest)

David Relson relson at osagesoftware.com
Wed Oct 23 23:50:09 CEST 2002


At 05:03 PM 10/23/02, Tom Allison wrote:

>David Relson wrote:
>>At 06:59 AM 10/22/02, Boris 'pi' Piwinger wrote:
>>
>>>Hi!
>>>
>>>Even though I don't code here, I tested something;-)
>>>
>>>[3.14@pi ~/local/bogolists]$ echo "»cmsg newgroup«"|lexertest
>>>get_token: 1 '»cmsg'
>>>get_token: 1 'newgroup«'
>>>[3.14@pi ~/local/bogolists]$ echo "bla"|lexertest
>>>
>>>Both results are not really satisfying. There might be a reason why
>>>the second does not return anything, but the first is wrong. Well,
>>>here we have the problem that we cannot tell without looking at the
>>>charset.
>>>
>>>pi
>>
>>pi,
>>There _is_ a problem with the lexer.
>>If a line contains exactly one token (composed only of letters and 
>>digits), the lexer will ignore it.

Tom,

This is a side-effect of the rule to ignore BASE64 lines.  The rule has 
been changed to only match lines with a single token longer than 32 
characters.  Any word of reasonable length will now be returned by the lexer.
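
To illustrate the changed rule, here is a rough sketch in Python. It is not the actual lexer code; the names looks_like_base64_line, tokens and MIN_BASE64_LEN are made up for this example, and the character set used to recognize a BASE64-looking token is an assumption. It only shows the idea: a line is skipped when it consists of one token longer than 32 characters, and anything else is tokenized as usual.

```python
import re

# Threshold from the description above: only single-token lines longer
# than 32 characters are treated as BASE64 and skipped.
MIN_BASE64_LEN = 32

# Assumption: a BASE64-looking token is letters, digits, '+' and '/',
# optionally followed by '=' padding. The real lexer pattern may differ.
_BASE64_TOKEN = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def looks_like_base64_line(line: str) -> bool:
    """Return True if the line is one long token that resembles BASE64."""
    token = line.strip()
    return (
        len(token) > MIN_BASE64_LEN
        and _BASE64_TOKEN.fullmatch(token) is not None
    )


def tokens(line: str) -> list[str]:
    """Split a line on whitespace, skipping lines that look like BASE64."""
    if looks_like_base64_line(line):
        return []
    return line.split()


if __name__ == "__main__":
    print(tokens("bla"))      # short single word: returned
    print(tokens("A" * 40))   # long single token: skipped
```

With this sketch, a short single word such as "bla" comes back as a token, while a 40-character run of letters on a line by itself is dropped.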

David





