radical lexer

David Relson relson at osagesoftware.com
Sun Nov 26 19:16:23 CET 2006


Hi pi,

A quick comparison of bogofilter's lexer_v3.l and your radical lexer
was interesting, particularly the following line:

TOKENBORDER [^[:blank:][:cntrl:]<>;&%@|/\\{}^"*,[\]=()+?:#$._!'`~-]

Evidently you're excluding lots of characters from tokens, notably the
dollar sign, period, underscore, exclamation mark, apostrophe, and
hyphen.  Thus the following tokens have different meanings for you and
me:

  $123
  domain.com
  domain_name
  bad!!!
  don't
  un-complicated
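
To make the difference concrete, here is a rough Python sketch of how
that character class would cut up the examples above.  It assumes a
token is simply a maximal run of characters matching TOKENBORDER, which
is my reading of the pattern and not necessarily how the radical lexer
uses it.  The names EXCLUDED_PUNCT, TOKEN_RUN and split_runs are made
up for illustration, and the excluded set is transcribed by hand from
the line quoted above.  It does not attempt to reproduce lexer_v3.l.

```python
import re

# Punctuation excluded by the quoted TOKENBORDER class (hand-transcribed).
EXCLUDED_PUNCT = "<>;&%@|/\\{}^\"*,[]=()+?:#$._!'`~-"
# [:blank:] is space and tab; [:cntrl:] is 0x00-0x1f plus 0x7f (ASCII).
BLANK = " \t"
CNTRL = "".join(chr(c) for c in range(0x20)) + "\x7f"

# Build the negated character class, escaping every member individually.
_members = "".join(re.escape(ch) for ch in EXCLUDED_PUNCT + BLANK + CNTRL)
TOKEN_RUN = re.compile("[^" + _members + "]+")


def split_runs(text):
    """Return the maximal runs of characters that match the class.

    Assumption: each such run is one token.
    """
    return TOKEN_RUN.findall(text)


if __name__ == "__main__":
    samples = ["$123", "domain.com", "domain_name",
               "bad!!!", "don't", "un-complicated"]
    for sample in samples:
        print(sample, "->", split_runs(sample))
```

Under that assumption "$123" comes out as "123", "bad!!!" as "bad",
and each of the other four is split into two pieces at the excluded
character.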

By the way, I probably would have used the name TOKENCHAR instead of
TOKENBORDER :->

Cheers!

David


