ALPHA [was: lexer change]

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Tue Nov 11 00:03:55 CET 2003


David Relson <relson at osagesoftware.com> wrote:

>> ALPHA is never used and is identical to A2.
>> 
>> A2 is defined as [[:alpha:]][[:alnum:]]+ which is AFAICS the
>> same as [[:alnum:]]+. Is that correct? Or does it mean: one
>> alpha character followed by at least one alnum character?
>
>They are different.  Read the flex documentation or create a test lexer and
>test it.

I tried to read it but failed to understand it.
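
The difference can also be checked outside flex. A minimal sketch in
Python, assuming ASCII ranges as stand-ins for the POSIX classes
(Python's re module does not support [[:alpha:]] or [[:alnum:]]); the
names A2_WITH_PLUS, ALNUM_PLUS and matches are made up for
illustration and do not come from the lexer source:

```python
import re

# Assumption: ASCII stand-ins for the POSIX classes used in the flex file.
ALPHA = "[A-Za-z]"
ALNUM = "[A-Za-z0-9]"

# [[:alpha:]][[:alnum:]]+ : one letter, then at least one letter or digit.
A2_WITH_PLUS = re.compile(ALPHA + ALNUM + "+")
# [[:alnum:]]+ : one or more letters or digits, in any order.
ALNUM_PLUS = re.compile(ALNUM + "+")

def matches(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None

# Print how each sample string fares against both patterns.
for text in ["ab", "a1b2", "a", "1a", "123"]:
    print(text, matches(A2_WITH_PLUS, text), matches(ALNUM_PLUS, text))
```

Strings such as "a", "1a" and "123" are where the two patterns should
disagree: the first requires a leading letter and a minimum length of
two, the second requires neither.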

>There is an error.  The trailing "+" does not belong in either A1 or A2.

OK, that makes a difference.

>A1 is needed for the places where a single letter needs to be identified
>for use in a token and A2 is needed for a single letter followed by a
>letter or a digit.  An example is a token split by an html comment, i.e.
>"T<!xxx>ha<!xx>t".

I don't understand why it is only single letters, or letters
followed by one letter or digit, and not sequences of any length.

>> In either case: TOKEN_12 is the only place where A1 and A2
>> are used. Since anything of the form A1 is also of the form
>> A2, it would be sufficient to define TOKEN_12 as
>> ({TOKEN}|{A2}).
>
>"a" is of form A1 but not A2.

Clear if the "+" signs are dropped.
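
To make that concrete, here is a small sketch in Python. It assumes
that, with the trailing "+" removed, A1 is [[:alpha:]] and A2 is
[[:alpha:]][[:alnum:]], as described above, and it again uses ASCII
ranges in place of the POSIX classes. The helper is_form is
hypothetical:

```python
import re

# Assumption: ASCII stand-ins for [[:alpha:]] and [[:alnum:]].
ALPHA = "[A-Za-z]"
ALNUM = "[A-Za-z0-9]"

A1 = re.compile(ALPHA)          # a single letter (assumed definition)
A2 = re.compile(ALPHA + ALNUM)  # a letter followed by one letter or digit

def is_form(pattern, text):
    """Return True if the whole string has the given form."""
    return pattern.fullmatch(text) is not None

# Compare each sample string against both forms.
for text in ["a", "ab", "a1", "abc"]:
    print(text, is_form(A1, text), is_form(A2, text))
```

Under these assumptions neither form contains the other: a single
letter is A1 only, a two-character string starting with a letter is
A2 only, and anything longer is neither.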

>Thanks for your close reading of the code.  It has been very helpful in
>spotting code that _looks_ ok (on casual inspection) but is actually
>incorrect.

:-))

pi




More information about the bogofilter-dev mailing list