lexer changes
Boris 'pi' Piwinger
3.14 at logic.univie.ac.at
Tue Nov 11 15:50:59 CET 2003
David Relson wrote:
> As has been said several times before, my test sets a fp target of 0.25%
> (of the message count), finds the cutoff value that corresponds to that
> target, scores the spam, and reports the false negative counts. There's
> no need to report the value for each test because the target is fixed at
> 0.25%
If you look closely at my tests, this produces more or less
random results. Example:
wo (fn): 0.500000 26 23 19 68
wo (fp): 0.500000 5 4 4 13
wi (fn): 0.581092 50 41 41 132
wi (fp): 0.581092 3 2 1 6
wi (fn): 0.499993 26 23 19 68
wi (fp): 0.499993 6 4 5 15
wi (fn): 0.457261 15 15 14 44
wi (fp): 0.457261 14 10 8 32
wo (fn): 0.500000 30 30 20 80
wo (fp): 0.500000 4 5 3 12
wi (fn): 0.546680 41 36 34 111
wi (fp): 0.546680 2 2 2 6
wi (fn): 0.499780 29 30 19 78
wi (fp): 0.499780 5 6 4 15
wi (fn): 0.457308 16 18 13 47
wi (fp): 0.457308 14 10 8 32
Which one is better? For 6 fp the second is better, for 15
fp and 32 fp the first. So you make your decision depending
on your choice of fp target.
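To make the comparison explicit, here is a minimal sketch in Python (not part of any test script; the names are made up for illustration). It takes the fp and fn totals of the wi runs from the two blocks above and reports which block has fewer false negatives at each fp total.

```python
# fp total -> fn total for the "wi" runs, copied from the two blocks above.
# "first" is the block whose wo run shows 68 fn, "second" the one with 80 fn.
first = {6: 132, 15: 68, 32: 44}
second = {6: 111, 15: 78, 32: 47}


def better_run(fp_total):
    """Return which block has fewer false negatives at this fp total."""
    a, b = first[fp_total], second[fp_total]
    if a == b:
        return "tie"
    return "first" if a < b else "second"


if __name__ == "__main__":
    for fp_total in sorted(first):
        print(fp_total, first[fp_total], second[fp_total], better_run(fp_total))
```

The winner flips between 6 fp and 15 fp, which is the point: the verdict depends on where the target is put.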
In others of those tests you see that there is a big
difference in the fp's in the first place. But if you shift
to some target you don't see that anymore.
wo (fn): 0.500000 22 18 24 64
wo (fp): 0.500000 5 4 6 15
wi (fn): 0.500248 22 18 24 64
wi (fp): 0.500248 5 4 5 14
wo (fn): 0.500000 24 22 22 68
wo (fp): 0.500000 4 4 3 11
wi (fn): 0.499999 24 22 21 67
wi (fp): 0.499999 5 4 4 13
You also see that the target was missed: there ain't no such
thing as a *fixed* target. So you may not actually be comparing
the same numbers.
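One way a target can be missed is that fp counts only change in whole steps as the cutoff moves. A rough sketch in Python, under stated assumptions: the scores are toy values invented for this illustration, the function name is hypothetical, and "spam when score >= cutoff" is my assumption here, not necessarily how the test script decides.

```python
def cutoff_for_target(ham_scores, target_fp):
    """Find the lowest cutoff giving at most target_fp false positives.

    Assumption for this sketch: a message is called spam when its
    score is >= cutoff. Returns (cutoff, achieved_fp).
    """
    # A cutoff above every score classifies nothing as spam.
    best = (float("inf"), 0)
    # Only cutoffs equal to an observed score change the fp count.
    for cutoff in sorted(set(ham_scores), reverse=True):
        fp = sum(1 for s in ham_scores if s >= cutoff)
        if fp > target_fp:
            break
        best = (cutoff, fp)
    return best


# Toy ham scores, invented for illustration only.
ham = [0.10, 0.20, 0.46, 0.46, 0.46, 0.50, 0.58]

if __name__ == "__main__":
    print(cutoff_for_target(ham, 3))
```

With three messages tied at 0.46, a target of 3 fp cannot be hit: the cutoff gives either 2 or 5.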
>> >> 1) Some \ slipped back in. Out again.
>> >
>> > None of them "slipped" in.
>>
>> Actually in
>> <20031104092536.1d799059.relson at osagesoftware.com> you had
>> some out which are now back in. Example:
>> +TOKENBACK [^[:blank:]<>;=():&%$#@+|/\\{}^\"?*,[:cntrl:][\]._+-]
>
> Yes, I put them back in because "+" and "-" are special characters in
> many flex constructs and having the backslashes will help avoid future
> problems if the expressions are modified.
+ is not special in a character class. BTW, the class now
contains + twice. From the man page:
Note that inside of a character class, all regular expression
operators lose their special meaning except escape ('\') and
the character class operators, '-', ']', and, at the beginning
of the class, '^'.
Actually, a - at the end or at the beginning of the class is also fine.
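As an analogy only: Python's re module treats these two cases the same way, so the behaviour is easy to try without rebuilding the lexer. This sketch does not exercise flex itself, and the pattern and helper names are made up for illustration.

```python
import re

# '+' inside a class is a literal plus, with or without a backslash.
plain = re.compile(r"[a+]")
escaped = re.compile(r"[a\+]")

# '-' as the first or last class member is a literal minus, not a range.
minus_last = re.compile(r"[a-]")
minus_first = re.compile(r"[-a]")


def matches(pattern, chars):
    """Return the characters of chars that the class accepts."""
    return [c for c in chars if pattern.fullmatch(c)]


if __name__ == "__main__":
    for p in (plain, escaped, minus_last, minus_first):
        print(p.pattern, matches(p, "a+b-"))
```

The escaped and unescaped + classes accept the same characters, so the backslash changes nothing in that position.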
>> >> I cannot find the price range rule which is announced in the
>> >> comment of 12 May 2003.
>> >
>> > Hint: look for the word "dollar"
>>
>> I only find that a token is returned. That token does not
>> allow for any -..
>
> So you want to include a minus sign???
I'm not sure. I just read that we do something with it, and I
cannot find it.
pi