Test with different lexers
Boris 'pi' Piwinger
3.14 at logic.univie.ac.at
Tue Dec 2 15:35:50 CET 2003
David Relson wrote:
>> > As a second example, "!" is accepted at the end
>> > (but not the beginning), reflecting common spammer usage.
>>
>> This is a nice example of an idea which sounds totally
>> reasonable. In my test (which I did post), though, it made
>> no consistent difference: in some tests it worked better,
>> in others worse. With rules like this we try to code some
>> technique we observe as humans into bogofilter, that is, we
>> want to be more clever than the statistics. It might well
>> work out in some cases; it might also surprise us or change
>> nothing in effect.
>
> "reasonable" isn't why it's in bogofilter. Paul Graham tested and found
> it useful. Greg and I tested and found it useful. That's why it's
> present.
Yes, but the idea comes from some human observation. With
some test results in its favor, it got this special
treatment.
>> We also have no understanding of how different rules play
>> together: do they remain useful if combined? Could be, maybe
>> not. So this test was designed to get as many of those a
>> priori judgements out as seemed reasonable to me (others
>> might go even further, or not quite as far). My result is
>> that we can just as well leave those out.
>
> You don't indicate which special characters you allow and which ones you
> don't allow.
I actually did not change much. I use TOKENBORDER in place
of both TOKENFRONT and TOKENBACK. Comparing TOKENBORDER with
TOKENBACK, I just don't allow ! and ~, but do allow $ (the
latter allows for more general $-tokens than the special
rule). So TOKENBACK does not change a lot. TOKENFRONT does
not allow $ or digits, which I do. I hope I have not missed
anything.
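
To make the difference concrete, here is a rough sketch in Python
(not bogofilter's actual flex lexer). The character sets are
simplified assumptions that only mirror the differences described
above: the real classes contain more than ASCII letters and digits,
and the MID class and the function name make_tokenizer are invented
purely for illustration.

```python
import re

def make_tokenizer(front, back, mid):
    """Return a tokenizer built from three regex character-class bodies.

    A token is one FRONT character, optionally followed by any number
    of MID characters and one closing BACK character.
    """
    pattern = re.compile(rf"[{front}](?:[{mid}]*[{back}])?")
    return pattern.findall

# Hypothetical, simplified character classes (assumptions, ASCII only).
MID = "A-Za-z0-9$.'-"
STOCK_FRONT = "A-Za-z"        # no $ and no digits at the start
STOCK_BACK = "A-Za-z0-9!~"    # ! and ~ accepted at the end, $ not
BORDER = "A-Za-z0-9$"         # one class for both ends: $ and digits, no ! or ~

stock = make_tokenizer(STOCK_FRONT, STOCK_BACK, MID)
border = make_tokenizer(BORDER, BORDER, MID)

sample = "FREE! $100 offer 2for1"
stock_tokens = stock(sample)    # ['FREE!', 'offer', 'for1']
border_tokens = border(sample)  # ['FREE', '$100', 'offer', '2for1']
```

With these assumed classes, the stock variant keeps the trailing !
but produces no token for $100 and cuts the leading digit off 2for1,
while the TOKENBORDER variant does the opposite.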
pi