Importance of dot in TOKEN
relson at osagesoftware.com
Fri Mar 19 07:49:49 EST 2004
On Fri, 19 Mar 2004 13:14:56 +0100
Boris 'pi' Piwinger wrote:
> Since there are only a few test messages, there is
> little to observe. There is a very, very small
> indication that . might help in avoiding fp's. The number of
> fn's seems reduced a bit by *not* using dots. This is a
> surprise; I expected the dot version to clearly outperform
> the much simpler lexer. It does not. So I am going to keep it out.
My guess is that as wordlists grow and become more comprehensive, each form
of special treatment becomes less important. For example, we have
header tagging and URL identification. Removing one or the other
would have some effect, and removing both would have a larger effect. If
we also removed decoding (base64, qp, etc.) or multipart MIME processing,
results would change even more.
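
To make the dot comparison concrete, here is a minimal Python sketch of the
two tokenizing variants. It is only an illustration, not bogofilter's actual
lexer: the two regular expressions, the tokenize() helper, and the sample
text are all assumptions made up for this example.

```python
import re

# Hypothetical token patterns (NOT bogofilter's real lexer rules).
# A token begins and ends with an alphanumeric character; the middle
# may also contain a few punctuation characters.
TOKEN_WITH_DOT = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9.'_-]*[A-Za-z0-9])?")
TOKEN_NO_DOT = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9'_-]*[A-Za-z0-9])?")


def tokenize(text, allow_dot=True):
    """Split text into lowercase tokens, with or without '.' inside tokens."""
    pattern = TOKEN_WITH_DOT if allow_dot else TOKEN_NO_DOT
    return [tok.lower() for tok in pattern.findall(text)]


# Made-up sample text containing a host name and an IP number.
sample = "Visit www.example.com from 192.168.0.1 today."

print(tokenize(sample, allow_dot=True))
print(tokenize(sample, allow_dot=False))
```

With dots allowed, the host name and the IP number each survive as a single
token (www.example.com, 192.168.0.1). Without dots they fall apart into
www, example, com and 192, 168, 0, 1, which are far more common pieces and
so may carry less information individually.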
> With this result in mind it will be interesting to see if IP
> numbers are really useful. I'll keep you posted.
I have found IP numbers to be useful, particularly when using