Markup.
David Relson
relson at osagesoftware.com
Sat May 10 16:09:28 CEST 2003
At 08:48 AM 5/10/03, you wrote:
>On 9 May 2003 at 13:26, David Relson wrote:
>
> > Like you, I wouldn't worry too much about it. The benefits seem pretty
> > clear and there's always the occasional message that's virtually
> > impossible to classify - even for a human.
>
>A quick calculation suggests that false negatives dropped from 14% of spam
>to ~13% of spam
>- a useful improvement but not huge
>
>It might be the case that removing casefolding would have a greater effect.
>Joerg Over did some tests on this, but it is hard to judge performance
>as the tests were only on 33 spams. At face value the results were:
>
>Robinson:
> 3% false negatives drops to 0%
>
>Fisher:
> 90% false negatives drops to 6%
>
>Clearly there are issues about Fisher performance
>(was the database properly set up?)
>but it looks like removing casefolding results in
>a bigger reduction in false negatives than checking
>markup tokens does (though more tests are needed).
>
>Would it be possible to have a switch option to disable
>case folding?
Peter,
If you want to do a significant test, I'll create a patch that allows
case-folding to be disabled.
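For illustration only, the switch could behave roughly like the sketch below. This is Python rather than bogofilter's actual C lexer, and the `tokenize` function, its `fold_case` parameter, and the token pattern are all hypothetical names made up to show the idea.

```python
import re

# Hypothetical token pattern: runs of letters, digits and apostrophes.
# Bogofilter's real lexer is more elaborate than this.
TOKEN_RE = re.compile(r"[A-Za-z0-9']+")


def tokenize(text, fold_case=True):
    """Split text into tokens, lower-casing them only if fold_case is set."""
    tokens = TOKEN_RE.findall(text)
    if fold_case:
        tokens = [t.lower() for t in tokens]
    return tokens


sample = "FREE offer, get your free sample"
print(tokenize(sample))                   # case folding on (assumed default)
print(tokenize(sample, fold_case=False))  # case folding disabled
```

With folding disabled, "FREE" and "free" are kept as separate tokens, which is the difference a test would be measuring.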
Let me know.
David