Re: casefolding
David Relson
relson at osagesoftware.com
Wed May 14 14:26:03 CEST 2003
At 08:14 AM 5/14/03, Boris 'pi' Piwinger wrote:
>David Relson wrote:
>
> > Since 0.12.3, a group of parsing options have been added to bogofilter and
> > bogolexer. They're all toggles that enable/disable capabilities, i.e.
> >
> > "-Pf" for case-folding
> > "-Ph" for tagging of header lines
> > "-Pt" for tokenizing of html tags
> > "-PC" for strict checking (of html comments)
>
>To make full use of it, does that require to rebuild the
>database?
>
>pi
pi,
The usefulness of capital letters has not yet been verified with
bogofilter. Greg is running a test to see if there's a noticeable
performance improvement. I'm sure he'll report when he has definitive results.
To answer your question, for quickest and best results, it would probably
be worth rebuilding. If you don't rebuild, new tokens with capital letters
won't be recognized until enough training has occurred. If you use "-u" for
autoupdating, bogofilter should soon learn about the new tokens.
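As a rough illustration of why a wordlist built with case folding misses mixed-case tokens at first, here is a minimal Python sketch. It is not bogofilter's code: the tokenize() helper, the Counter standing in for the wordlist, and the final update step (a loose stand-in for "-u" autoupdating) are all assumptions made for the example.

```python
from collections import Counter


def tokenize(text, fold_case=True):
    """Hypothetical tokenizer: split on whitespace, optionally lowercase."""
    words = text.split()
    return [w.lower() for w in words] if fold_case else words


# Wordlist trained while case folding was on: only lowercase tokens exist.
wordlist = Counter(tokenize("FREE offer Free money free", fold_case=True))

# A new message tokenized with case folding turned off.
incoming = tokenize("FREE Free free", fold_case=False)

# Tokens that still carry capitals have no entry in the old wordlist.
known = [t for t in incoming if t in wordlist]
unknown = [t for t in incoming if t not in wordlist]

# Loose stand-in for autoupdating: register the message's tokens, so the
# capitalized forms are present the next time they are seen.
wordlist.update(incoming)
```

Before the update, only "free" is found and the capitalized forms are unknown. After it, "FREE" and "Free" have entries of their own, which is the gradual learning described above. Rebuilding from the original mail would give them full counts right away.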
David