Re casefolding

David Relson relson at osagesoftware.com
Wed May 14 14:26:03 CEST 2003


At 08:14 AM 5/14/03, Boris 'pi' Piwinger wrote:

>David Relson wrote:
>
> > Since 0.12.3, a group of parsing options have been added to bogofilter and
> > bogolexer.  They're all toggles that enable/disable capabilities, i.e.
> >
> > "-Pf" for case-folding
> > "-Ph" for tagging of header lines
> > "-Pt" for tokenizing of html tags
> > "-PC" for strict checking (of html comments)
>
>To make full use of it, does that require to rebuild the
>database?
>
>pi

pi,

The usefulness of capital letters has not yet been verified with 
bogofilter.  Greg is running a test to see if there's a noticeable 
performance improvement.  I'm sure he'll report when he has definitive results.

To answer your question, for quickest and best results, it would probably 
be worth rebuilding.  If you don't rebuild, new tokens with capital letters 
won't be recognized until enough training has occurred.  If you use "-u" for 
autoupdating, bogofilter should soon learn about the new tokens.
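
To illustrate why an existing database misses the new tokens, here is a toy 
sketch in Python.  It is not bogofilter's lexer or database code; the 
tokenize() function, the sample messages, and the in-memory word list are 
all made up for the example.

```python
from collections import Counter

def tokenize(text, fold_case):
    """Split on whitespace; lowercase the tokens only when fold_case is True."""
    words = text.split()
    return [w.lower() for w in words] if fold_case else words

# Hypothetical word list trained while case folding was on: all keys are lowercase.
wordlist = Counter(tokenize("FREE offer Click here free", fold_case=True))

# A new message tokenized with case folding switched off.
new_tokens = tokenize("FREE offer Click here", fold_case=False)

known = [t for t in new_tokens if t in wordlist]
unknown = [t for t in new_tokens if t not in wordlist]
print("known:  ", known)    # tokens that were already lowercase
print("unknown:", unknown)  # capitalized tokens the old word list has never seen

# Training on the message (roughly what autoupdating does over time)
# adds the capitalized forms, so they are recognized from then on.
wordlist.update(new_tokens)
still_unknown = [t for t in new_tokens if t not in wordlist]
print("unknown after training:", still_unknown)
```

Rebuilding from the original training mail with case folding off would give 
the capitalized tokens their counts right away instead of waiting for them 
to accumulate.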

David




