Using casefolded wordlists

Peter Bishop pgb at adelard.com
Fri May 30 16:24:48 CEST 2003


On 30 May 2003 at 7:09, Greg Louis wrote:

> > From my unscientific sample it appears that ham is affected more than spam
> > and the effect could be to increase false positives until the wordlists get 
> > updated with mixed case words
> 
> That could be so.  I suggested advising people to classify with -Pi but
> train with -PI for a couple of months if they couldn't rebuild their
> training databases; alternatively, one could speed up the process (as I
> did at work, where I can't rebuild) by training, with -PI, on a large
> batch of new messages (roughly equal numbers of spam and nonspam) right
> after doing the upgrade.
> 

Looks like a good strategy.
Do you have any performance measures before and after the changeover?


-- 
Peter Bishop 
pgb at adelard.com
pgb at csr.city.ac.uk





