testing partial wordlists

Tom Anderson tanderso at oac-design.com
Sun Feb 6 00:56:31 CET 2005


On Sat, 2005-02-05 at 17:49, David Relson wrote:
> Indeed!!  It'd be interesting to know if there're holiday effects.  I've
> got no info one way or t'other. 
> 
> Now I've got some actual numbers and don't have to imagine to decide
> what to keep or what to pitch.  Given the numbers I got, I've removed
> hapaxes and tokens older than a year.
> 
> I just need to remain vigilant for a while -- in case I've removed too
> much!

What commands would you use to trim a wordlist by X months?  And to
remove hapaxes?  It'd be nice to have a FAQ entry for this.  Or better
yet, a maintenance script that can be run from cron, say once a month,
to do the trimming, compact the database, and make a backup.
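
To make the request concrete, here is a rough sketch of the trimming
step only.  It assumes the text dump from "bogoutil -d" has one token
per line in the form "token spamcount goodcount YYYYMMDD", and that
entries starting with a dot (such as .MSG_COUNT) are metadata that
must be kept.  The function names are made up for illustration, a
month is approximated as 30 days, and a date of 0 is treated as
"unknown" and kept.  None of this is an official bogofilter tool.

```python
import sys
from datetime import date, timedelta


def cutoff_stamp(today, months):
    """Return a YYYYMMDD integer roughly `months` months before `today`."""
    # Approximation: a month is treated as 30 days.
    return int((today - timedelta(days=30 * months)).strftime("%Y%m%d"))


def keep_line(line, cutoff, drop_hapaxes=True):
    """Decide whether one dump line survives trimming."""
    # Split from the right so the token is whatever precedes the 3 numbers.
    fields = line.rsplit(None, 3)
    if len(fields) != 4:
        return True  # unrecognized layout: keep rather than lose data
    token, spam, good, stamp = fields
    if token.startswith("."):
        return True  # assumed metadata entry, never trimmed
    try:
        spam, good, stamp = int(spam), int(good), int(stamp)
    except ValueError:
        return True  # non-numeric fields: keep the line untouched
    if drop_hapaxes and spam + good <= 1:
        return False  # hapax: seen at most once in total
    if stamp != 0 and stamp < cutoff:
        return False  # last seen before the cutoff date
    return True


def main(months):
    # Pass undecodable bytes through unchanged instead of failing on them.
    for stream in (sys.stdin, sys.stdout):
        stream.reconfigure(errors="surrogateescape")
    cutoff = cutoff_stamp(date.today(), months)
    for line in sys.stdin:
        if keep_line(line, cutoff):
            sys.stdout.write(line)


# Runs as a filter only when a month count is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(int(sys.argv[1]))
```

Saved as, say, trim_dump.py (a hypothetical name), it would sit
between a dump and a reload into a new file, along the lines of
"bogoutil -d wordlist.db | python trim_dump.py 12 | bogoutil -l
wordlist.new.db".  The backup and the swap of the new file into place
are not covered here.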

Tom




