testing partial wordlists
Tom Anderson
tanderso at oac-design.com
Sun Feb 6 00:56:31 CET 2005
On Sat, 2005-02-05 at 17:49, David Relson wrote:
> Indeed!! It'd be interesting to know if there're holiday effects. I've
> got no info one way or t'other.
>
> Now I've got some actual numbers and don't have to imagine to decide
> what to keep or what to pitch. Given the numbers I got, I've removed
> hapaxes and tokens older than a year.
>
> I just need to remain vigilant for a while -- in case I've removed too
> much!
What commands would you use to trim a wordlist by X months? And to remove
hapaxes? It'd be nice to have a FAQ entry for this. Or better yet, a
maintenance script that can be run from cron, say once a month, to do
the trimming, compact the database, and make a backup.
Tom