what happens if I discard tokens that occur only once?

Chris Fortune cfortune at telus.net
Fri Jun 3 05:47:58 CEST 2005


bogoutil lets you discard tokens from the database whose occurrence count is <= a given number.

What's the effect of discarding tokens that occur only once?  My assumption is that this would make the wordlist much lighter
without affecting classification much.  What are your findings, and the reasons behind them?  FYI, my wordlist is 20,000 ham + 23,000 spam.
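
To make the question concrete, here is a minimal Python sketch of the kind of pruning I mean.  It is not how bogoutil works
internally: the in-memory wordlist, the sample tokens and counts, and the prune() helper are all made up for illustration,
and I am assuming a token's "occurrences" means its spam count plus its ham count.

```python
# Hypothetical wordlist: token -> (spam_count, ham_count).
# The tokens and counts are invented for illustration only.
wordlist = {
    "viagra": (120, 0),
    "meeting": (3, 85),
    "xq7zt": (1, 0),      # seen once, in spam
    "fortnight": (0, 1),  # seen once, in ham
}


def prune(wordlist, threshold):
    """Return a copy keeping only tokens whose total count exceeds threshold.

    Assumption: the total is spam_count + ham_count.
    """
    return {
        token: counts
        for token, counts in wordlist.items()
        if sum(counts) > threshold
    }


# Discard tokens that occur only once (total count <= 1).
pruned = prune(wordlist, 1)
print(len(wordlist), "tokens before,", len(pruned), "after")
```

In a real wordlist I would expect the once-only tokens to be a large share of the entries, which is why the size savings look
tempting; what I can't judge is how much they contribute to scoring.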



