Clean the database from non-spam mails?

Bill Wohler wohler at newt.com
Tue Dec 2 20:01:20 CET 2003


Johannes Klug <derjoi at gmx.net> writes:

> Btw., why did you decide to use only one file?

The authors can probably cite more reasons, but the two I can think of
are "smaller and faster." Note that words often appear in both the
spam and ham wordlists:

1. The result is smaller since the word (and the timestamp) is only
   mentioned once.

2. The number of lookups is cut in half since you only have to do one
   lookup instead of two for a given word.
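
To make the two points concrete, here is a rough sketch in Python. It is
not bogofilter's actual storage format; the dictionaries, the counts, the
date-style timestamps, and the function names are all made up for
illustration. It only assumes that each wordlist entry carries a count
and a timestamp, as described above.

```python
# Hypothetical layouts for illustration only; not bogofilter's real format.

# Two-file layout: each list maps word -> (count, timestamp).
spam_list = {"free": (40, 20031202), "meeting": (1, 20031201)}
ham_list = {"free": (3, 20031130), "meeting": (25, 20031202)}


def lookup_separate(word):
    """Fetch both counts with two lookups, one per wordlist."""
    spam_count = spam_list.get(word, (0, 0))[0]
    ham_count = ham_list.get(word, (0, 0))[0]
    return spam_count, ham_count


# Single-file layout: word -> (spam_count, ham_count, timestamp).
# The word and its timestamp are stored once instead of twice.
combined = {"free": (40, 3, 20031202), "meeting": (1, 25, 20031202)}


def lookup_combined(word):
    """Fetch both counts with a single lookup."""
    spam_count, ham_count, _timestamp = combined.get(word, (0, 0, 0))
    return spam_count, ham_count


if __name__ == "__main__":
    for w in ("free", "meeting", "unseen"):
        print(w, lookup_separate(w), lookup_combined(w))
```

Both functions return the same counts, but the combined layout holds two
entries where the separate layout holds four, and each word costs one
lookup rather than two.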

-- 
Bill Wohler <wohler at newt.com>  http://www.newt.com/wohler/  GnuPG ID:610BD9AD
Maintainer of comp.mail.mh FAQ and MH-E. Vote Libertarian!
If you're passed on the right, you're in the wrong lane.




More information about the Bogofilter mailing list