[PATCH] combined wordlist a.k.a. single list

Malcolm Dew-Jones yf110 at victoria.tc.ca
Wed Jun 4 21:21:38 CEST 2003




On Tue, 3 Jun 2003, Jeremy Blosser wrote:
> 
> Our goodlist is a pretty important resource for us.  It took a lot of time
> and effort to create the initial lists, and has taken even more time to
> refine them with user feedback to something we can trust to filter all of
> our mail, especially in a large heterogeneous environment like ours.  On my
> personal accounts at home I keep all the spam and nonspam I receive so I
> can do wordlist rebuilds as I need them, but it'd be foolish to try that
> here due to the volume of mail we see, privacy concerns about storing the
> nonspam notwithstanding.  We can't just recreate our existing goodlist from
> mail we have stored somewhere for that purpose.  We need to keep several
> levels of backups of the goodlist, because it'd be hard to replace if we
> somehow lost it, and our ability to block only spam (and never good mail)
> is pretty tied to it.
> 

$0.02

We use the mail that our users _send_ as the primary source for our
legitimate mail sample.

By definition, what they send is not spam. 

The assumption, which appears to hold for the most part, is that any
mail they receive that resembles what they send is indeed legitimate,
job-related communication.
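
For anyone wanting to try the same idea, here is a minimal sketch, not a
description of our actual setup. It assumes bogofilter is on the PATH and
relies only on its documented -n option, which registers the message read
from standard input as non-spam. The function name register_sent_mail and
the way outgoing messages are collected are hypothetical; how you capture
copies of sent mail depends on your MTA.

```python
import subprocess
from typing import Iterable, List, Sequence

# Assumption: bogofilter is installed and on PATH.
# "-n" registers the message on stdin as non-spam.
DEFAULT_COMMAND = ("bogofilter", "-n")


def register_sent_mail(messages: Iterable[bytes],
                       command: Sequence[str] = DEFAULT_COMMAND) -> List[int]:
    """Feed each outgoing message to the trainer command on stdin.

    Returns the exit status of every invocation, in order. The statuses
    are passed back untouched; consult the bogofilter man page for what
    they mean in registration mode.
    """
    statuses = []
    for raw in messages:
        # One process per message, so each message is registered on its own.
        result = subprocess.run(list(command), input=raw)
        statuses.append(result.returncode)
    return statuses
```

The command is a parameter so the loop can be exercised with a harmless
stand-in before it is pointed at a real wordlist.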





