ACME Labs spam wordlist available for use.

David Relson relson at osagesoftware.com
Sat Nov 6 21:53:20 CET 2004


On Sat, 06 Nov 2004 19:56:06 +0100
Boris 'pi' Piwinger wrote:

> Jef Poskanzer <jef at acme.com> wrote:
> 
> >See http://www.acme.com/spamwords/
> 
> What spam is, is highly individual. So I would never use
> such a list.
> 
> pi

pi,

You are, of course, right that the precise definition of spam is
individual.  However, I bet that if you and I compared classifications
for the same message, we'd agree much of the time.  Unfortunately,
that's a different discussion than the acme list, so let's not go into
it.

When using multiple wordlists, bogofilter checks the lowest numbered
list(s) first.  If the token is found, bogofilter uses those ham and
spam counts to compute the token's spam score.  If there's more than one
list with that lowest "order" number, the counts will be combined
additively.  If the token is not found in the lowest numbered list,
bogofilter will then check the next higher numbered list(s).
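
In case it helps, here is a rough Python sketch of that lookup order.
It is not bogofilter's actual code (bogofilter is written in C); the
function name, the (order, counts) pairs and the sample words are all
made up for illustration.

```python
from collections import defaultdict

def lookup(token, wordlists):
    """Return (ham, spam) counts for token, or None if no list has it.

    wordlists is a list of (order, counts) pairs, where counts maps
    a token to its (ham, spam) counts.  Hypothetical layout, chosen
    only to illustrate the lookup order.
    """
    # Group the lists by their "order" number.
    by_order = defaultdict(list)
    for order, counts in wordlists:
        by_order[order].append(counts)

    # Check the lowest order first, then the next higher one, etc.
    for order in sorted(by_order):
        ham = spam = 0
        found = False
        for counts in by_order[order]:
            if token in counts:
                h, s = counts[token]
                ham += h    # lists sharing an order number
                spam += s   # are combined additively
                found = True
        if found:
            return ham, spam   # stop at the first order with a hit
    return None

local = {"meeting": (12, 0)}
shared = {"meeting": (3, 7), "mortgage": (2, 40)}

# Order 1 has "meeting", so the order 2 counts are never consulted.
print(lookup("meeting", [(1, local), (2, shared)]))
# "mortgage" is missing from order 1, so order 2 answers.
print(lookup("mortgage", [(1, local), (2, shared)]))
```

If both lists were given the same order number, the counts for
"meeting" would be added together instead.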

Jef has carefully configured his list of wordlists.  By using 1 for the
local list and 2 for the acme list, the local list is the primary list
and the acme list is the secondary list.  The acme list is used for
words not found in the local list.  You can think of it as "my local
list doesn't know this word, I'm going to ask a friend".

Using his list is clearly not necessary, but I think it will be useful
for some people.  I leave it for each user to decide -- which is the
same thing Jef is doing.

Regards,

David


