question on multiple wordlists

Eric Seppanen eds at reric.net
Fri Oct 11 17:51:31 CEST 2002


On Fri, Oct 11, 2002 at 10:30:03AM -0400, David Relson wrote:
> Eric,
> 
> I've been wondering about your plans for multiple word lists and hope you 
> can let us in on them a bit more.

No problem.  All this stuff is negotiable, of course.
 
>  From your comments, I gather that the config file will be used to specify 
> the word lists (names, attributes, etc).  Can you give us some sample lines?

So far, all I've implemented is the ability to pull in additional DB 
files.  To do this, all you'd need to specify is a name, file, and weight, 
like

wordlist systemspam /etc/bogofilter/bigspam.db -0.8
wordlist systemgood /etc/bogofilter/biggood.db 0.8

Though I'm thinking real hard about whether single-line, 
whitespace-separated config lines are good enough.  I'm leaning toward 
committing the config-file stuff I've got; it's isolated from the rest of 
the code, so it shouldn't do any harm if it needs to be replaced later.
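
To make the single-line format concrete, here is a rough Python sketch of 
how a "wordlist NAME FILE WEIGHT" line could be split into fields.  It is 
an illustration only, not bogofilter's parser: the names Wordlist and 
parse_wordlist_line are made up, and skipping "#" comment lines is an 
assumption.

```python
from typing import NamedTuple, Optional


class Wordlist(NamedTuple):
    """One configured word list (hypothetical structure)."""
    name: str
    path: str
    weight: float


def parse_wordlist_line(line: str) -> Optional[Wordlist]:
    """Parse one 'wordlist NAME FILE WEIGHT' line.

    Returns None for blank lines and '#' comments (comment syntax is an
    assumption).  Raises ValueError for anything else that does not fit.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()  # split on any run of whitespace
    if len(fields) != 4 or fields[0] != "wordlist":
        raise ValueError(f"malformed wordlist line: {line!r}")
    _, name, path, weight = fields
    return Wordlist(name, path, float(weight))
```

One limitation shows up immediately: because the line is split on 
whitespace, a file name containing a space cannot be expressed without 
some quoting rule.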

> How will multiple wordlists be maintained?  Currently bogofilter uses 
> switches to maintain goodlist.db and spamlist.db.  Single letter switches 
> don't extend to multiple lists very well.  What are your plans for this?

Actually, I hadn't thought about it too hard.  I'd assume that it'd be 
straightforward to add an option to specify the list to update, like 
--wordlist <filename>
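
As a sketch of what such an option could look like on the command line, 
here is a small Python argparse example.  The program name 
"wordlist-sketch" and the helper build_parser are hypothetical, and 
--wordlist is only the proposal above, not an existing bogofilter switch.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single, proposed --wordlist option."""
    parser = argparse.ArgumentParser(prog="wordlist-sketch")
    parser.add_argument(
        "--wordlist",
        metavar="FILENAME",
        default=None,  # None means "no explicit list was named"
        help="word list file to update (proposed option, not final)",
    )
    return parser


# Parse an explicit argument list rather than the real command line.
args = build_parser().parse_args(["--wordlist", "/etc/bogofilter/bigspam.db"])
print(args.wordlist)
```

Leaving the default as None lets the caller fall back to whatever list it 
would have updated anyway when the option is not given.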

> I think I speak for all of us when I say that we're curious about the 
> direction you're going.

The goals are:

- allow user-specified lists so that new users can start off with someone 
else's spam list, with that "seed" list stored separately from their own 
list.

- allow user-specified lists so that sysadmins can install system-wide 
lists that are available to all users, without interfering with the user's 
own lists.

- allow use of plaintext lists (possibly requiring conversion to db 
format) for whitelisting, blacklisting, ignore-listing.
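
For the plaintext case, the conversion step could be as small as the 
following Python sketch.  Python's standard dbm module stands in for the 
real db format here, so the output is not a file bogofilter could read; 
the function name plaintext_to_db, the "#" comment convention, and the 
stored value b"1" are all assumptions for illustration.

```python
import dbm


def plaintext_to_db(text_path: str, db_path: str) -> int:
    """Copy each entry of a one-per-line text file into a new dbm file.

    Blank lines and '#' comments are skipped (assumed convention).
    Returns the number of distinct entries written.
    """
    seen = set()
    # Flag "n" always creates a new, empty database.
    with open(text_path, encoding="utf-8") as src, dbm.open(db_path, "n") as db:
        for line in src:
            word = line.strip()
            if not word or word.startswith("#"):
                continue
            key = word.encode("utf-8")
            seen.add(key)
            db[key] = b"1"  # placeholder value: presence is what matters
    return len(seen)
```

Keeping the hand-edited text file as the master copy and regenerating the 
db from it means the list stays easy to maintain in an editor.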

In general, I think user-specified lists are intended to be 
hand-maintained, so I don't think we need to worry about implementing the 
equivalent of -S, -N, -u for those lists.

For summary digest subscription: bogofilter-digest-subscribe at aotto.com


