How do I filter out spam that turns up on mailing lists?

Nigel Henry cave.dnb at tiscali.fr
Tue Jan 8 20:22:34 CET 2008


On Tuesday 08 January 2008 00:14, David Relson wrote:
> On Mon, 7 Jan 2008 21:35:35 +0100

> > > Nigel Henry wrote:
> > > > Cutting to the chase. There has just been another batch of spam
> > > > getting through Debian mail filters, and has turned up in my
> > > > Debian mailbox, so it appears that bogofilter was not able to
> > > > detect the Debian list spam when it processed all the incoming
> > > > mail.
> > > >
> > > > Any suggestions on how to deal with mailing list spam?

> > Nigel.
>
> Hello Nigel,

> What you _could_ do is create an ignore list with headers from the
> debian list.  This would eliminate those tokens from the scoring
> effectively telling bogofilter to score using only body tokens.
>
> HTH,
>
> David

Hi David. Thanks for your reply. Ironically, the Debian list has since fixed
the spam problem: someone had accidentally disabled the spam filter.

All the same, your suggestion for setting up an ignore list sounds like a good 
idea, and would enable me to deal with any future problems.

I have no /etc/bogofilter.cf, only an /etc/bogofilter.cf.example, so I am 
using defaults. The "WORDLIST" part looks like this at present.

#### WORDLIST: define additional word lists
#
# char type: 'r' (regular) or 'i' (ignore)
# char *name: name of list, e.g. "system", "user", "ignore"
# char *path: absolute path to file or
#      file name (relative to bogofilter_dir)
# int  order - once found, skip higher numbered lists
#
##wordlist i,ignore,~/ignorelist.db,1
##wordlist r,wordlist,~/wordlist.db,2

I can see from this that I need to create an ignorelist.db in
my /home/user/.bogofilter directory and give it order 1, and give the
existing wordlist.db order 2.

I also see that the paths on the two wordlist lines above aren't right for my
setup, so presumably they need to look like this:
wordlist i,ignore,~/.bogofilter/ignorelist.db,1
wordlist r,wordlist,~/.bogofilter/wordlist.db,2

Having uncommented both of course.

So far so good, if I'm correct with the above stuff.

Moving on to your FAQ, I'm a bit stuck, and not sure how to proceed with the
stuff below.
<FAQ>
Can I tell bogofilter to ignore certain tokens?

Through the use of an ignore list, bogofilter will ignore the listed tokens 
when scoring the message.

Example:
 
    wordlist I,ignore,~/ignorelist.db,7
    wordlist R,system,/var/spool/bogofilter/wordlist.db,8
 

Because ignorelist.db has a lower index (7) than wordlist.db (8), bogofilter
will stop looking when it finds a token in ignorelist.db.

Note: Technically, bogofilter gives a score of ROBX to the tokens and expects 
the min_dev parameter to drop them from the scoring.
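
To make that note concrete, here is a small arithmetic sketch. It assumes the
values robx=0.52 and min_dev=0.375 (believed to be the shipped defaults, but
check your own configuration) and assumes that a token is left out of the
scoring when its score lies closer than min_dev to the neutral 0.5.

```shell
#!/bin/sh
# Assumed values; check bogofilter.cf.example for what your build uses.
robx=0.52
min_dev=0.375

# An ignored token is scored as robx. It drops out of the calculation
# if its distance from the neutral score 0.5 is below min_dev.
verdict=$(awk -v s="$robx" -v d="$min_dev" 'BEGIN {
    dev = s - 0.5
    if (dev < 0) dev = -dev
    r = (dev < d) ? "dropped" : "scored"
    print r
}')

echo "token score $robx, min_dev $min_dev: $verdict"
```

With these numbers the distance is 0.02, well under 0.375, so the token is
dropped. If min_dev were set below 0.02, ignored tokens would start to count.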

< This is where I'm confused>

There are two main methods for building/maintaining an ignore list.

First, a text file can be created and maintained using any text editor. 
Bogoutil can convert the text file to database format, e.g. "bogoutil -l 
ignorelist.db < ignorelist.txt".

Alternatively, echo ... | bogoutil ... can be used to add a single token, for 
example "ignore.me", as in:
 
  echo ignore.me | bogoutil -l ~/ignorelist.db
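
As a rough sketch of the text-file method applied to list headers, the script
below filters a token dump down to header tokens and loads the result with
bogoutil. The sample tokens, the "head:" and "rcvd:" prefixes, and every file
name are assumptions made up for illustration, not taken from a real list
message. The bogoutil step is skipped if bogoutil is not installed.

```shell
#!/bin/sh
# Sketch only: all paths and sample tokens below are hypothetical.
set -eu

workdir=$(mktemp -d)
tokens="$workdir/tokens.txt"
ignore_txt="$workdir/ignorelist.txt"

# Stand-in for real lexer output. With bogofilter installed, something like
#   bogolexer -p < saved-list-message > "$tokens"
# should give one token per line (assumption: header tokens carry
# prefixes such as "head:" and "rcvd:").
cat > "$tokens" <<'EOF'
head:lists.example.org
head:example-user
rcvd:mx.example.org
subj:bargain
cheap
head:lists.example.org
EOF

# Keep only the list-specific header tokens, one per line, de-duplicated.
grep -E '^(head|rcvd):' "$tokens" | sort -u > "$ignore_txt"

# Load the text file into a database, as in the FAQ's first method.
if command -v bogoutil >/dev/null 2>&1; then
    bogoutil -l "$workdir/ignorelist.db" < "$ignore_txt"
fi

echo "ignore list candidates written to $ignore_txt"
```

The script works in a temporary directory so that nothing under ~/.bogofilter
is touched. Once the token list looks right, the same bogoutil -l command
could be pointed at the ignorelist.db path named on the wordlist line.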

Sorry for being an ignoramus, but a little guidance on populating the
ignorelist.db would be helpful.

Nigel.

btw. As the Debian list has resolved the spam problem, have you got any links 
to crappy mailing lists that allow a lot of spam through, so I could check 
out if I can remove mailing list spam with bogofilter?





