bogofilter, procmail, & training bogofilter

Jesse Meyer meyer at btinet.net
Fri Apr 11 01:29:13 CEST 2003


On Wed, Apr 09, 2003 at 09:36:14PM -0700, Rodney D. Myers wrote:
> How do most people get bogofilter trained? cron? manual?

Since I'm a big fan of IMAP (due to being able to synchronise emails 
between my laptop and my desktop via the server), and maildir (since 
each folder is a directory, each mail message is a file), my
configuration for bogofilter is both simplified and more complicated.

The complicated part comes in because I can't train bogofilter via the
client machines, since the server needs to filter.  Therefore, I have 3
folders under IMAP in addition to the INBOX that are directly related to
bogofilter:  Spam, NotSpam, and MaybeSpam.

Procmail on the server first passes the message to bogofilter, which has
the continual learning flag (I believe it's -u, but I'm composing this
while I'm not connected to the server atm).  If bogofilter thinks the
message is spam, procmail sends it to the MaybeSpam folder.  If it doesn't
think it's spam, the message then runs the gauntlet of mailing list filters
(bogofilter, debian-user, etc), and is either sorted to the appropriate
mailing list folder, or ends up in the INBOX.
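
As a rough sketch of that delivery flow (this is not the .procmailrc
posted earlier; the Courier-style folder paths and the List-Id match
are assumptions), the recipes could look something like this:

```
# Assumed Courier-style maildir root; INBOX is the maildir itself.
MAILDIR=$HOME/Maildir
DEFAULT=$MAILDIR/

# Filter through bogofilter: -p passes the message through with an
# X-Bogosity header added, -u updates the word lists with the verdict,
# -e makes bogofilter exit 0 for both spam and non-spam.
:0fw
| bogofilter -u -e -p

# If bogofilter itself failed, requeue rather than deliver unfiltered.
:0e
{ EXITCODE=75 HOST }

# Suspected spam goes to MaybeSpam (trailing slash = maildir delivery).
:0
* ^X-Bogosity: (Spam|Yes)
.MaybeSpam/

# One of the mailing list recipes; the List-Id pattern is a guess.
:0
* ^List-Id:.*bogofilter
.bogofilter/

# Anything left falls through to DEFAULT, i.e. the INBOX.
```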

Now, from the server, it either ends up on the desktop via
Mozilla(win32) Mail's IMAP, or offlineimap synchronizes the server's 
maildir directory with the laptop maildir directory, where I read it
with mutt.

Most of my email reading is with mutt, and I have keybindings set up
so I can quickly mark a false negative by sending it to the "Spam"
folder.  I also have a keybinding set up to mark a false positive by
sending it to the NotSpam folder (this isn't as quick though, since
I haven't seen any false positives in a few weeks - thus I keep
forgetting which key marks a false positive  :).  Taking a quick
peek at .muttrc, "S" moves a message to the Spam folder, "G" moves a
message to the NotSpam folder.
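
Bindings along these lines would do it.  This is a minimal sketch
rather than a copy of the actual .muttrc: it assumes the folders are
reachable as =Spam and =NotSpam under mutt's $folder, and it only
covers the index menu.

```
# "=" expands to mutt's $folder setting; folder names are assumptions.
macro index S "<save-message>=Spam<enter>"    "file as spam (false negative)"
macro index G "<save-message>=NotSpam<enter>" "file as not spam (false positive)"
```

If mutt is set to confirm appends, each macro needs one more <enter>
to answer the prompt.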

Finally, at midnight each night, a cron job digs through the Spam and
NotSpam folders and corrects bogofilter's mistakes.
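
A sketch of what such a cron job could look like follows.  It is not
the actual script: the Courier-style folder paths, the BOGOFILTER
override, and the choice to move corrected NotSpam mail back to the
INBOX are all assumptions.  Because -u has already registered each
message under bogofilter's own verdict, a correction has to undo that
registration as well as add the right one, hence -Ns (unregister as
non-spam, register as spam) and -Sn (the reverse).

```shell
#!/bin/sh
# Nightly correction pass for bogofilter (illustrative sketch only).
# Assumed layout: Courier-style maildir with .Spam and .NotSpam folders.
MAILDIR="${MAILDIR:-$HOME/Maildir}"
# Hypothetical override so the command can be swapped out for testing.
BOGOFILTER="${BOGOFILTER:-bogofilter}"

# retrain FOLDER FLAGS [DEST]
# Feed every message in FOLDER/cur and FOLDER/new to bogofilter with
# FLAGS.  Afterwards move the message to the matching subdirectory of
# DEST, or delete it when no DEST is given, so that it is not trained
# on again the next night.
retrain() {
    folder=$1
    flags=$2
    dest=${3:-}
    for sub in cur new; do
        for msg in "$folder/$sub"/*; do
            [ -f "$msg" ] || continue      # glob matched nothing
            "$BOGOFILTER" "$flags" < "$msg"
            # Exit status 3 means an error (and 127 a missing command):
            # leave the message in place for the next run.
            [ $? -lt 3 ] || continue
            if [ -n "$dest" ]; then
                mv "$msg" "$dest/$sub/"
            else
                rm -f "$msg"
            fi
        done
    done
}

# False negatives: were registered as non-spam by -u, so swap them over.
retrain "$MAILDIR/.Spam" -Ns
# False positives: were registered as spam; swap and return to the INBOX.
retrain "$MAILDIR/.NotSpam" -Sn "$MAILDIR"
```

A crontab entry such as "0 0 * * * $HOME/bin/bogo-retrain.sh" (the
path is hypothetical) would run it at midnight.  Passing an archive
folder as the third argument to the first call would keep the spam
instead of deleting it.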

I am currently just deleting my spam messages, but I am thinking about 
letting another cron job move them to a spam archive folder, in case I
need to rebuild bogofilter's database one day.  Then again, bogofilter
trains relatively quickly for me, and I'd rather not keep some of the
spam I get for moral (and possibly legal) reasons.

The advantages of my system are that it works well in a multi-machine
reading environment, across platforms, and isn't dependent on the email
client.  The disadvantage is that it can be a tad more complicated
to set up, but I believe that the flexibility is worth it.

The tools I use for this setup include:

	bogofilter (of course!)
	procmail
	courier-imap
	offlineimap
	cron
	mutt
	mozilla 1.2 (win32).

I've already posted my .procmailrc here a few days ago, so just search
the mailing list archive for it, and you can ask me a question any time
as well by IM if you have a problem.  Contact information is in the sig.

-- 
        ...crying "Tekeli-li! Tekeli-li!"... ~ HPL
 icq : 34583382              |     === ascii ribbon campaign ===
 msn : dasunt at hotmail.com    |  ()  - against html mail
 yim : tsunad                |  /\  - against proprietary attachments