bogolearn

Jesse Meyer meyer at btinet.net
Sun Apr 6 04:15:50 CEST 2003


On Sat, Apr 05, 2003 at 02:21:15PM -0500, Kevin McKinley wrote:
> I found this "bogolearn" script at O'Reilly. (I changed "GOOD" to "HAM"). I
> offer it here for comment, suggested improvements, or whatever.
>
> [ snip script ]

Heh, when I started using bogofilter a few weeks ago, I created a
script with the exact same name and similar actions.  For comparison's
sake, I'll append the script to the bottom of this email, along with the
.muttrc rules I use with it.  My script isn't verbose, since it runs
from a cron job each night.

Oh, btw, using this script on non-maildir directories is a really,
really bad thing, since it deletes every file it processes.  A check to
make sure each file is a maildir message first would be a very good
idea...
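
As a rough sketch of such a check (my assumption of how it could look,
not something either script does today): a maildir folder always holds
three subdirectories named cur, new and tmp, so the script could refuse
to run unless the directory it is about to empty is a "cur" sitting next
to "new" and "tmp".  The helper name is_maildir_cur is hypothetical, and
it checks the directory layout rather than each individual file.

```shell
#!/bin/sh
# is_maildir_cur is a hypothetical helper, not part of bogofilter.
# It succeeds only if $1 ends in /cur and its parent directory also
# contains the new/ and tmp/ subdirectories that every maildir has.
is_maildir_cur() {
	case $1 in
		*/cur) ;;            # path must end in /cur
		*) return 1 ;;
	esac
	parent=${1%/cur}         # strip the trailing /cur
	[ -d "$parent/cur" ] && [ -d "$parent/new" ] && [ -d "$parent/tmp" ]
}
```

In bogolearn it would go right before each cd, something like
is_maildir_cur "$SPAM" || exit 1, so a mistyped path stops the script
before anything is removed.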

Astute readers will notice that the spam and ham directories in the
bogolearn script and in the .muttrc are different - it's because I use
offlineimap to synchronize email between the server and desktop, and it
translates the mailbox names.

Looking at O'Reilly's bogofilter script, I see that I've fallen
into the bad habit of feeding a file to a command by doing "cat file |
command" instead of "command < file".
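
To make the difference concrete, here is a minimal sketch.  The helper
name "feed" is made up for illustration; it runs whatever command it is
given with the file attached to standard input by redirection, so no
extra cat process is started.

```shell
#!/bin/sh
# feed is a hypothetical helper: feed FILE COMMAND [ARGS...]
# runs COMMAND with FILE on its standard input, using a redirect
# instead of "cat FILE | COMMAND".
feed() {
	file=$1
	shift                    # what remains in "$@" is the command
	"$@" < "$file"
}
```

With that in place, the loop body "$CAT $file | $BOGO -S" could be
written as feed "$file" $BOGO -S, or simply $BOGO -S < "$file".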

-------------------- bogolearn script --------------------------------
#!/bin/sh
# small script written by Jesse Meyer, Mar 25, 2003
# this should "teach" bogofilter based on messages that 
# were sorted to Spam and NotSpam

SPAM=/home/dasunt/Maildir/.Spam/cur
NOSPAM=/home/dasunt/Maildir/.NotSpam/cur

BOGO=/usr/bin/bogofilter
CAT=/bin/cat
LS=/bin/ls
RM=/bin/rm

# remark false ham to spam & delete
# (bail out if the cd fails, so rm never runs in the wrong directory)
cd "$SPAM" || exit 1
for file in `$LS`
do
	$CAT "$file" | $BOGO -S
	$RM "$file"
done

cd "$NOSPAM" || exit 1

# remark false spam to ham & delete
for file in `$LS`
do
	$CAT "$file" | $BOGO -N
	$RM "$file"
done
----------------------- .muttrc snippet ------------------------------
# bogofilter key bindings for continual training  {{{

# Let's bind S to save messages in the Spam Folder, and
# G to save messages in the NotSpam Folder
 
# Simple way, but...

macro	index	S	s=Spam\rY
macro	pager	S	s=Spam\rY

macro	index	G	C=NotSpam\rY
macro	pager	G	C=NotSpam\rY

# macro	index	G	s=INBOX\rY
# macro	pager	G	s=INBOX\rY

# }}}

-- 
        ...crying "Tekeli-li! Tekeli-li!"... ~ HPL
 icq : 34583382              |     === ascii ribbon campaign ===
 msn : dasunt at hotmail.com    |  ()  - against html mail
 yim : tsunad                |  /\  - against proprietary attachments



