Training scripts

Stroller Linux.Luser at myrealbox.com
Tue Jan 27 16:06:33 CET 2004


On Jan 27, 2004, at 6:10 am, Matej Cepl (by way of Matej Cepl 
<cepl at surfbest.net>) wrote:

> May anybody help with writing scripts for using bogofilter?

I've posted this script for training before: 
<http://article.gmane.org/gmane.mail.bogofilter.general/5935>
See the attachment link at the bottom - I hope you'll maybe find it 
useful with respect to the find command.

I'm pretty new to Bash scripting myself, so I'm having some problems 
reading your scripts. I'm posting not because I'm an expert, but 
because I welcome any discussion & enlightenment on the subject.
I'm a little unclear why you appear to be calling KMail, for instance, 
and your use of the `formail` command suggests to me you're doing 
something cleverer than I.

All my script does is train Bogofilter on new messages in spam & ham 
(maildir) folders respectively.
What I've realised, however, is that if I run my script then move a 
message, say, from my inbox (which is ignored in case it has spam in 
it) to a saved items folder, then subsequent runs of my script will not 
train on that message. This is because the modification date on the 
message has not changed, and it is on my TODO list to look at fixing 
that. However, I found that using `find... -print0 | xargs -0` was 
*considerably* faster than `find... -exec bogofilter -s -W -v -I \{\} 
\;`, which restarts Bogofilter once for every single 
message. IIRC using that latter method took about 20 or 40 minutes to 
build a database based on my modest message corpus; with the script the 
way it is I can move older messages around & completely rebuild my 
database from scratch (by removing the old one) in less than 5 minutes.
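
To illustrate the difference between the two approaches, here is a minimal sketch. It is not the script linked above; the function names `batch_train` and `each_train` are made up for this example, and the trainer command is passed in as arguments because the exact bogofilter invocation that accepts several message files at once is an assumption I have left out. Note that `-print0` and `xargs -0` are GNU/BSD extensions.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# batch_train DIR CMD [ARGS...]
# Starts CMD once per batch, with many message files appended as arguments.
batch_train() {
    dir=$1
    shift
    # -print0 / -0 separate names with NUL bytes, so file names containing
    # spaces or newlines survive; xargs packs as many names as fit onto
    # each command line.
    find "$dir" -type f -print0 | xargs -0 "$@"
}

# each_train DIR CMD [ARGS...]
# Starts CMD once per message file, which is the slow pattern.
each_train() {
    dir=$1
    shift
    find "$dir" -type f -exec "$@" {} \;
}
```

Something like `batch_train ~/Maildir/.spam/cur your-trainer --spam` would then start the (placeholder) trainer a handful of times rather than once per message, which is presumably where the time saving comes from.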

Stroller.




