Training scripts
Matej Cepl
cepl at surfbest.net
Tue Jan 27 23:40:26 CET 2004
On Tuesday 27 of January 2004 10:06, Stroller wrote:
> I've posted this script for training before:
> <http://article.gmane.org/gmane.mail.bogofilter.general/5935>
> See the attachment link at the bottom - I hope you'll maybe find it
> useful with respect to the find command.
I'll check it out.
> I'm pretty new to Bash scripting myself, so I'm having some problems
> reading your scripts. I'm posting not because I'm an expert, but
> because I welcome any discussion & enlightenment on the subject.
> I'm a little unclear why you appear to be calling KMail, for instance,
> and your use of the `formail` command suggests to me you're doing
> something cleverer than I.
I am not calling KMail at all, just getting rid of some
additional email headers put there by bogofilter and KMail.
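To illustrate what stripping such headers amounts to, here is a minimal sketch. It is not the script under discussion: it uses awk as a stand-in for `formail`, and the function name `strip_headers` is hypothetical. It assumes one message per file (as in a maildir), and it matches field names case-sensitively, which real header handling should not.

```sh
# strip_headers PATTERN
# Reads one mail message on stdin and writes it to stdout without the
# header fields whose names match PATTERN (an extended regex without the
# trailing colon). Continuation lines of a removed field are removed too.
# The message body is copied untouched.
strip_headers() {
    awk -v pat="^($1):" '
        body     { print; next }               # past the headers: copy as-is
        /^$/     { body = 1; print; next }     # first empty line ends the headers
        /^[ \t]/ { if (!skip) print; next }    # continuation line follows its field
                 { skip = ($0 ~ pat); if (!skip) print }
    '
}
```

It could be used as `strip_headers 'X-Bogosity' < message > cleaned`. With formail itself, something like `formail -I "X-Bogosity:"` should have the same effect for a single field, but check formail(1) before relying on that.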
> All my script does is train Bogofilter on new messages in spam & ham
> (maildir) folders respectively.
I am doing train-on-error only here, so it is slightly different.
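For readers unfamiliar with the term, train-on-error can be sketched roughly as follows: classify each message first, and register it only when the verdict disagrees with the folder it was filed in. This is a sketch under assumptions, not the actual script: the function name `train_on_error` and the `BOGOFILTER` variable are hypothetical, it assumes bogofilter's usual exit codes (0 spam, 1 ham, 2 unsure, 3 error), it assumes messages were not already auto-registered during classification, and treating "unsure" as an error worth training on is a choice made here.

```sh
# Command used to classify and register; override to test without bogofilter.
BOGOFILTER=${BOGOFILTER:-bogofilter}

# train_on_error KIND FILE
# KIND is "spam" or "ham": what the message really is (e.g. which folder
# it was filed in). FILE is a single message.
train_on_error() {
    kind=$1
    msg=$2
    verdict=0
    # Classify only: the exit code carries the verdict.
    "$BOGOFILTER" < "$msg" >/dev/null || verdict=$?
    case "$kind:$verdict" in
        spam:0|ham:1)  return 0 ;;                    # correct: nothing to learn
        spam:1|spam:2) "$BOGOFILTER" -s < "$msg" ;;   # missed spam: register as spam
        ham:0|ham:2)   "$BOGOFILTER" -n < "$msg" ;;   # wrongly flagged: register as ham
        *) echo "classification failed for $msg" >&2; return 1 ;;
    esac
}
```

Called as `train_on_error spam path/to/message`, it touches the database only for misclassified or unsure messages, which is why a daily run over a handful of messages stays cheap.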
> What I've realised, however, is that if I run my script then move a
> message, say, from my inbox (which is ignored in case it has spam in
> it) to a saved items folder, then subsequent runs of my script will not
> train on that message.
You surely know these two pages by heart, don't you?
http://cr.yp.to/proto/maildir.html
http://www.qmail.org/man/man5/maildir.html
> However I found that using `find... -print0 | xargs` was
> *considerably* faster than `find... -exec bogofilter -s -W -v -I
> \{\} \;`, which calls for Bogofilter to be repeatedly restarted
> with each message. IIRC using that latter method took about 20
> or 40 minutes to build a database based on my modest message
> corpus; with the script the way it is I can move older messages
> around & completely rebuild my database from scratch (by
> removing the old one) in less than 5 minutes.
I am retraining only about 10 messages a day in one run, so speed
is not much of an issue for me.
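For anyone who does need the speed, the difference between the two `find` approaches quoted above can be sketched like this. The function names are hypothetical, and the command to run is passed in as arguments so the sketch makes no claim about which bogofilter options are appropriate. `-print0` and `xargs -0` are widely available extensions rather than strict POSIX.

```sh
# train_per_message DIR COMMAND [ARGS...]
# Starts COMMAND once for every file under DIR, with that file as the
# last argument. Simple, but pays the program start-up cost per message.
train_per_message() {
    find_dir=$1
    shift
    find "$find_dir" -type f -exec "$@" {} \;
}

# train_batched DIR COMMAND [ARGS...]
# Hands the file names to xargs, which appends as many as fit to a single
# COMMAND invocation (it starts another only when the argument list would
# grow too long). NUL separators keep odd file names intact.
train_batched() {
    find_dir=$1
    shift
    find "$find_dir" -type f -print0 | xargs -0 "$@"
}
```

The batched form only works if the command accepts several message files as arguments in one call. For bogofilter that presumably means its bulk mode (`-B`, if memory serves), so check bogofilter(1) for your version.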
Matej
--
Matej Cepl, http://www.ceplovi.cz/matej
GPG Finger: 89EF 4BC6 288A BF43 1BAB 25C3 E09F EF25 D964 84AC
138 Highland Ave. #10, Somerville, Ma 02143, (617) 623-1488
Science is meaningless because it gives no answer to our
question, the only question important to us: ``What shall we do
and how shall we live?''
-- Lev Nikolaevich Tolstoy