Training scripts

Matej Cepl cepl at surfbest.net
Tue Jan 27 07:10:24 CET 2004


Could anybody help me with writing scripts that use bogofilter? 
I have two problems, both with the find command (I use three-state 
filtering with bogofilter, trained on error to exhaustion with 
the bogominitrain.pl script):

1) This is my script going through _ham and _spam maildir folders 
   and retraining bogofilter as necessary:

	#!/bin/sh
	MAILDIR=$HOME/.mail/

	# Train each message in _spam as spam, then move it to _junk.
	for msg in "$MAILDIR"/_spam/*/* ; do
	   formail -I X-Bogosity -I X-KMail -s bogofilter -vs < "$msg"
	   STR=$(basename "$msg")
	   mv "$msg" "$MAILDIR/_junk/new/${STR%*:2,S}" \
	      >/dev/null 2>&1
	done

	# Train each message in _ham as ham, then move it to inbox.
	for msg in "$MAILDIR"/_ham/*/* ; do
	   formail -I X-Bogosity -I X-KMail -s bogofilter -vn < "$msg"
	   STR=$(basename "$msg")
	   mv "$msg" "$MAILDIR/inbox/new/${STR%*:2,S}" \
	      >/dev/null 2>&1
	done

   I would love to replace the for loops with a find command, but I 
   do not know how to pull it off:
   
      a) How do I run three commands from the same find command,
         using the same {} placeholder twice?
      b) Does anybody know how to avoid the STR variable (i.e.,
         to use basename directly in a ${...%...} bash expression)? 
         ${$(basename $msg)%*:2,S} doesn't work.
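
   One possible approach to both (a) and (b), offered as a sketch
   rather than a tested recipe: let find start a small sh -c script
   for each message. Inside that script the path is an ordinary
   positional parameter, so it can be used as often as needed, and
   basename accepts a suffix operand, which makes STR unnecessary.
   The function name retrain_dir and its three arguments are made up
   for illustration. Unlike the loops above, find descends the whole
   directory tree, and mv runs only if formail succeeded.

```shell
#!/bin/sh
# Sketch: retrain on every message under $1 and move it to $2/new.
# $3 is the bogofilter flag (-s for spam, -n for ham), as in the loops.
retrain_dir() {
    find "$1" -type f -exec sh -c '
        dest=$1 flag=$2 msg=$3
        # Train on the message; move it only if that succeeded.
        # basename with a suffix operand strips a trailing ":2,S".
        formail -I X-Bogosity -I X-KMail -s bogofilter -v "$flag" < "$msg" &&
        mv "$msg" "$dest/new/$(basename "$msg" ":2,S")"
    ' sh "$2" "$3" {} \;
}

# Assumed usage, mirroring the two loops:
# retrain_dir "$HOME/.mail/_spam" "$HOME/.mail/_junk" -s
# retrain_dir "$HOME/.mail/_ham"  "$HOME/.mail/inbox" -n
```

   The arguments after the script text become $0, $1, $2 and $3 of
   the inner shell, so {} is written only once on the find command
   line and reused inside as "$msg".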

2) The other script that relies on find is this one, which collects 
   corpora of ham and spam for training from live data:


   #!/bin/sh
   TMPBOX=$HOME/mbox.tmp

   cat /dev/null > $TMPBOX
   find $HOME/Maildir/ $HOME/.mail/ -type f -name 10\* \
       \! -iregex $HOME/.\*/_junk.\* \
       -exec cat '{}' >> $TMPBOX \;
   formail -I "X-Bogosity:" -I "X-KMail" -ds < $TMPBOX >> $HOME/ham

   cat /dev/null > $TMPBOX
   find $HOME/.mail/_junk -type f -name 10\* \
       -exec cat '{}' >> $TMPBOX \;
   formail -I "X-Bogosity:" -I "X-KMail" -ds < $TMPBOX >> $HOME/spam

   rm -f $TMPBOX >/dev/null 2>&1
   unset TMPBOX

   Obviously, what I would love to achieve is to get rid of 
   $TMPBOX and run formail immediately in find -exec. When trying 
   this:

   #!/bin/sh
   find $HOME/.mail/_junk -type f -name 10\* \
       -exec formail -I "X-Bogosity:" -I "X-KMail" -ds \
         < '{}' >> $HOME/spam \;

   I get the error ``./bogocorpus: line 5: {}: neither file nor 
   directory'' (translated into English from the localized error 
   message). Any thoughts on this? 
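
   A likely explanation, with a sketch of one way around it: the
   shell that runs the script handles < '{}' before find ever starts,
   so it looks for a file literally named {}. Moving the redirection
   into a sh -c script lets an inner shell open each file instead.
   collect_corpus is a hypothetical helper name; the formail options
   are copied from the script above.

```shell
#!/bin/sh
# Sketch: append every message under $1, minus the X-Bogosity and
# X-KMail headers, to the mbox file $2 without a temporary file.
collect_corpus() {
    find "$1" -type f -name '10*' -exec sh -c '
        # This "<" is evaluated by the inner shell, once per file,
        # so "$1" is the path that find substituted for {}.
        formail -I "X-Bogosity:" -I "X-KMail" -ds < "$1"
    ' sh {} \; >> "$2"
}

# Assumed usage, matching the paths in the script above:
# collect_corpus "$HOME/.mail/_junk" "$HOME/spam"
```

   The output redirection can stay outside: everything find and its
   child processes write to standard output ends up in the one file.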

Thanks a lot,

   Matej

-- 
Matej Cepl, http://www.ceplovi.cz/matej
GPG Finger: 89EF 4BC6 288A BF43 1BAB  25C3 E09F EF25 D964 84AC
138 Highland Ave. #10, Somerville, Ma 02143, (617) 623-1488
 
In political activity men sail a boundless and bottomless sea;
there is neither harbor for shelter nor floor for anchorage,
neither starting point nor appointed destination.
   -- Michael Oakeshott: Rationalism in Politics




