Training scripts

David Relson relson at osagesoftware.com
Tue Jan 27 13:55:32 CET 2004


On Tue, 27 Jan 2004 01:10:24 -0500
Matej Cepl (by way of Matej Cepl) wrote:

> May anybody help with writing scripts for using bogofilter? 
> I have two problems, both with find command (using three-state 
> filtering with bogofilter trained on error to exhaustion with 
> bogomintrain.pl scripts):
> 
> 1) This is my script going through _ham and _spam maildir folders 
>    and retraining bogofilter as necessary:
> 
> 	#!/bin/sh
> 	MAILDIR=$HOME/.mail/
> 
> 	for msg in $MAILDIR/_spam/*/* ; do
> 	   formail -I X-Bogosity -I X-KMail -s bogofilter -vs < $msg
> 	   STR=$(basename $msg)
> 	   mv $msg $MAILDIR/_junk/new/${STR%*:2,S} \
> 	      2>&1 >/dev/null
> 	done
> 
> 	for msg in $MAILDIR/_ham/*/* ; do
> 	   formail -I X-Bogosity -I X-KMail -s bogofilter -vn < $msg
> 	   STR=$(basename $msg)
> 	   mv $msg $MAILDIR/inbox/new/${STR%*:2,S} \
> 	      2>&1 >/dev/null
> 	done
> 

I notice "-s" used in both of the bogofilter commands.  Likely you want
"-n" in one of them :-)

>    I would love to replace for cycles with find command, but I do 
>    not know how to pull it off:
>    
>       a) to run three commands from the same find command and
>          using the same {} variable twice, and
>       b) does anybody know how to avoid using STR variable (i.e.,
>          to use basename directly in ${...%..} bash expression)? 
>          ${$(basename $msg)%*:2,S} doesn't work.

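Define a shell function and call it once for each file that find prints:

doit()
{
  msg="$1"
  command1
  command2
}

# find writes one path per line; read hands each path to doit as $1
find dir -type f | while IFS= read -r msg; do
  doit "$msg"
done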
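
To make that a bit more concrete, here is a rough sketch of how the function approach might look for the retraining script. The names strip_flags, process_dir and TRAIN_CMD are made up for illustration and are not part of bogofilter; TRAIN_CMD stands in for whatever formail/bogofilter command you train with and defaults to a no-op. On question b): as far as I know, neither sh nor bash lets you nest a command substitution inside ${...%...}, so an intermediate variable is still needed, but ${1##*/} can replace the call to basename.

```sh
#!/bin/sh
# Sketch only. TRAIN_CMD is a hypothetical hook, e.g.
#   TRAIN_CMD="formail -I X-Bogosity -I X-KMail -s bogofilter -vs"
# It defaults to "true" so the script can be tried without bogofilter.

# Print the file name of $1 without its directory and without a
# trailing maildir info suffix such as ":2,S".
strip_flags() {
  name=${1##*/}                  # drop the directory part (like basename)
  printf '%s\n' "${name%:2,*}"   # drop ":2," and any flags after it
}

# Train on one message, then move it into the destination folder.
# The message is left in place if the training command fails.
doit() {
  msg=$1
  dest=$2
  ${TRAIN_CMD:-true} < "$msg" || return 1
  mv "$msg" "$dest/$(strip_flags "$msg")"
}

# Walk a folder and hand every regular file in it to doit.
# Assumes file names contain no newlines (true for maildir names).
process_dir() {
  src=$1
  dest=$2
  find "$src" -type f | while IFS= read -r msg; do
    doit "$msg" "$dest"
  done
}
```

With these defined, something like process_dir "$MAILDIR/_spam" "$MAILDIR/_junk/new" would cover the first loop, and a second call with a different TRAIN_CMD would cover the ham folder. All variables are quoted so that paths with spaces survive.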

> 2) And other training on find is this script, which collects 
>    corpuses of ham and spam for training from alive data:

I'm going to skip this one because I'm pressed for time right now :-(
