randomtrain - script to train on errors

Greg Louis glouis at dynamicro.on.ca
Sun Dec 1 21:53:30 CET 2002


On 20021201 (Sun) at 1504:59 -0500, David Relson wrote:

> There are several benefits to doing it this way: the config file explicitly 
> sets the algorithm, ham_cutoff, and labels used for the terse 
> header.  Adding '-c randomtrain.cf' to the bogofilter command line ensures 
> the setup when bogofilter runs.   Specifying tristate output 
> (Spam/Ham/Unsure) means the script will _know_ what to expect from 
> bogofilter.  Without the '-c', bogofilter will read the default config 
> files.  Any difference between those files and what is being tested 
> will result in training with the wrong parameters.

Sure there are benefits.  I'm operating with bogofilter-0.8.0 cut down
so it _only_ does Robinson-Fisher, _only_ has the options I use, and
_only_ writes the output my way.  I don't have a config file anywhere.
I can read that code without wandering all over the bogofilter
directory, and it works for me; that's why I took the trouble to make
all those changes.  But I don't want to be Procrustean about it; there
may be no other bogofilter user on the planet who likes things the way
I like them.  That's one of the reasons people implement config files.

(I probably won't go to the trouble of cutting down 0.9.1, though it
would be nice to have the actual _functional_ improvements like lexer
updates and such; I may backport those to my private version instead.)

> Of course, if you _still_ don't think it's a good idea, I'll use the 
> patches for personal use.

T'other way round: by all means implement the patch, it looks like it
will be convenient for lots of people.  If it gets in my way, I don't
have to use it, after all.

> +cat <<EOF > randomtrain.cf

I notice you create this wherever we're running randomtrain from, but
you don't clean it up afterward, and the name is such that multiple
instances get clobbersome (not that that matters much, since they all
write the same thing).  I'd make it .cf$pid and erase it along with the
other $pid files if I were you; or else put it in whatever the standard
.cf place is.
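
A rough sketch of the first option, assuming the script is plain sh and
that nothing else in it traps EXIT.  The variable name cf, the cleanup
function and the placeholder comment written into the file are mine, not
taken from the patch; the real settings would go where the placeholder
is.

```sh
#!/bin/sh
# Sketch only: per-instance config file that is removed when the script ends.
# The file contents below are a placeholder, not the settings from the patch.

cf="randomtrain.cf$$"           # $$ is this shell's PID, so instances don't collide

cleanup() {
    rm -f "$cf"                 # -f: no complaint if the file is already gone
}
trap cleanup EXIT               # runs on normal exit and on early exit
trap 'exit 1' HUP INT TERM      # turn these signals into an exit so cleanup runs

cat <<EOF > "$cf"
# placeholder: the settings from the patch would go here
EOF

# bogofilter -c "$cf" ...       # hand the per-instance file to bogofilter
```

With that in place two concurrent runs each write their own file, and
nothing is left behind in the working directory afterward.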

-- 
| G r e g  L o u i s          | gpg public key:      |
|   http://www.bgl.nu/~glouis |   finger greg at bgl.nu |



