New new script to train bogofilter

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Fri Jul 4 16:24:45 CEST 2003


David Relson wrote:

>>Actually, I don't know why randomtrain needs more messages.
>>But I don't understand that script.
> 
> My experiments confirm that randomtrain does more training than does 
> bogotrain.pl.  It would be interesting to know why.  bogotrain.pl does an 
> orderly test ham, then spam, then ham, then spam, ... 

The reason is that I need some order. If I did all the spam
first and then all the ham, I'd expect far more errors. This
way I learn from spam and ham "at the same time".
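
To make the ordering concrete, here is a minimal sketch in Python of
the alternating ham/spam order described above. It is not the code
from bogotrain.pl; the function name interleave and the sample
message names are made up for illustration.

```python
from itertools import zip_longest


def interleave(ham, spam):
    """Yield (label, message) pairs in the order ham, spam, ham, spam, ...

    When the shorter list runs out, the remaining messages of the
    longer list follow in their original order.
    """
    missing = object()  # sentinel, so that None or "" stay valid messages
    for h, s in zip_longest(ham, spam, fillvalue=missing):
        if h is not missing:
            yield ("ham", h)
        if s is not missing:
            yield ("spam", s)


# Hypothetical message names, only to show the resulting order.
order = list(interleave(["h1", "h2", "h3"], ["s1"]))
```

Feeding messages in this order means the filter never sees a long
run of one class before it has seen anything of the other.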

> randomtrain is, well, random. 

Is it also random whether it takes spam or ham next?

> It seems that either train-on-error is highly dependent on 
> message order

You mean the number of messages needed? What is the exit
condition for randomtrain? Mine (-f) is that there are no
errors left.
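
A rough sketch of that exit condition, again in Python: keep making
train-on-error passes over the messages until a full pass produces no
errors. ToyFilter is a made-up word-count stand-in, not bogofilter's
algorithm, and train_until_clean and max_passes are hypothetical
names, not options of either script.

```python
from collections import Counter


class ToyFilter:
    """Made-up stand-in for a spam filter: counts words per class."""

    def __init__(self):
        self.counts = {"ham": Counter(), "spam": Counter()}

    def classify(self, text):
        words = text.split()
        spam_score = sum(self.counts["spam"][w] for w in words)
        ham_score = sum(self.counts["ham"][w] for w in words)
        if spam_score > ham_score:
            return "spam"
        if ham_score > spam_score:
            return "ham"
        return "unsure"  # a tie counts as an error for either class

    def register(self, label, text):
        self.counts[label].update(text.split())


def train_until_clean(filt, messages, max_passes=50):
    """Train on error until one full pass has no errors.

    messages is a list of (label, text) pairs. Returns the number of
    passes made, including the final error-free one.
    """
    for passes in range(1, max_passes + 1):
        errors = 0
        for label, text in messages:
            if filt.classify(text) != label:
                filt.register(label, text)  # learn only from mistakes
                errors += 1
        if errors == 0:
            return passes
    raise RuntimeError("still misclassifying after max_passes passes")


# Hypothetical sample messages, already in alternating order.
messages = [
    ("ham", "meeting at noon"),
    ("spam", "cheap pills now"),
    ("ham", "lunch at noon"),
    ("spam", "cheap watches now"),
]
filt = ToyFilter()
passes = train_until_clean(filt, messages)
```

The max_passes guard is only there so the sketch cannot loop forever
on messages the toy filter can never separate.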

pi




