New new script to train bogofilter

David Relson relson at osagesoftware.com
Fri Jul 4 15:32:52 CEST 2003


At 09:17 AM 7/4/03, Boris 'pi' Piwinger wrote:

>Actually, I don't know why randomtrain needs more messages.
>But I don't understand that script.

My experiments confirm that randomtrain does more training than 
bogotrain.pl does.  It would be interesting to know why.  bogotrain.pl 
tests messages in a fixed alternating order: ham, then spam, then ham, 
then spam, ...  randomtrain is, well, random.  It seems that either 
train-on-error is highly dependent on message order or else one of the 
scripts is flawed.
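
To illustrate how train-on-error can depend on message order, here is a 
minimal Python sketch.  It is not what bogofilter, bogotrain.pl or 
randomtrain actually do: the token-count "classifier" and the name 
train_on_error are assumptions made up for this example, and only the 
"train only when the guess is wrong" loop is the point.

```python
from collections import Counter


def train_on_error(messages):
    """Return how many messages had to be trained on.

    messages: list of (label, tokens) pairs, label being "ham" or "spam".
    Toy classifier (an assumption, not bogofilter's algorithm): each token
    votes by how often it was trained as spam minus how often as ham.
    """
    counts = {"ham": Counter(), "spam": Counter()}
    trained = 0
    for label, tokens in messages:
        score = sum(counts["spam"][t] - counts["ham"][t] for t in tokens)
        guess = "spam" if score > 0 else "ham"  # ties default to ham
        if guess != label:
            # Train only on misclassified messages.
            counts[label].update(tokens)
            trained += 1
    return trained


# The same three messages, presented in two different orders.
order_a = [("spam", ["buy"]), ("spam", ["buy", "now"]), ("ham", ["now"])]
order_b = [("spam", ["buy", "now"]), ("ham", ["now"]), ("spam", ["buy"])]

print(train_on_error(order_a))  # 1
print(train_on_error(order_b))  # 2
```

In this toy case the same three messages need one training step in the 
first order and two in the second, so a difference in training counts 
between two scripts does not by itself show that either one is flawed.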

It might be interesting to run randomtrain and then use its order to create 
comparable ham.mbx and spam.mbx files.  Then run bogotrain.pl with those 
files.  bogotrain.pl already has a verbose mode to show the messages it's 
training with.  Adding a verbose mode to randomtrain would give the same 
information.





