New new script to train bogofilter
David Relson
relson at osagesoftware.com
Fri Jul 4 15:32:52 CEST 2003
At 09:17 AM 7/4/03, Boris 'pi' Piwinger wrote:
>Actually, I don't know why randomtrain needs more messages.
>But I don't understand that script.
My experiments confirm that randomtrain does more training than
bogotrain.pl does. It would be interesting to know why. bogotrain.pl tests
messages in a fixed order: ham, then spam, then ham, then spam, ...
randomtrain is, well, random. It seems that either train-on-error is highly
dependent on message order or else that one of the scripts is flawed.
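To make the order question concrete, here is a minimal sketch of a train-on-error loop. It is an illustration only: ToyFilter is a hypothetical word-count classifier, not bogofilter's algorithm, and the function names are made up for this example rather than taken from randomtrain or bogotrain.pl.

```python
import random
from collections import Counter


class ToyFilter:
    """Hypothetical word-count classifier; NOT bogofilter's algorithm."""

    def __init__(self):
        self.counts = {"ham": Counter(), "spam": Counter()}

    def classify(self, words):
        ham = sum(self.counts["ham"][w] for w in words)
        spam = sum(self.counts["spam"][w] for w in words)
        # Ties (including an empty wordlist) are classified as ham.
        return "spam" if spam > ham else "ham"

    def train(self, label, words):
        self.counts[label].update(words)


def train_on_error(messages):
    """Train only on misclassified messages; return how many were trained.

    messages is a sequence of (label, words) pairs, processed in order.
    """
    toy = ToyFilter()
    trained = 0
    for label, words in messages:
        if toy.classify(words) != label:
            toy.train(label, words)
            trained += 1
    return trained


def alternating(ham, spam):
    """Fixed order: ham, spam, ham, spam, ... (assumed bogotrain.pl style)."""
    out = []
    for h, s in zip(ham, spam):
        out.append(("ham", h))
        out.append(("spam", s))
    return out


def shuffled(ham, spam, seed):
    """Random order (assumed randomtrain style), reproducible via seed."""
    msgs = [("ham", h) for h in ham] + [("spam", s) for s in spam]
    random.Random(seed).shuffle(msgs)
    return msgs
```

In this toy model, the same two messages can need different amounts of training depending on order: presenting a spam with words a, b before a ham with word a trains on both, while the reverse order trains on only one. That shows order dependence is possible in principle for train-on-error; it says nothing about how large the effect is in bogofilter itself.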
It might be interesting to run randomtrain and then use its order to create
comparable ham.mbx and spam.mbx files. Then run bogotrain.pl with those
files. bogotrain.pl already has a verbose mode to show the messages it's
training with. Adding a verbose mode to randomtrain would give the same
information.
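As a rough sketch of that experiment, assuming randomtrain's verbose output could be reduced to a list of (label, message) pairs in processing order, the split into per-class sequences might look like this. The function name and the input format are hypothetical; writing the two lists out as ham.mbx and spam.mbx is left out.

```python
def split_by_class(order):
    """Split a recorded processing order into ham and spam sequences.

    order is a list of (label, message) pairs, where label is "ham" or
    "spam". Each returned list keeps the relative order in which that
    class was seen.
    """
    ham = [msg for label, msg in order if label == "ham"]
    spam = [msg for label, msg in order if label == "spam"]
    return ham, spam
```

Note that this preserves only the order within each class. The interleaving of ham and spam would still be bogotrain.pl's fixed alternation, so the two runs would be comparable rather than identical.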