New new script to train bogofilter

Greg Louis glouis at dynamicro.on.ca
Fri Jul 4 15:50:25 CEST 2003


On 20030704 (Fri) at 0932:52 -0400, David Relson wrote:
> At 09:17 AM 7/4/03, Boris 'pi' Piwinger wrote:
> 
> >Actually, I don't know why randomtrain needs more messages.
> >But I don't understand that script.
> 
> My experiments confirm that randomtrain does more training than does 
> bogotrain.pl.  It would be interesting to know why.  bogotrain.pl does an 
> orderly test ham, then spam, then ham, then spam, ...  randomtrain is, 
> well, random.  It seems that either train-on-error is highly dependent on 
> message order or else that one of the scripts is flawed.

Well, train-on-error is _obviously_ highly dependent on message order
when the training database is tiny, which is one reason for randomizing.
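
To make the order dependence concrete, here is a toy sketch in Python.
It is not bogofilter's scoring, nor what randomtrain or bogotrain.pl
actually do: the token-count "classifier", the function name
train_on_error and the four sample messages are all made up for
illustration. A message is registered only when the current database
misclassifies it or cannot decide.

```python
from collections import Counter

def train_on_error(messages):
    """Return how many messages had to be registered (toy model)."""
    spam, ham = Counter(), Counter()   # per-class token counts, start empty
    trained = 0
    for tokens, label in messages:
        # Hypothetical score: spam evidence minus ham evidence per token.
        score = sum(spam[t] - ham[t] for t in tokens)
        guess = "spam" if score > 0 else "ham" if score < 0 else None
        if guess != label:             # wrong or undecided: train on it
            (spam if label == "spam" else ham).update(tokens)
            trained += 1
    return trained

# Made-up corpus; b shares one token with each class.
a = (["cheap", "pills"], "spam")
b = (["cheap", "flights", "meeting"], "ham")
c = (["meeting", "notes"], "ham")
d = (["pills", "offer"], "spam")

print(train_on_error([a, b, c, d]))
print(train_on_error([a, c, b, d]))
```

With these four messages the first order registers 2 messages and the
second registers 3, although the corpus is identical. The only
difference is when the ambiguous message b arrives relative to the
first ham.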

If anybody cares to read the writeups of my training experiments, we
might avoid some redundant discussion.  Links can be found on

http://www.bgl.nu/bogofilter


-- 
| G r e g  L o u i s          | gpg public key: finger     |
|   http://www.bgl.nu/~glouis |   glouis at consultronics.com |
| http://wecanstopspam.org in signatures fights junk email |



