repetitive training

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Tue Mar 9 13:44:40 CET 2004


Greg Louis wrote:

>> > The most important disagreement we have is over the value of
>> > reoptimizing parameters after training; I claim it's moderately useful
>> > to do so, pi seems to feel it's too far from what happens in
>> > production
>> 
>> Even worse, I feel it will break the training done so far,
>> which was made to fit the given parameters.
> 
> Not in my experience. 

As far as I understand, you did one run of training on error,
used bogotune, ran the next round of training on error, and so
forth. So, since the parameters change between rounds, the
choice of messages in the previous round would have been
different under the new parameters. Also, the adjusted cutoff
will move.
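To make my point concrete: the set of "errors" that train-on-error
selects depends on the current spam cutoff, so retuning between rounds
changes which messages would have been trained. A minimal sketch (this
is not bogofilter code; the scores, labels, and the errors() helper are
made up for illustration):

```python
def errors(scores, labels, cutoff):
    """Indices of messages misclassified under the given cutoff
    (i.e. the train-on-error selection for that cutoff)."""
    return [i for i, (s, is_spam) in enumerate(zip(scores, labels))
            if (s >= cutoff) != is_spam]

# Made-up spamicity scores and true labels (True = spam).
scores = [0.2, 0.45, 0.55, 0.9]
labels = [False, False, True, True]

print(errors(scores, labels, 0.5))  # -> []  : no errors at cutoff 0.5
print(errors(scores, labels, 0.6))  # -> [2] : message 2 becomes a false negative
```

Shift the cutoff from 0.5 to 0.6 and message 2 flips from correctly
classified to an error, so a round trained at one cutoff no longer
matches the selection the new cutoff would have produced.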

>> So what I would like to see in your latest experiment are the
>> actual parameters used for each round. And possibly, for each
>> round, the error rate before and after the tuning.
> 
> I'll dig through the detailed log when I get time, and add tables of
> parameters to the writeup at http://www.bgl.nu/bogofilter/reptrain.html
> -- I agree that's useful info.  (Usually I append the detailed log to
> the writeup, but for these experiments it's too voluminous to be worth
> posting in its entirety.) 

Great.

> The error rate before tuning is there
> already; for iteration n, it's reported on line n-1 of the table of
> error rates.

I don't understand that. I thought those were the numbers from
before the next round of training.

Also, as I mentioned earlier, it is very suspicious that for
ten rounds you have exactly the same number of messages to
train with.

pi



