Training frustration

Pavel Kankovsky peak at argo.troja.mff.cuni.cz
Sun Feb 17 19:22:07 CET 2008


On Mon, 11 Feb 2008, Anne Wilson wrote:

> You seem surprised that I was cleaning out the 'trained' messages.  I
> thought it was a bad idea to keep running the same messages through the
> training.  Am I wrong?

It depends. Look for "training to exhaustion". I myself don't do it. My
spam corpus is so huge that I can always find multiple independent copies 
of even the most difficult and exotic spam. :)

> I'll keep the -c parameter in the command in future.  What is the reason for 
> it not being the default?

I am not the author of trainbogo.sh. I guess the idea was to make it
possible to do several passes through the corpus in order to catch cases
when the classification of a message changes (from correct to incorrect) 
after some other messages have been trained.
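
A minimal sketch of that multi-pass idea, for illustration only. It does not
call bogofilter or reproduce trainbogo.sh: ToyFilter, train_to_exhaustion and
the three-message corpus are hypothetical stand-ins, and the token-count
scoring is an assumption chosen to keep the example short. Each pass trains
only the messages the filter currently gets wrong, and passes repeat until one
completes with no errors.

```python
from collections import Counter


class ToyFilter:
    """Hypothetical stand-in for a statistical spam filter (not bogofilter)."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()

    def is_spam(self, text):
        # Score each token by how much more often it was trained as spam than ham.
        score = sum(self.spam[t] - self.ham[t] for t in text.split())
        return score > 0

    def train(self, text, spam):
        # Register the message's tokens under its true class.
        (self.spam if spam else self.ham).update(text.split())


def train_to_exhaustion(filt, corpus, max_passes=10):
    """Repeat train-on-error passes until a full pass produces no errors.

    Returns the number of passes that were run.
    """
    for n in range(1, max_passes + 1):
        errors = 0
        for text, spam in corpus:
            # Train only on messages the filter currently misclassifies.
            if filt.is_spam(text) != spam:
                filt.train(text, spam)
                errors += 1
        if errors == 0:
            return n
    return max_passes


# Hypothetical corpus: (message text, is it spam?)
corpus = [
    ("cheap pills offer", True),
    ("meeting agenda offer", False),
    ("cheap meeting pills", True),
]

filt = ToyFilter()
passes = train_to_exhaustion(filt, corpus)
```

Later passes re-check every message, so one whose classification flips after
other messages are trained gets picked up again. The max_passes cap is there
because a corpus with contradictory messages might never reach an error-free
pass.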

--Pavel Kankovsky aka Peak  [ Boycott Microsoft--http://www.vcnet.com/bms ]
"Resistance is futile. Open your source code and prepare for assimilation."



