Training frustration
Pavel Kankovsky
peak at argo.troja.mff.cuni.cz
Sun Feb 17 19:22:07 CET 2008
On Mon, 11 Feb 2008, Anne Wilson wrote:
> You seem surprised that I was cleaning out the 'trained' messages. I
> thought it was a bad idea to keep running the same messages through the
> training. Am I wrong?
It depends. Look for "training to exhaustion". I myself don't do it. My
spam corpus is so huge that I can always find multiple independent copies
of even the most difficult and exotic spam. :)
> I'll keep the -c parameter in the command in future. What is the reason for
> it not being the default?
I am not the author of trainbogo.sh. I guess the idea was to make it
possible to do several passes through the corpus in order to catch cases
when the classification of a message changes (from correct to incorrect)
after some other messages have been trained.
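To illustrate the multi-pass idea, here is a minimal sketch in Python. It is only a toy: ToyFilter and train_until_stable are hypothetical names, the word-count scoring is not bogofilter's algorithm, and I am not claiming this is how trainbogo.sh works internally. It only shows the general pattern of repeating train-on-error passes until a whole pass finds nothing misclassified.

```python
from collections import Counter


class ToyFilter:
    """Hypothetical stand-in for a real filter (NOT bogofilter's algorithm)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}

    def classify(self, text):
        # Score a message by how often its words were seen in each class.
        words = text.lower().split()
        spam = sum(self.counts["spam"][w] for w in words)
        ham = sum(self.counts["ham"][w] for w in words)
        if spam > ham:
            return "spam"
        if ham > spam:
            return "ham"
        return "unsure"

    def train(self, text, label):
        self.counts[label].update(text.lower().split())


def train_until_stable(filt, corpus, max_passes=10):
    """Run train-on-error passes until one pass finds no errors.

    Returns the number of passes performed, or None if the corpus was
    still producing errors after max_passes.
    """
    for n in range(1, max_passes + 1):
        errors = 0
        for text, label in corpus:
            # Train only on messages the filter currently gets wrong;
            # this may change the verdict on messages seen earlier,
            # which is why another pass is needed.
            if filt.classify(text) != label:
                filt.train(text, label)
                errors += 1
        if errors == 0:
            return n
    return None


# Made-up example corpus of (text, correct label) pairs.
corpus = [
    ("cheap pills online", "spam"),
    ("meeting notes online", "ham"),
    ("cheap meeting room", "ham"),
    ("pills pills cheap", "spam"),
]
filt = ToyFilter()
passes = train_until_stable(filt, corpus)
```

The max_passes limit is there because, with contradictory messages in the corpus, a loop like this is not guaranteed to settle.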
--Pavel Kankovsky aka Peak [ Boycott Microsoft--http://www.vcnet.com/bms ]
"Resistance is futile. Open your source code and prepare for assimilation."