New new script to train bogofilter
Greg Louis
glouis at dynamicro.on.ca
Fri Jul 4 14:14:15 CEST 2003
On 20030704 (Fri) at 1347:21 +0200, Boris 'pi' Piwinger wrote:
> As you see, in practice there is no problem with
> overtraining.
What I see is that you have not (yet?) encountered problems. My own
experience has been different. I overtrained for a while, and stopped
when I discovered (a) that my false-negative (fn) counts were rising and
(b) that rebuilding the training db without overtraining brought them
back down.
> After all I could have received it twice anyway.
If you really do receive it twice, then adding it twice does not distort
the estimate of what the population is like. Adding it twice when you
received it once does.
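
To make the distortion concrete, here is a toy sketch. It is not
bogofilter's actual scoring code: the function name, the messages and
the simple rate-ratio estimate are all made up for illustration. It
compares a token's estimated spamminess when each message is registered
once against the case where one spam, received once, is registered two
extra times.

```python
def token_spamicity(spam_msgs, ham_msgs, token):
    """Simplified rate-ratio estimate of how spammy a token looks.

    Hypothetical helper for illustration only; real filters apply
    further smoothing and combine many tokens per message.
    """
    spam_hits = sum(token in set(m.split()) for m in spam_msgs)
    ham_hits = sum(token in set(m.split()) for m in ham_msgs)
    spam_rate = spam_hits / len(spam_msgs)
    ham_rate = ham_hits / len(ham_msgs)
    if spam_rate + ham_rate == 0:
        return 0.5  # token never seen: no evidence either way
    return spam_rate / (spam_rate + ham_rate)


# Invented sample corpus: each message was received exactly once.
spam = ["cheap pills now", "win money now", "meeting offer today", "free offer"]
ham = ["meeting notes today", "lunch today", "project meeting", "budget offer review"]

# Each message registered once, matching what was actually received.
honest = token_spamicity(spam, ham, "offer")

# Same mail, but one spam is registered two extra times (overtraining).
overtrained = token_spamicity(spam + ["free offer"] * 2, ham, "offer")

print(f"registered once: {honest:.3f}")
print(f"overtrained:     {overtrained:.3f}")
```

In this toy corpus the overtrained estimate for "offer" comes out higher
than the honest one, although the mail that actually arrived is the
same. The training db now describes a population with more "offer" spam
than was really received.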
> > and rather poor at recognizing
> > messages that are similar to, but not strongly similar to, those
> > training ones.
>
> That would mean that new messages are not classified
> correctly more often than in a full training. So far my
> observations don't show this.
So far. If it's still true after several weeks or months, that will be
interesting. I don't say it's impossible, but I'll be surprised.
--
| G r e g L o u i s | gpg public key: finger |
| http://www.bgl.nu/~glouis | glouis at consultronics.com |
| http://wecanstopspam.org in signatures fights junk email |