New new script to train bogofilter

Greg Louis glouis at dynamicro.on.ca
Fri Jul 4 14:14:15 CEST 2003


On 20030704 (Fri) at 1347:21 +0200, Boris 'pi' Piwinger wrote:

> As you see, in practice there is no problem with
> overtraining.

What I see is that you have not (yet?) encountered problems.  That has
not been my experience.  I overtrained for a while, and stopped when I
discovered (a) that my false-negative (fn) counts were rising and (b)
that rebuilding the training db without overtraining brought them back
down.

> After all I could have received it twice anyway.

If you really receive it twice, adding it twice does not distort the
estimate of what the population is like.  Registering a single message
repeatedly does.
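
To make the point about distorting the population estimate concrete,
here is a minimal sketch.  It assumes a naive per-token estimate built
from training counts, which is a simplification and not bogofilter's
actual calculation; the function name and all counts are hypothetical.

```python
def token_spam_prob(spam_hits, ham_hits, n_spam, n_ham):
    """Naive per-token spam estimate from training counts (illustrative only)."""
    spam_rate = spam_hits / n_spam   # fraction of trained spam containing the token
    ham_rate = ham_hits / n_ham      # fraction of trained ham containing the token
    return spam_rate / (spam_rate + ham_rate)

# Hypothetical training db: 100 spam and 100 ham messages.
# The token occurs in 5 of each, so it carries no evidence either way.
baseline = token_spam_prob(5, 5, 100, 100)

# Overtraining: one spam containing the token is registered 10 extra times,
# although it arrived only once.  The spam counts grow; the mail stream did not.
overtrained = token_spam_prob(15, 5, 110, 100)

print(f"baseline:    {baseline:.3f}")
print(f"overtrained: {overtrained:.3f}")
```

In this toy setup a token that is neutral in the real mail stream ends
up looking spammy, only because one message was counted many times.  A
message that really did arrive twice is part of the population, so
counting it twice keeps the estimate honest.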

> > and rather poor at recognizing
> > messages that are similar to, but not strongly similar to, those
> > training ones. 
> 
> That would mean that new messages are not classified
> correctly more often than in a full training. So far my
> observations don't show this.

So far.  If it's still true after several weeks or months, that will be
interesting.  I don't say it's impossible, but I'll be surprised.

-- 
| G r e g  L o u i s          | gpg public key: finger     |
|   http://www.bgl.nu/~glouis |   glouis at consultronics.com |
| http://wecanstopspam.org in signatures fights junk email |



