New new script to train bogofilter

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Fri Jul 4 15:17:14 CEST 2003


Peter Bishop wrote:

> "If you could only put some fixed number of messages, e.g. 1000,
> into the database - which ones would you choose out of a larger population
> of messages?"
>
> So following your dog analogy, the train-on-error procedure only puts a
> message into the database if it looks different, i.e. like putting
> different breeds of dog into a picture book.
>
> So you end up with a picture book full of different breeds, rather than a
> picture book where there are a lot of dogs of the same type while others are
> missed completely.

Right, but I don't have a fixed limit. Instead, I add cats and
dogs until I have the complete picture of all known breeds.
If a new breed comes along, I have to find something
similar, but this is what bogofilter does all the time.

So I add only new information to the database, but as much
of it as I can find. I end when there is no new information
available in my training set.
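
The procedure described above can be sketched in Python roughly as
follows. This is only an illustration under stated assumptions, not the
actual training script: ToyFilter is a made-up word-count classifier
standing in for bogofilter, and the names register, classify,
train_to_exhaustion and max_passes are all hypothetical.

```python
from collections import Counter


class ToyFilter:
    """Toy word-count classifier; a stand-in, NOT bogofilter's algorithm."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}

    def register(self, text, label):
        # Add the message's words to the database for the given class.
        self.counts[label].update(text.lower().split())

    def classify(self, text):
        words = text.lower().split()
        spam = sum(self.counts["spam"][w] for w in words)
        ham = sum(self.counts["ham"][w] for w in words)
        if spam > ham:
            return "spam"
        if ham > spam:
            return "ham"
        return "unsure"


def train_to_exhaustion(filt, messages, max_passes=20):
    """Repeat train-on-error passes until a pass registers nothing new."""
    registered = 0
    for _ in range(max_passes):  # hypothetical safety cap on the passes
        added = 0
        for text, label in messages:
            # Only messages the filter gets wrong (or is unsure about)
            # carry new information, so only those are registered.
            if filt.classify(text) != label:
                filt.register(text, label)
                added += 1
        registered += added
        if added == 0:  # no new information left in the training set
            break
    return registered


messages = [
    ("cheap pills buy now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("buy cheap pills today", "spam"),
    ("monday meeting moved", "ham"),
]
filt = ToyFilter()
n = train_to_exhaustion(filt, messages)
print(n)  # 2: the third and fourth messages add nothing new
```

In this toy run only two of the four messages end up in the database,
because the other two are already classified correctly when they are
reached.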

Actually, I don't know why randomtrain needs more messages,
but then I don't understand that script.

pi




