Training and -o

Tom Allison tallison at tacocat.net
Mon Jul 19 13:38:36 CEST 2004


Tom Anderson wrote:
> On Fri, 2004-07-16 at 23:29, Barsalou wrote:
> 
>>Do folks want to share what values they are generally using so that we
>>can come up with some sort of accepted standard?  Or does this create
>>problems?
> 
> 
> I'd either go with the default in the config file, or else start with a
> spam cutoff of 0.99 and a ham cutoff of 0.01, and adjust it over time. 
> Using someone else's values will probably result in a lot of
> misclassifications.  You don't want to be too liberal too quickly.
> 
> Tom
> 

I agree, you really don't want to inherit all of our peculiarities.

Bogofilter, by its nature, becomes highly specialized for each user and 
as such, may not fare well in a new environment (like someone else's 
email).  Think of it as a form of natural selection and survival of the 
fittest.

I think most people who have numbers like (ham_cutoff=0.1, 
spam_cutoff=0.4) tend to have a very narrow scope of email compared to 
the text offerings in spam.  I have similar numbers with spam_cutoff 
very low and have found that mail from non-Linux users (non-geek?, 
AOL-user types?, you decide) tends to score very high.  I think this 
would eventually show itself as variations in diction.  My son has much 
higher numbers and he's in middle school.  I can only imagine his 
variation in diction.
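
To make the difference between the two sets of numbers concrete, here is 
a rough sketch in Python of how a pair of cutoffs splits scores into 
ham, unsure and spam.  It is not bogofilter's code: the classify() 
helper and the sample scores are made up for illustration, and the 
boundary rules (spam at or above spam_cutoff, ham at or below 
ham_cutoff, unsure in between) are my assumption about how the cutoffs 
are applied.

```python
def classify(score, ham_cutoff, spam_cutoff):
    """Three-way split of a 0..1 score (assumed boundary rules, see above)."""
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "unsure"


# The conservative starting point suggested in the quoted message.
CONSERVATIVE = {"ham_cutoff": 0.01, "spam_cutoff": 0.99}
# The kind of tight, user-specific values mentioned above.
NARROW = {"ham_cutoff": 0.1, "spam_cutoff": 0.4}

# Hypothetical scores, only to show where the two settings disagree.
for score in (0.005, 0.05, 0.30, 0.50, 0.995):
    print(score, classify(score, **CONSERVATIVE), classify(score, **NARROW))
```

With the conservative pair almost everything in the middle stays 
"unsure" until you have trained enough to trust the scores; with the 
narrow pair the same middle scores are already called ham or spam, which 
only works if the wordlist really fits your own mail.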



