Training and -o
tallison at tacocat.net
Mon Jul 19 13:38:36 CEST 2004
Tom Anderson wrote:
> On Fri, 2004-07-16 at 23:29, Barsalou wrote:
>>Do folks want to share what values they are generally using so that we
>>can come up with some sort of accepted standard? Or does this create
> I'd either go with the default in the config file, or else start with a
> spam cutoff of 0.99 and a ham cutoff of 0.01, and adjust it over time.
> Using someone else's values will probably result in a lot of
> misclassifications. You don't want to be too liberal too quickly.
I agree, you really don't want to inherit all of our peculiarities.
Bogofilter, by its nature, becomes highly specialized for each user and,
as such, may not fare well in a new environment (like someone else's
email). Think of it as a form of natural selection and survival of the
I think most people who have numbers like (ham_cutoff=0.1,
spam_cutoff=0.4) tend to have a very narrow scope of emails compared to
the range of text found in spam. I have similar numbers, with spam_cutoff
very low, and have found that non-Linux users (non-geek?, AOL-user types?,
you decide) tend to score very high. I think this would eventually show
up as variations in diction. My son has much higher numbers and
he's in middle school. I can only imagine his variation in diction.
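
To make the effect of the two cutoffs concrete, here is a rough sketch of how a single message score falls into ham, unsure, or spam under the two pairs of values mentioned in this thread. The function name classify and the exact boundary comparisons (>= and <=) are my own assumptions for illustration, not taken from the bogofilter source, and the scores are made up.

```python
# Hypothetical helper: maps a score in [0, 1] to a three-way verdict.
# The inclusive comparisons are an assumption, not bogofilter's documented behavior.
def classify(score, ham_cutoff, spam_cutoff):
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "unsure"

# The two pairs of cutoffs discussed in this thread.
conservative = {"ham_cutoff": 0.01, "spam_cutoff": 0.99}
narrow = {"ham_cutoff": 0.1, "spam_cutoff": 0.4}

# Made-up scores, only to show where the two settings disagree.
for score in (0.005, 0.25, 0.6, 0.995):
    print(score, classify(score, **conservative), classify(score, **narrow))
```

A score of 0.6 is only "unsure" with the conservative pair but already "spam" with the narrow pair, which is why cutoffs tuned on one person's mail can misclassify someone else's.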