Security margins in training (on error and to exhaustion)

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Wed Dec 10 14:03:25 CET 2003


David Relson wrote:

>> > To summarize:  a larger margin builds a larger database and gives
>> > better classification results.
>> 
>> At least up to some point. As a KISS answer I'd suggest
>> using spam_cutoff+-0.3 as the training interval (assuming
>> ham_cutoff = spam_cutoff, IOW: ham_cutoff=0).
> 
> I would _never_ use spam_cutoff=ham_cutoff. 
> 
> Using tri-state classification lets my MUA filter all the Unsures into a
> single folder so I know which messages weren't classifiable.  It beats
> the heck out of having the messages filtered into the many possible
> different folders I have.

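OK, here is my tri-state KISS answer ;-) Let mid be the
midpoint between your two cutoffs (ham_cutoff and
spam_cutoff). Train with mid+-0.3 as the security margin:
keep training ham that scores above mid-0.3 and spam that
scores below mid+0.3. Rate messages with your original
cutoffs, which lie closer to mid.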
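
To make the arithmetic concrete, here is a rough Python sketch of
the rule above. It is not bogofilter code: the function names, the
example cutoffs 0.45/0.55 and the exact comparison operators are
my assumptions for illustration only; the margin of 0.3 is the one
suggested above.

```python
# Illustrative sketch only; names and comparison operators are assumptions,
# not taken from bogofilter itself.

def training_interval(ham_cutoff, spam_cutoff, margin=0.3):
    """Return (low, high) training thresholds around the midpoint."""
    mid = (ham_cutoff + spam_cutoff) / 2.0
    # Scores live in [0, 1], so clamp the interval to that range.
    return max(0.0, mid - margin), min(1.0, mid + margin)


def needs_training(score, is_spam, ham_cutoff, spam_cutoff, margin=0.3):
    """True if a message of known type still scores inside the margin."""
    low, high = training_interval(ham_cutoff, spam_cutoff, margin)
    if is_spam:
        return score < high   # spam not yet scored clearly as spam
    return score > low        # ham not yet scored clearly as ham


def classify(score, ham_cutoff, spam_cutoff):
    """Tri-state rating with the original (narrower) cutoffs."""
    if score >= spam_cutoff:
        return "Spam"
    if score <= ham_cutoff:
        return "Ham"
    return "Unsure"


# Hypothetical cutoffs: mid is 0.5, so training aims for <= 0.2 / >= 0.8.
low, high = training_interval(0.45, 0.55)
```

With these hypothetical cutoffs a spam scoring 0.7 is already rated
Spam, but it still falls inside the training interval and would be
trained on; that is the whole point of the margin.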

pi



