Train on error vs. tuning

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Sun Sep 14 16:59:16 CEST 2003


"Boris 'pi' Piwinger" <3.14 at logic.univie.ac.at> wrote:

>I recently thought about some conflicting approaches.
>
>Train on error (as applied by bogominitrain.pl) makes it
>impossible to use bogotune, which was already so happy with
>the output that it would not try to optimize further. Well,
>I did optimize before switching to that approach.
>
>Now, where is the problem? In train on error, as opposed to
>full training, the selection of messages to train with is
>highly dependent on the configuration. So this training is
>close to optimal for these settings.
>
>So here is my question: How important are these settings
>after all when train on error is used? Let me give examples:
>
>1) I have a very high min_dev, so if something slips
>through, this is usually due to too few words being used in
>calculating bogosity. So this is a factor.
>
>2) I have a cutoff at .501 (result of some tuning). Now I
>could just move it to, say, .4 or .6 and train with that. It
>would probably not change much. (I add a security margin for
>training as described in previous messages.) Yes, it is
>related to values for unseen words, but?

Does nobody want to comment? First of all, I'd be
interested to hear about min_dev.
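
To make the question concrete, here is a minimal Python sketch of
train on error with a security margin. It is an illustration under
stated assumptions, not how bogominitrain.pl or bogofilter is
implemented: the scorer is a plain average of per-token estimates
standing in for bogofilter's real calculation, and the names
ToyFilter and train_on_error, the margin of 0.1, the min_dev values
and the toy corpus are all made up. Only the .501 cutoff is taken
from the message above.

```python
from collections import Counter

CUTOFF = 0.501   # cutoff mentioned in the message above
MARGIN = 0.1     # hypothetical security margin used only while training


class ToyFilter:
    """Toy stand-in for a word-count based filter (not bogofilter's math)."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()

    def train(self, tokens, is_spam):
        # Count each distinct token once per message.
        (self.spam if is_spam else self.ham).update(set(tokens))

    def token_prob(self, tok):
        # Smoothed estimate; a token never seen before gets exactly 0.5.
        s, h = self.spam[tok], self.ham[tok]
        return (s + 0.5) / (s + h + 1.0)

    def score(self, tokens, min_dev):
        # Only tokens at least min_dev away from 0.5 take part in scoring.
        probs = [self.token_prob(t) for t in set(tokens)]
        used = [p for p in probs if abs(p - 0.5) >= min_dev]
        if not used:
            return 0.5, 0          # nothing usable: message stays unsure
        return sum(used) / len(used), len(used)


def train_on_error(filt, corpus, cutoff, margin, min_dev, max_passes=10):
    """Train only on messages that are misclassified or inside the margin.

    Returns the number of training events, which depends on the settings.
    """
    trained = 0
    for _ in range(max_passes):
        changed = False
        for tokens, is_spam in corpus:
            score, _ = filt.score(tokens, min_dev)
            if is_spam:
                ok = score >= cutoff + margin
            else:
                ok = score <= cutoff - margin
            if not ok:
                filt.train(tokens, is_spam)
                trained += 1
                changed = True
        if not changed:
            break
    return trained


# Hypothetical four-message corpus: (tokens, is_spam)
CORPUS = [
    (["viagra", "cheap"], True),
    (["meeting", "agenda"], False),
    (["cheap", "viagra", "now"], True),
    (["agenda", "lunch"], False),
]

trained_low = train_on_error(ToyFilter(), CORPUS, CUTOFF, MARGIN, min_dev=0.2)
trained_high = train_on_error(ToyFilter(), CORPUS, CUTOFF, MARGIN, min_dev=0.4)
```

On this toy corpus the run with the higher min_dev trains on more
messages: too few tokens pass the threshold, so messages keep
scoring as unsure and get trained on again. Which messages end up
in the word list therefore depends on the settings in force during
training, which is the coupling I am asking about.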

pi



