0.92.6's bogotune much slower?

Sun Sep 5 15:00:12 CEST 2004

On Sun, 05 Sep 2004 14:36:13 +0200
Valient Gough wrote:

> On Sun, 2004-09-05 at 14:13, David Relson wrote:
> 
> > On Sun, 05 Sep 2004 10:55:55 +0200
> > Valient Gough wrote:
> 
> 
> 
> > > I just started a run with ESF disabled, and I notice that instead
> > > of using the ESF parameters that are in the configuration file,
> > > disabling ESF instead forces spesf and nsesf to 1.0.
> > 
> > You're right.  Having bogotune use the already known ESF values
> > _would_ be faster.  That's the up side.  On the down side, with
> > modified tokens in the wordlist and with a different set of messages
> > used in tuning, the old ESF values may no longer be optimal.  So
> > running bogotune with old ESF values is not ideal.  Of course, using
> > the default ESF values of 1.0 isn't ideal either.
> 
> 
> Right, but I may want to retrain from time to time, but not be willing
> to commit a week's worth of CPU time.  So if the existing ESF values
> will work, then I'd rather use them than the defaults.  Either that or
> allow a small search space around the existing values -- but not if it
> will cost days of CPU time!  This is starting to sound like a
> parameter transfer problem between related but not identical
> datasets.

That's a correct analysis - "related but not identical".
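
The "small search space around the existing values" idea could look
roughly like the Python sketch below.  It is only an illustration, not
bogotune's code: local_grid, tune_near, and the score callback (lower is
better, e.g. a false-negative count at a fixed false-positive target)
are hypothetical names, and the span and step count are arbitrary.

```python
from itertools import product


def local_grid(center, span, steps):
    """Return `steps` evenly spaced values in [center - span, center + span]."""
    if steps < 2:
        return [center]
    lo = center - span
    width = 2 * span / (steps - 1)
    return [lo + i * width for i in range(steps)]


def tune_near(score, sp_esf, ns_esf, span=0.1, steps=5):
    """Search a small grid centred on the old ESF values.

    `score(sp, ns)` is a caller-supplied function (hypothetical here);
    the pair with the lowest score is returned.
    """
    candidates = product(
        local_grid(sp_esf, span, steps),
        local_grid(ns_esf, span, steps),
    )
    return min(candidates, key=lambda pair: score(*pair))
```

With steps=5 this evaluates 25 (spesf, nsesf) pairs, at the price of
never looking outside the neighbourhood of the old values.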

At this point, bogofilter is about 2 yrs old and has demonstrated that
it works well with many different parameter sets.  There's no one
parameter set that's best for all message scoring.

Bogotune is younger, approx 1 year old.  Its grid search for an optimal
parameter set has revealed that some parameter sets are noticeably
better than others for a given set of tuning messages.  It also shows
that there are many locally "best" parameter sets, and that the location
of a local "best" often changes with small changes to the set of tuning
messages.

At the beginning of August, I used my current wordlist and 2 tuning
sets.  The first had 2 months of messages (July and August) and the
second had 3 months (June, July, August).  The 2 tuning runs found
different parameters, though the fp/fn percentages were comparable.  I
decided to use the parameter set that looked "nicest" to my eye and it's
working well.

Anyhow, the point of all this is that bogofilter is known to work well
with a variety of parameter sets, and there's no known way to say that a
particular parameter set is truly the "right" one.  Bogofilter is
robust in spite of parameter variances.

> > In normal operation, since bogotune is looking for a fresh, new set
> > of parameters, it doesn't use the old parameter settings, hence
> > doesn't read a config file.  The good news is that bogotune is able
> > to read a config file and will use the old ESF values.  Run
> > "bogotune -?" to display the help message (which includes the option
> > for reading the config file).  Then run bogotune with config file
> > and "-E" and it'll do as you want.
> 
> 
> Perhaps this has been fixed already, but in 0.92.6 it does not behave
> this way.  It ignored the config values even though I told it to read
> a config file, which is why I sent the previous mail suggesting that
> it use the existing values for untuned parameters.  After I modified
> the init_course and init_fine methods to use the existing values, then
> it worked how I expected (and how you describe).

I'm digging into the code at the moment.  From my earlier, casual glance
I thought it should work, but experimenting indicates that you're right
-- the code is ignoring the config file's ESF values.  I'll have a fix
in a while and will post it to the list.

> One more thing:  after my latest run with -E (and bogotune modified to
> use existing ESF values), the result seems to me better than the full
> run which included ESF tuning.  The dataset was nearly identical
> (except for the extra spam messages received over the course of 5 days
> while the full run was grinding away).
> 
> Some results from the full run:
> spam_cutoff=0.986189    # for 0.10% fp (1); expect 3.61% fn (824).
> #spam_cutoff=0.981340   # for 0.20% fp (2); expect 2.17% fn (497).
> 
> And results from the partial run (-E), using the ESF values calculated
> from the full run above:
> spam_cutoff=0.990644    # for 0.10% fp (1); expect 4.86% fn (1143).
> #spam_cutoff=0.970202   # for 0.20% fp (2); expect 0.80% fn (187).
> 
> So the search solution doesn't seem particularly stable, which is why
> I expect to have to run bogotune periodically to adjust parameters
> (and why I'd rather not have a 5-day run each time)...

Yes, there's a lack of consistency.  Given that messages are composed of
discrete tokens with distinct spam scores, comparing parameter sets can
be difficult.  As you've seen, one set can be better at 0.10% and poorer
at 0.20%.  
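
To make the reversal concrete, here is a small Python sketch that picks
the better run at each fp target.  The fn counts are copied from the
results quoted above; the names runs and best_run_per_target are just
for illustration.

```python
# False-negative counts at each false-positive target, taken from the
# two bogotune runs quoted earlier in this message.
runs = {
    "full": {"0.10%": 824, "0.20%": 497},
    "partial (-E)": {"0.10%": 1143, "0.20%": 187},
}


def best_run_per_target(runs):
    """Map each fp target to the name of the run with the fewest fn."""
    targets = sorted(next(iter(runs.values())))
    return {t: min(runs, key=lambda name: runs[name][t]) for t in targets}


winners = best_run_per_target(runs)
for target, name in winners.items():
    print(f"fp {target}: best run is {name}")
```

Neither run wins at both targets, so which one is "better" depends on
the fp rate you're willing to accept.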

Perhaps some budding mathematician will do a Ph.D. thesis on Bayesian
content filters and get some real answers for us.

Regards,

David