Bogotune results

David Relson relson at osagesoftware.com
Thu Sep 5 05:13:55 CEST 2013


Hi Tamer,

Yes.  The many high scoring ham and low scoring spam are a cause for
concern.  In bogofilter's view of the world, high scores mean spam and
low scores mean ham.  Is it possible that your inputs to bogotune were
reversed, i.e. that your ham was labeled spam and vice versa?  If this
did not happen, then you might wish to manually check for incorrectly
classified messages in the inputs you provided to bogotune.
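
As a quick sanity check for the reversed-labels possibility, something
like the following might help. It is only a minimal sketch in Python,
assuming you have already collected bogofilter's per-message scores into
two lists, one per label. The function names, the 0.5 midpoint, and the
sample scores are my own illustration and are not part of bogofilter or
bogotune.

```python
from statistics import median


def labels_look_reversed(ham_scores, spam_scores):
    """Heuristic: True if messages labeled ham typically outscore those labeled spam."""
    # bogofilter convention: scores near 1.0 mean spam, scores near 0.0 mean ham.
    return median(ham_scores) > median(spam_scores)


def suspect_indices(scores, labeled_spam, midpoint=0.5):
    """Return positions of messages whose score contradicts their label."""
    if labeled_spam:
        # Spam is expected to score high, so low scores are suspicious.
        return [i for i, s in enumerate(scores) if s < midpoint]
    # Ham is expected to score low, so high scores are suspicious.
    return [i for i, s in enumerate(scores) if s > midpoint]


# Hypothetical scores, for illustration only.
ham = [0.02, 0.10, 0.97, 0.05]
spam = [0.99, 0.93, 0.04, 0.88]

print(labels_look_reversed(ham, spam))
print(suspect_indices(ham, labeled_spam=False))
print(suspect_indices(spam, labeled_spam=True))
```

If the first check comes back True, swapped inputs are the likely
explanation. Otherwise the two index lists point at the individual
messages worth inspecting by hand.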

Regards,

David


On Wed, 4 Sep 2013 12:43:38 -0400
Tamer Yousef wrote:

> Any comments about the results I got?
> 
> 
> On Wed, Aug 21, 2013 at 12:38 PM, Tamer Yousef
> <tamer.yousef at gmail.com> wrote:
> 
> > I was finally able to get bogotune to run, and the results are
> > below. I have some questions:
> > 1- Does the warning message indicate that the training set needs to
> > be reclassified?
> > 2- Applying the recommendations without the sp_esf and ns_esf values
> > badly distorts the spam scores: a lot of the text that previously
> > scored below .5 now scores over .9.
> >
> > Changing the value of min_dev affects the final results
> > significantly, but the warning message bogotune is outputting is
> > really making me doubt the whole thing; I may need to
> > re-annotate and rebuild my training set....
> >
> >
> > And a side note: for applying sp_esf and ns_esf, is the "-E" option
> > not supported by recent bogofilter?
> >
> >
> > wordlist's ham to spam ratio is 1.2 to 1.0
> > Warning: test messages include many high scoring nonspam.
> >          You may wish to reclassify them and rerun.
> >     high ham scores:
> >        1 1.000000
> >        2 1.000000
> >        3 1.000000
> >        4 1.000000
> >        5 1.000000
> >        6 1.000000
> >        7 1.000000
> >        8 1.000000
> >        9 1.000000
> >       10 1.000000
> >     low spam scores:
> >        1 0.000013
> >        2 0.000043
> >        3 0.007088
> >        4 0.029865
> >        5 0.040703
> >        6 0.046916
> >        7 0.054538
> >
> >
> > Minimum found at s 0.3162, md 0.286, x 0.528, spesf 0.004228, nsesf
> > 0.011573
> >         fp 30 (2.8708%), fn 644 (62.2824%)
> >
> > Performing final scoring:
> > Spam...  Non-Spam...
> > 0.254923 0.941857
> > 0.298609 0.936104
> > 0.307635 0.927185
> > 0.362183 0.899385
> > 0.413051 0.895696
> > 0.466462 0.895564
> > 0.470619 0.892552
> > 0.471655 0.892334
> > 0.472590 0.887125
> > 0.477892 0.884211
> >
> > Recommendations:
> >
> > ---cut---
> > db_cachesize=100
> > robs=0.3162
> > min_dev=0.286
> > robx=0.527809
> > sp_esf=0.004228
> > ns_esf=0.011573
> > spam_cutoff=0.936104    # for 0.10% fp (1); expect 99.32% fn (1027).
> > #spam_cutoff=0.927185   # for 0.20% fp (2); expect 98.94% fn (1023).
> > ham_cutoff=0.308
> > ---cut---
> >
> _______________________________________________
> Bogofilter mailing list
> Bogofilter at bogofilter.org
> http://www.bogofilter.org/mailman/listinfo/bogofilter


