bogotune results
Tom Allison
tallison at tacocat.net
Wed Mar 24 03:45:35 CET 2004
Well, it took a while, but this is what bogotune finally spit out as the
results.
These are some pretty weird numbers...
I retested my archive. Of the Ham, I got 3 Unsure and 1 Spam;
of the Spam, I got all Spam.
The highest fpos (a Ham scored as Spam) was 0.238562, so I guess it's
safe to set that as my spam_cutoff for now. I didn't expect the cutoffs
to be so close to 0.00.
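
To make the cutoff reasoning concrete, here is a minimal Python sketch of how a score would map to Ham, Unsure, or Spam. It assumes the usual bogofilter rule (scores at or above spam_cutoff are Spam, scores at or below ham_cutoff are Ham, anything in between is Unsure). The classify function is a hypothetical helper, not part of bogofilter, and the cutoff values are the ones bogotune recommends in the output in this post.

```python
# Cutoffs copied from the bogotune recommendations in this post.
SPAM_CUTOFF = 0.069
HAM_CUTOFF = 0.020


def classify(score, spam_cutoff=SPAM_CUTOFF, ham_cutoff=HAM_CUTOFF):
    """Hypothetical helper. Assumed rule: a score >= spam_cutoff is Spam,
    a score <= ham_cutoff is Ham, and everything in between is Unsure."""
    if score >= spam_cutoff:
        return "Spam"
    if score <= ham_cutoff:
        return "Ham"
    return "Unsure"


# The highest-scoring Ham from the retest, under the recommended cutoff.
print(classify(0.238562))
# The same message with spam_cutoff raised just above its score.
print(classify(0.238562, spam_cutoff=0.24))
```

If that assumed rule holds, setting spam_cutoff to exactly 0.238562 would still tag that one message as Spam, so the cutoff would need to sit slightly above the score to clear it.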
tallison at janus:~> bogotune -n Maildir/.training.ham/cur/ -s Maildir/.training.spam/cur/
Calculating initial x value...
Initial x value is 0.600000
Too few high-scoring non-spams in this data set.
At target 1, cutoff is 0.238562.
False-positive target is 1 (cutoff 0.238562)
Performing final scoring:
Non-Spam...
Spam...
### The following recommendations are provisional.
### Run bogotune with more messages when possible.
Recommendations:
---cut---
db_cachesize=4
robx=0.600000
min_dev=0.020
robs=0.0100
spam_cutoff=0.069 # for 0.05% fpos (1); expect 0.00% fneg (0).
#spam_cutoff=0.040 # for 0.10% fpos (2); expect 0.00% fneg (0).
#spam_cutoff=0.020 # for 0.20% fpos (4); expect 0.00% fneg (0).
ham_cutoff=0.020
---cut---
note: fpos means 'false positive' and fneg means 'false negative'.
The small number and relative uniformity of the test messages imply
that the recommended values (above), though appropriate to the test set,
may not remain valid for long. Bogotune should be run again with more
messages when that becomes possible.
Tuning completed.
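
As a rough cross-check of the spam_cutoff lines in the recommendations, the false-positive percentages and counts can be turned back into an implied number of Ham messages. This is only a sketch: the percentages bogotune prints are rounded, so the result is approximate, and implied_ham_count is a hypothetical helper name.

```python
# (false-positive rate in percent, false-positive count), taken from the
# three spam_cutoff lines in the bogotune recommendations.
rows = [(0.05, 1), (0.10, 2), (0.20, 4)]


def implied_ham_count(percent, count):
    """Hypothetical helper: Ham messages implied by count = percent% of total."""
    return round(count / (percent / 100.0))


estimates = [implied_ham_count(p, c) for p, c in rows]
print(estimates)
```

All three lines point to a Ham set of roughly 2000 messages, which is the scale behind the "provisional" warning in the output.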