New version
Tom Anderson
tanderso at oac-design.com
Tue Mar 16 14:15:27 CET 2004
On Tue, 2004-03-16 at 07:46, Greg Louis wrote:
> robx = 0.610600 (6.11e-01)
> robs = 0.017800 (1.78e-02)
> min_dev = 0.020000 (2.00e-02)
> ham_cutoff = 0.281000 (2.81e-01)
> spam_cutoff = 0.532200 (5.32e-01)
>
> gives me 1.1% fn and I haven't had an fp in 8 weeks now (150,000-odd
> messages). Same basic setup: a somewhat spammy robx and minimal
> minimum deviation. Unknowns bias the scoring spamward, which is ok,
> because -- especially these days -- spams do contain more unknowns.
> Ok, at least, if you have enough registered nonspam to balance out that
Wow, that sounds incredibly dangerous. If I sent you an email about a
subject you've never received before, maybe about igpe atinle, or
silghlty rareraegnd lteetrs, or lysdexia, or some obscure science or
sport with a strange vernacular, then you would likely classify it as
spam. I'd much rather get it as unsure, and at least have a chance to
register it as spam once. Therefore, robx ought to be below spam_cutoff,
if not within min_dev of 0.5, where unknowns are ignored altogether.
Biasing unknowns strongly toward spam (above both the cutoff and the
min_dev band) is crazy IMHO.
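
To make the concern concrete, here is a rough sketch of how I understand
Robinson-Fisher scoring, run with the parameters quoted above. It is not
bogofilter's actual code: the function names are made up, the raw
spamicity assumes equal-sized spam and nonspam corpora, and returning a
neutral 0.5 when every token falls inside the min_dev band is my own
choice for illustration.

```python
import math

def token_score(spam_count, ham_count, robx, robs):
    """Robinson's smoothed estimate f(w) = (s*x + n*p) / (s + n)."""
    n = spam_count + ham_count
    if n == 0:
        return robx  # never-seen token: the estimate is exactly robx
    p = spam_count / n  # assumption: equal-sized spam and nonspam corpora
    return (robs * robx + n * p) / (robs + n)

def chi2q(x2, dof):
    """Upper tail of the chi-square distribution, valid for even dof."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def message_score(scores, min_dev):
    """Fisher-style combination of per-token scores (illustrative only)."""
    # Tokens closer to 0.5 than min_dev are ignored.
    kept = [f for f in scores if abs(f - 0.5) >= min_dev]
    if not kept:
        return 0.5  # assumption: neutral score when nothing survives
    n = len(kept)
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - f) for f in kept), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(f) for f in kept), 2 * n)
    return (s - h + 1.0) / 2.0

ROBX, ROBS, MIN_DEV, SPAM_CUTOFF = 0.6106, 0.0178, 0.02, 0.5322

# A message made of ten tokens the filter has never seen.
unknown = token_score(0, 0, ROBX, ROBS)
score = message_score([unknown] * 10, MIN_DEV)
print(score, score > SPAM_CUTOFF)
```

Under these assumptions a message consisting only of unknown tokens
lands above spam_cutoff, whereas a robx inside the min_dev band (say
0.5) would leave the same message with no scored tokens at all.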
Tom