more test results

Matt Armstrong matt at lickey.com
Thu Feb 13 21:43:54 CET 2003


David Relson <relson at osagesoftware.com> writes:

> 02/13 10:37
>               spam  reg   good  reg     s-s  s-h  s-u  h-s   h-h  h-u
> def          1745   82   5044  123    1609   3   133   2   4918  124
> asc          1745   83   5044  123    1608   3   134   2   4918  124
> net          1745   81   5044  113    1604   5   136   2   4934  108
> tag          1745   81   5044  125    1604   3   138   2   4918  124
> asc-net      1745   83   5044  113    1602   4   139   2   4934  108
> asc-tag      1745   82   5044  125    1603   3   139   2   4918  124
> net-tag      1745   85   5044  114    1599   4   142   2   4928  114
> net-tag-asc  1745   87   5044  114    1597   3   145   2   4928  114

It is surprising to see from these numbers how little the various
options matter.  I'd venture to say the differences are statistically
insignificant.

The biggest point for me is the size of the "unsure" group (the s-u
and h-u columns) -- about 3.7% of all incoming mail must be examined
by hand and classified.
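
For reference, the unsure share can be recomputed from the quoted
table.  A minimal Python sketch; the counts are copied from the s-u
and h-u columns above, and the names UNSURE, TOTAL and unsure_rate are
only illustrative:

```python
# (s-u, h-u) counts per option set, copied from the quoted table.
UNSURE = {
    "def": (133, 124),
    "asc": (134, 124),
    "net": (136, 108),
    "tag": (138, 124),
    "asc-net": (139, 108),
    "asc-tag": (139, 124),
    "net-tag": (142, 114),
    "net-tag-asc": (145, 114),
}

# Every row uses the same corpus: 1745 spam + 5044 good messages.
TOTAL = 1745 + 5044


def unsure_rate(s_u, h_u, total=TOTAL):
    """Fraction of all messages that end up in the unsure group."""
    return (s_u + h_u) / total


rates = {name: unsure_rate(*counts) for name, counts in UNSURE.items()}

for name, rate in rates.items():
    print(f"{name:12s} {rate:.2%}")
```

Depending on the options, the share comes out between roughly 3.6%
and 3.9%.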

With the idea of pseudo-automating this, I've been thinking about
combining bogofilter with a whitelist+auto-responder approach.  It'd
work like this:

1. if the sender is in the whitelist, let the mail through.
2. if bogofilter calls it HAM, let the mail through.
3. if bogofilter calls it SPAM, file it in a SPAM folder.
4. if bogofilter is unsure, auto-respond to the sender asking them to
   confirm their mail.  This'd work much like the confirmation step
   for most mailing list subscriptions.
5. if the mail is a response to a step 4 confirmation request, let the
   original mail through and train bogofilter that it is not SPAM.
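
The steps above could be sketched roughly as follows.  This is only an
illustration in Python, not working mail-handling code: the Router
class, the Verdict and Action names, and the token scheme are all made
up here, and bogofilter's verdict is passed in rather than obtained by
running it.

```python
import secrets
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    HAM = "ham"
    SPAM = "spam"
    UNSURE = "unsure"


class Action(Enum):
    DELIVER = "deliver"
    FILE_AS_SPAM = "file_as_spam"
    HOLD_AND_CHALLENGE = "hold_and_challenge"
    RELEASE_AND_TRAIN_HAM = "release_and_train_ham"


@dataclass
class Router:
    whitelist: set = field(default_factory=set)
    # Confirmation token -> id of the message held for that token.
    pending: dict = field(default_factory=dict)

    def route(self, sender, verdict, message_id):
        """Steps 1-4: return (action, token); token is None unless held."""
        if sender in self.whitelist:       # step 1
            return Action.DELIVER, None
        if verdict is Verdict.HAM:         # step 2
            return Action.DELIVER, None
        if verdict is Verdict.SPAM:        # step 3
            return Action.FILE_AS_SPAM, None
        # Step 4: hold the mail; the caller sends the token to the sender.
        token = secrets.token_urlsafe(16)
        self.pending[token] = message_id
        return Action.HOLD_AND_CHALLENGE, token

    def confirm(self, token):
        """Step 5: a reply carrying a valid token releases the held mail."""
        message_id = self.pending.pop(token, None)
        if message_id is None:
            return None                    # unknown or already used token
        return Action.RELEASE_AND_TRAIN_HAM, message_id
```

Delivering, filing, sending the confirmation request and training
bogofilter are left to the caller, which acts on the returned Action.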

The idea is to reduce how often you must examine your "unsure"
mailbox, since folks who end up there have the option of releasing
their mail without any action from you.


-- 
matt



