more test results

David Relson relson at osagesoftware.com
Thu Feb 13 22:08:43 CET 2003


At 03:43 PM 2/13/03, Matt Armstrong wrote:

>David Relson <relson at osagesoftware.com> writes:
>
> > 02/13 10:37
> >               spam  reg   good  reg     s-s  s-h  s-u  h-s   h-h  h-u
> > def          1745   82   5044  123    1609   3   133   2   4918  124
> > asc          1745   83   5044  123    1608   3   134   2   4918  124
> > net          1745   81   5044  113    1604   5   136   2   4934  108
> > tag          1745   81   5044  125    1604   3   138   2   4918  124
> > asc-net      1745   83   5044  113    1602   4   139   2   4934  108
> > asc-tag      1745   82   5044  125    1603   3   139   2   4918  124
> > net-tag      1745   85   5044  114    1599   4   142   2   4928  114
> > net-tag-asc  1745   87   5044  114    1597   3   145   2   4928  114
>
>It is surprising to see these numbers and how little the various
>options matter.  I'd venture to say the differences are statistically
>insignificant.

"Little difference" is certainly an accurate description of my
results.  What isn't known is whether my results are typical or
atypical.  If you (or someone else) were to run a similar test, would the
results be roughly the same or very different?  If you have a reasonable
amount of saved data, I invite you to test the option(s) that are of most
interest to you.
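
For anyone comparing their own run against the 02/13 table, here is a
minimal sketch that turns the raw counts into rates.  It assumes the
columns read as "actual-scored" (s-h is spam scored as ham, h-u is good
mail scored as unsure, and so on), which matches the row sums of 1745 and
5044.  The names RESULTS, rates and summarize are made up for this example
and are not part of bogofilter.

```python
# Counts copied from the 02/13 table, in column order:
# (s-s, s-h, s-u, h-s, h-h, h-u)
# Assumption: "x-y" means mail that is actually x and was scored as y.
RESULTS = {
    "def":         (1609, 3, 133, 2, 4918, 124),
    "asc":         (1608, 3, 134, 2, 4918, 124),
    "net":         (1604, 5, 136, 2, 4934, 108),
    "tag":         (1604, 3, 138, 2, 4918, 124),
    "asc-net":     (1602, 4, 139, 2, 4934, 108),
    "asc-tag":     (1603, 3, 139, 2, 4918, 124),
    "net-tag":     (1599, 4, 142, 2, 4928, 114),
    "net-tag-asc": (1597, 3, 145, 2, 4928, 114),
}


def rates(ss, sh, su, hs, hh, hu):
    """Return unsure, false-negative and false-positive rates for one run."""
    spam_total = ss + sh + su
    ham_total = hs + hh + hu
    return {
        # share of all messages left for manual review
        "unsure": (su + hu) / (spam_total + ham_total),
        # spam that was scored as ham
        "false_neg": sh / spam_total,
        # good mail that was scored as spam
        "false_pos": hs / ham_total,
    }


def summarize(results=RESULTS):
    """Compute the rates for every option set in the table."""
    return {name: rates(*counts) for name, counts in results.items()}


if __name__ == "__main__":
    for name, r in summarize().items():
        print(f"{name:12s} unsure {r['unsure']:.2%}  "
              f"false-neg {r['false_neg']:.2%}  "
              f"false-pos {r['false_pos']:.2%}")
```

Swapping in counts from a different corpus only requires editing the
RESULTS dictionary.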

>The biggest point for me is the size of the "unsure" group -- about
>3.7% of all incoming mail must be examined by hand and classified.
>
>With the idea of pseudo-automating this, I've been thinking about
>combining bogofilter with a whitelist+auto-responder approach.  It'd
>work like this:
>
>1. if the sender is in the whitelist, let the mail through.
>2. if bogofilter calls it HAM, let the mail through.
>3. if bogofilter calls it SPAM, file it in a SPAM folder.
>4. if bogofilter is unsure, auto-respond to the sender asking them to
>    confirm their mail.  This'd work in a way similar to most mailing
>    list subscriptions, etc.
>5. if the mail is a response to a step 4 confirmation request, let the
>    original mail through and train bogofilter that it is not SPAM.
>
>The idea is to reduce how often you must examine your
>"unsure" mailbox, since senders who end up there can release
>their mail without any action on your part.

Seems like that could be done with a script run by procmail or postfix or 
whatever.  Others have done similar things with, I believe, success.  If 
you do come up with a nice solution, the bogofilter/contrib directory still 
has space for additional modules :-)
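
As a starting point, the five steps Matt lists could be sketched as a
small decision function.  This is only an illustration of the routing
order, not a tested mail filter: the Action names, the route function and
the "ham"/"spam"/"unsure" verdict strings are hypothetical, and how the
verdict is obtained from bogofilter, how confirmation replies are
recognized, and how the auto-response is sent are all left to the
surrounding procmail (or other) glue.

```python
from enum import Enum


class Action(Enum):
    DELIVER = "deliver"                      # steps 1 and 2
    SPAM_FOLDER = "spam-folder"              # step 3
    CHALLENGE = "challenge"                  # step 4: auto-respond to sender
    RELEASE_AND_TRAIN_HAM = "release-train"  # step 5


def route(sender, verdict, whitelist, is_confirmation=False):
    """Pick an action for one incoming message.

    sender          -- envelope or From address of the message
    verdict         -- "ham", "spam" or "unsure" (hypothetical labels)
    whitelist       -- set of lower-cased addresses that always get through
    is_confirmation -- True if this message answers a step 4 request
    """
    # Step 1: whitelisted senders bypass the filter entirely.
    if sender.lower() in whitelist:
        return Action.DELIVER
    # Step 5: a confirmation reply releases the held original and
    # tells the caller to train it as not-spam.
    if is_confirmation:
        return Action.RELEASE_AND_TRAIN_HAM
    # Steps 2-4: fall back to the classifier's verdict.
    if verdict == "ham":
        return Action.DELIVER
    if verdict == "spam":
        return Action.SPAM_FOLDER
    if verdict == "unsure":
        return Action.CHALLENGE
    raise ValueError(f"unknown verdict: {verdict!r}")
```

Checking for a confirmation reply before looking at the verdict is a
design choice in this sketch, so that a reply is never itself held as
unsure.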




