Templates [was: Prediction ...]

Tom Allison tallison at tacocat.net
Fri Jul 2 08:11:01 EDT 2004


David Relson wrote:
> On Thu, 01 Jul 2004 09:32:27 +0100
> Peter Bishop wrote:
> 
> 
>>On 29 Jun 2004 at 18:06, Tom Anderson wrote:
>>
>>
>>>>I've looked at spamitarium's regexes and confess that, to my
>>>>inexperienced eye, they're complex.  Give me a simple rule for
>>>>distinguishing them and I can try to implement it.

<lots of blah blah chopped because it's so very long>

I'm really not sure that there's much value in this.

What I've seen is that >>99% of my spam is from IP addresses that only 
send one message over a 4-month period.  So there is little net improvement 
in detecting spam, since almost every IP address will be dominated on the 
basis of the robx/robs settings.
And all this ASN/header_stripping that spamitarium managed to do didn't 
have much net effect on 30,000 emails that I studied.  I posted all of 
these results on the mailing list months ago with little response.
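
To make the robx/robs point concrete, here is a minimal sketch, assuming 
Gary Robinson's smoothing formula f(w) = (robs*robx + n*p(w)) / (robs + n). 
The function name and the parameter values are illustrative assumptions, 
not taken from any particular bogofilter configuration.  An IP address 
that has never been seen before has n = 0, so its score is just robx and 
it tells the classifier nothing.

```python
def robinson_fw(p_w, n, robx=0.52, robs=0.0178):
    """Smoothed token score: (robs*robx + n*p_w) / (robs + n).

    p_w  -- raw spam probability of the token from the wordlist
    n    -- number of messages the token has been seen in
    robx -- score assumed for a token with no history (illustrative value)
    robs -- weight given to robx relative to real counts (illustrative value)
    """
    return (robs * robx + n * p_w) / (robs + n)


# A sender IP that is not in the wordlist yet: n = 0, so the score is robx.
unseen_ip = robinson_fw(p_w=0.0, n=0)

# A sender IP seen in 20 spams and no ham: the score moves close to 1.0.
repeat_ip = robinson_fw(p_w=1.0, n=20)

print(unseen_ip, repeat_ip)
```

Under these assumptions, a spammer who uses each IP address only once 
always presents a token with n = 0, so the IP token always scores robx.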

But this is only my experience.

Where IP addresses might gain some value is the fact that 99% of my HAM 
comes from consistent IP addresses.  But then again, those IP addresses 
are also related to consistent domains, users, and other tokens in the 
Header.  While IP addresses might help identify HAM, so do all the 
other legitimate headers that are contained therein.  So there is not 
much effect on detecting Ham.

Personally, I'm a little leery of all the new features that are being 
pushed into bogofilter.  We still don't have a good understanding of ESF, 
and at least one person has already measured a variety of options 
related to headers, ASN, and IP addresses.  Again, speaking only 
from my own experience, I think that all of these options combined 
picked up maybe 3 spams at the cost of 3 ham out of ~30,000 emails.

I'm tending to lean on the "ain't broke, don't fix it" mentality here. 
Maybe that's a bad thing for Open Source projects, but bogofilter is 
already wildly successful in its ability to capture spam.  The only 
deficiencies that I see with respect to other spam filters are 1) 
daemon capability, in order to run more easily "site wide", and 2) a 
strong PR campaign to bring up awareness.

The basics of bogofilter seem pretty rock solid and highly effective. 
If this could be daemonized to work in conjunction with something like SMTP 
delivery ports, similar to postfix + amavisd-new, then you would have a 
totally awesome product.  I would love to be able to set up a default 
configuration for postfix + amavisd-new + bogofilter_d + clamscand but 
bogofilter's hell-bent on staying on the command line.  I know this was 
tried once, but with uncertain results.


More information about the Bogofilter mailing list