Templates [was: Prediction ...]
tallison at tacocat.net
Fri Jul 2 08:11:01 EDT 2004
David Relson wrote:
> On Thu, 01 Jul 2004 09:32:27 +0100
> Peter Bishop wrote:
>>On 29 Jun 2004 at 18:06, Tom Anderson wrote:
>>>>I've looked at spamitarium's regexes and confess that, to my
>>>>inexperienced eye, they're complex. Give me a simple rule for
>>>>distinguishing them and I can try to implement it.
<lots of blah blah chopped because it's so very long>
I'm really not sure that there's much value in this.
What I've seen is that well over 99% of my spam comes from IP addresses that
send only one message in four months. Since almost every spam IP address is
effectively new, its token score will be dominated by the robx/robs
settings, so there is little net improvement in detecting spam.
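To illustrate the robx/robs point, here is a rough Python sketch of Robinson-style smoothing for a single token. It is a simplification and not bogofilter's actual code: the function name and the robx/robs values are assumptions chosen for illustration (not necessarily the defaults), and the raw spam fraction stands in for the per-token probability.

```python
def smoothed_score(spam_count, ham_count, robx, robs):
    """Robinson-style smoothed spamicity for one token (illustrative only)."""
    n = spam_count + ham_count
    if n == 0:
        # Token never seen before: the score is exactly robx.
        return robx
    p = spam_count / n  # simplified raw spam fraction for this token
    # robs sets how strongly the prior robx pulls on low-count tokens.
    return (robs * robx + n * p) / (robs + n)


ROBX, ROBS = 0.5, 1.0  # assumed values, for illustration only

new_ip = smoothed_score(0, 0, ROBX, ROBS)        # IP not in the wordlist
seen_once = smoothed_score(1, 0, ROBX, ROBS)     # IP seen in one spam
seen_often = smoothed_score(20, 0, ROBX, ROBS)   # IP seen in twenty spams

print(new_ip, seen_once, round(seen_often, 3))
```

An IP address that sends one spam and never reappears is in the first case every time a new message arrives, so its token adds nothing beyond robx. How far a single sighting moves the score depends on robs: a smaller robs lets one sighting count for more.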
And all this ASN/header_stripping that spamitarium managed to do didn't
have much net effect on 30,000 emails that I studied. I posted all of
these results on the mailing list months ago with little response.
But this is only my experience.
Where IP addresses might gain some value is the fact that 99% of my ham
comes from consistent IP addresses. But then again, those IP addresses
are also related to consistent domains, users, and other tokens in the
header. While IP addresses might help identify ham, so do all the
other legitimate headers contained therein. So there is not
much effect on detecting ham either.
Personally, I'm a little leery of all the new features that are being
pushed into bogofilter. We still don't have a good understanding of ESF
and have already had at least one person measure a variety of
options related to headers, ASN, and IP addresses. Again, speaking only
from my own experience, I think that all of these options combined
picked up maybe 3 spams at the cost of 3 ham out of ~30,000 emails.
I'm tending to lean on the "ain't broke, don't fix it" mentality here.
Maybe that's a bad thing for Open Source projects, but bogofilter is
already wildly successful in its ability to capture spam. The only
deficiencies that I see with respect to other spam filters are 1)
daemon capability, in order to run more easily "site wide", and 2) a strong
PR campaign to raise awareness.
The basics of bogofilter seem pretty rock solid and highly effective.
If this could be daemonized to work in conjunction with something like SMTP
delivery ports, similar to postfix + amavisd-new, then you would have a
totally awesome product. I would love to be able to set up a default
configuration for postfix + amavisd-new + bogofilter_d + clamscand but
bogofilter's hell-bent on staying on the command line. I know this was
tried once, but with uncertain results.