Templates [was: Prediction ...]
Tom Allison
tallison at tacocat.net
Sat Jul 3 02:01:53 CEST 2004
Tom Anderson wrote:
> From: "Tom Allison" <tallison at tacocat.net>
>
>>What I've seen is that >>99% of my spam is from ip addresses that only
>>send one message over 4 months time. So there is little net improvement
>>on detecting spam since almost every IP address will be dominated on the
>>bases of robx/robs settings. So there is not much effect on detecting
>>spam.
>
>
> This discussion is not about detecting IPs for filtering, but for outputting
> IPs in the logs so that people can use them for a blacklist. Bogofilter
> already uses IPs for filtering. No big deal there, since it's just another
> token taken together with the rest of them in the message. But if you're
> going to single out a token and say, "this is definitely the IP of the
> connecting mail server, use it in your blacklist," then it's much more
> important to be absolutely certain that it really is. The fact that there
> is uncertainty and the difficulty in obtaining any measure of certainty is
> the heart of this discussion. I've suggested not adding this functionality
> because of this.
>
That's what I get for barging into a conversation. :)
As for the IPs for filtering:
This sounds like something that could well be managed by a brief Perl script
that parses the Received: headers of identified spam and loads the addresses
into an access list suitable for your MTA.
The main problem is identifying a proper regex for parsing
out the IP address correctly. I do think Perl could do this really well
in one line.
For example:
gizmo11ps.bigpond.com (gizmo11ps.bigpond.com [144.140.71.21])
by cling.tacocat.net (Postfix) with SMTP id 5F3C54C081
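Should work out to:
/(\d+\.\d+\.\d+\.\d+).+?by \Q$fqdn_localhost\E/so
This should set $1 to the IP address every time. The /s modifier lets "."
match across the line break of a folded header, and \Q...\E keeps the dots
in the host name literal.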
This takes the dotted-quad address closest to the left of the string
"by cling.tacocat.net", which is going to be the identifying string of
the server connecting to your localhost machine (cling.tacocat.net for
me, of Cling and Clang fame).
Even with amavisd this will work, as the subsequent Received: lines say
'localhost' and not the FQDN (I think). That, or exclude 127.0.0.1.
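
The rule above can be sketched as follows. This is a rough Python rendering of the same pattern rather than the Perl one-liner itself, so treat it as an illustration under stated assumptions: FQDN_LOCALHOST and connecting_ip are names made up for this sketch, and the loopback check is one possible reading of the "NOT 127.0.0.1" idea.

```python
import re

# Hypothetical local host name; substitute your own MTA's FQDN.
FQDN_LOCALHOST = "cling.tacocat.net"

# Same shape as the Perl pattern: a dotted quad, then the shortest run of
# characters up to "by <local host>". re.DOTALL lets "." cross the line
# break of a folded header; re.escape keeps the dots in the name literal.
RECEIVED_IP = re.compile(
    r"(\d+\.\d+\.\d+\.\d+).+?by\s+" + re.escape(FQDN_LOCALHOST),
    re.DOTALL,
)


def connecting_ip(received_header):
    """Return the IP found in one Received: header value, or None."""
    match = RECEIVED_IP.search(received_header)
    if match is None:
        return None
    ip = match.group(1)
    # Skip loopback hops, e.g. a local content filter re-injecting mail.
    if ip.startswith("127."):
        return None
    return ip


# The header text from the example above, folded over two lines.
header = (
    "gizmo11ps.bigpond.com (gizmo11ps.bigpond.com [144.140.71.21])\n"
    "\tby cling.tacocat.net (Postfix) with SMTP id 5F3C54C081"
)
print(connecting_ip(header))  # 144.140.71.21
```

The function is meant to be called on one Received: header at a time. If several headers are joined into a single string, the search returns the first address in that string that is eventually followed by "by cling.tacocat.net", which may not be the hop you want.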
I don't know much about C, but this is effectively the rule to put into
place. I would expect that if there are any cases where the IP address
is not provided as a numerical dotted quad, then you're kind of S.O.L.
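
Putting the rule together with the script idea from earlier, a minimal sketch of the whole job might look like this. It is written in Python rather than Perl, and it assumes a Postfix-style access table with one "address action" pair per line; FQDN_LOCALHOST, access_entries, and the sample message are all hypothetical names and data for illustration.

```python
import email
import re

# Hypothetical local host name; substitute your own MTA's FQDN.
FQDN_LOCALHOST = "cling.tacocat.net"

# Dotted quad followed, as closely as possible, by "by <local host>".
HANDOFF = re.compile(
    r"(\d+\.\d+\.\d+\.\d+).+?by\s+" + re.escape(FQDN_LOCALHOST),
    re.DOTALL,
)


def access_entries(spam_messages, action="REJECT"):
    """Build sorted, de-duplicated access-list lines from raw spam messages."""
    ips = set()
    for raw in spam_messages:
        msg = email.message_from_string(raw)
        # Check each Received: header separately so only the hop that
        # handed the message to the local server can match.
        for received in msg.get_all("Received", []):
            match = HANDOFF.search(received)
            if match and not match.group(1).startswith("127."):
                ips.add(match.group(1))
    return [f"{ip} {action}" for ip in sorted(ips)]


sample = (
    "Received: from gizmo11ps.bigpond.com (gizmo11ps.bigpond.com [144.140.71.21])\n"
    "\tby cling.tacocat.net (Postfix) with SMTP id 5F3C54C081\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
for line in access_entries([sample]):
    print(line)  # 144.140.71.21 REJECT
```

Messages whose Received: headers carry no dotted quad simply contribute nothing, which matches the caveat above.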
Postfix already provides 99% of this in its new version under the
topics of greylisting and "Access Policy Delegation". It's extremely
effective at what it can do.
From a functional view, I think bogofilter should stick to the
"Mission" of sorting the mail that is delivered to your mailbox into
spam and ham. My relatively incomplete implementation of
Postfix UCE capabilities has introduced a new problem into spam
filtering for me: I can't get enough spam to do anything meaningful with
respect to bogofilter.
I haven't exceeded 37 spam in 24 hours since I installed it.
I'm rejecting >550 (probable) spam at the MTA level.
I believe that this is where the spam blocking is most effective since
it's not adding any overhead to my machine to turn the email away at the
wire interface.