spam addrs

Tom Anderson tanderso at oac-design.com
Tue Jun 29 16:49:23 CEST 2004


From: "David Relson" <relson at osagesoftware.com>
> My mistake.  MTA is correct.  The MTA should always have the connecting
> address (1.2.3.4 or whatever).  A DNS problem means there's no name,
> rather than no address.  Postfix handles that with:
>
> Received: from blaster3.omessage.com (unknown [204.180.130.223])
>
> > Received: from spammer.com by 192.168.1.1 for you at localhost

Ok, perhaps you're right, but I have seen Received: lines of the form above.
Maybe this is just from people who turn off the option in their MTA to
output the IP.  I guess if they screw it up that way, it's their own fault.
I doubt any recent MTA versions suppress the IP by default.

> > Received: from [1.2.3.4] (helo=5.6.7.8) with smtp (Exim 4.12)
> > vs
> > Received: from 5.6.7.8 [1.2.3.4] with smtp (8.9.10)
>
> Looking at these examples makes me wonder about having a "receive" mode
> in the lexer grammar that, in addition to returning the usual tokens,
> would also deal with square brackets, perhaps look for
> "[digits.digits.digits.digits]" or something similar.  Do you think that
> would resolve the various issues?

That would certainly help with 99% of the cases.  However, there are still
some programs, such as SquirrelMail, which write headers like this:

Received: from 1.2.3.4 (proxying for 5.6.7.8) (user 4.3.2.1) by 9.8.7.6
(7.6.5.4) with SMTP id blah for user+3.4.5.6 at 8.7.6.5.abc.com; date

No square brackets.  In fact, my spamitarium program removes any square
brackets too.
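
To make the difference concrete, here is a minimal Python sketch.  It is
not code from bogofilter's lexer or from spamitarium; the two patterns and
the function names are made up for illustration.  It compares a
"[digits.digits.digits.digits]" search with a bare dotted-quad search on
the Exim line quoted earlier and on the Squirrelmail-style line above:

```python
import re

# Hypothetical patterns, for illustration only.
BRACKETED_IP = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")
# Bare dotted quad that is not part of a longer dotted name or number.
BARE_IP = re.compile(r"(?<![\d.])(\d{1,3}(?:\.\d{1,3}){3})(?![\d.])")


def bracketed_ips(header):
    """Return every address written as [a.b.c.d]."""
    return BRACKETED_IP.findall(header)


def bare_ips(header):
    """Return every dotted quad, with or without brackets."""
    return BARE_IP.findall(header)


exim = "Received: from [1.2.3.4] (helo=5.6.7.8) with smtp (Exim 4.12)"
squirrelmail = (
    "Received: from 1.2.3.4 (proxying for 5.6.7.8) (user 4.3.2.1) by 9.8.7.6 "
    "(7.6.5.4) with SMTP id blah for user+3.4.5.6 at 8.7.6.5.abc.com; date"
)

print(bracketed_ips(exim))          # ['1.2.3.4']
print(bracketed_ips(squirrelmail))  # []
print(bare_ips(squirrelmail))       # six candidates, no hint which one matters
```

The bracket search finds nothing in the second header, and dropping the
bracket requirement returns six addresses with nothing to say which one is
the connecting host.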

Also, is it possible to use square brackets in your HELO string?  E.g.:

Received: from [5.6.7.8] [1.2.3.4] with smtp (8.9.10) ...
or
Received: from [5.6.7.8] (spammer.com [1.2.3.4]) ...

Yep, it is possible, at least with sendmail... I just tried it.
Then how would you determine the correct one?  The parsing has to be more
intelligent than just looking at the address... it has to consider the
entire string and take into account the variations in MTA output.
My regexes in spamitarium were able to handle the square-bracketed HELO:

Received: from helo-[1.2.3.4] 65.126.137.220 as209
   by oac-design.com 216.109.145.120
   for <tanderso.public+helotest at oac-design.com>;
   Tue, 29 Jun 2004 10:36:29 -0400

And bogofilter actually scored it very confidently as ham, even with no
body:
X-Bogosity: No, tests=bogofilter, spamicity=0.000938, version=0.17.5

But if bogofilter were looking for square brackets in the above string, it
would get it wrong: it would pick up the 1.2.3.4 from the HELO string
rather than the connecting address.
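
A small Python sketch of that failure, under stated assumptions: the
"positional" pattern assumes the cleaned-up layout is
"from <helo-token> <connecting-ip> ...", which is inferred only from the
one example above.  It is not the actual spamitarium regex, and all names
here are hypothetical.

```python
import re

IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

# Naive approach: trust the first [a.b.c.d] found anywhere in the header.
FIRST_BRACKETED = re.compile(r"\[(" + IPV4 + r")\]")

# Assumed layout (inferred from the single example above):
#   Received: from <helo-token> <connecting-ip> ...
POSITIONAL = re.compile(r"^Received: from (\S+) (" + IPV4 + r")(?=\s|$)")


def naive_ip(header):
    """Return the first square-bracketed address, or None."""
    m = FIRST_BRACKETED.search(header)
    return m.group(1) if m else None


def positional_ip(header):
    """Return the address in the slot after the HELO token, or None."""
    m = POSITIONAL.match(header)
    return m.group(2) if m else None


helo_test = (
    "Received: from helo-[1.2.3.4] 65.126.137.220 as209 "
    "by oac-design.com 216.109.145.120"
)

print(naive_ip(helo_test))       # 1.2.3.4 (taken from the forged HELO)
print(positional_ip(helo_test))  # 65.126.137.220
```

The positional version only works because the layout is known in advance;
raw headers from different MTAs would each need their own pattern, which
is the point about parsing the entire string.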

> There will always be people who know more and people who know less.  I,
> for one, am still learning.  I expect that displaying the message
> address will always be optional, which will require some level of
> hacking on the user's part (if only to read bogofilter.cf and discover
> that '%I' exists).

The spammers are still learning too.  Likely, if there is a hole, they will
find it.  If I were a spammer and I noticed that a relatively popular spam
filter was identifying my IP, I'd try everything to prevent it, such as
using square-bracketed IPs in the HELO string as above.  The filter (or
its writers) should attempt to predict and prevent any such circumvention
methods.

Tom



