info about spam messages

Fri Jun 11 18:11:17 CEST 2004

From: "David Relson" <relson at osagesoftware.com>
> I trust only the ip address in the first Received: stanza.  My thought
> is to have bogofilter cache that value so it can be included (using
> '%I') in the X-Bogosity line (or the logging message).
>
> The From: address is easily forged and less reliable.  However while
> implementing the '%I' ability, adding '%F' for the From: address is
> easy.

Well, that depends on whether you can correctly identify the IP address in
these lines:


Hint:
helo = 1.2.3.4
rdns = 8.7.6.5.s.com
ip = 5.6.7.8
luser = 4.3.2.1

The IP (5.6.7.8) is the only completely reliable info in any of those
lines... the other stuff can possibly or definitely be set by the spammer.
If bogofilter puts any other string of numbers in the log, then you may
cause some pretty major adverse effects.  Imagine the spammer uses your own
IP as one of the other strings, for instance.
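
To make the hint concrete, here is a minimal Python sketch that pulls
only the bracketed connecting IP out of a Received line.  It assumes the
common "from HELO (RDNS [IP]) by HOST" layout, which not every MTA uses.
The sample line is hypothetical, assembled from the hint values above,
and connecting_ip is an illustrative name, not part of bogofilter.

```python
import ipaddress
import re

# Assumed layout: "from HELO (RDNS [IP]) by HOST ...".
# Only the bracketed address is captured; HELO and RDNS are ignored
# because the sender can influence both.
PEER_RE = re.compile(r"\(\s*(?:[^\s()\[\]]+\s+)?\[([0-9A-Fa-f:.]+)\]\s*\)")


def connecting_ip(received_line):
    """Return the bracketed peer IP as a string, or None if absent/invalid."""
    match = PEER_RE.search(received_line)
    if match is None:
        return None
    try:
        return str(ipaddress.ip_address(match.group(1)))
    except ValueError:
        # Something that looked like an address but is not one.
        return None


# Hypothetical line built from the hint values (helo, rdns, ip, luser).
sample = ("from 1.2.3.4 (8.7.6.5.s.com [5.6.7.8]) "
          "by mx.example.org with SMTP for <4.3.2.1@example.org>")
print(connecting_ip(sample))  # prints 5.6.7.8
```

Even then, the value is only as trustworthy as the machine that wrote
the line, so this is meant for the Received line added by your own MTA.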

Also, the spam may be a virus which was actually sent from your best
friend's computer, thus you would log their IP.  In a company, if everyone
is running Bogofilter and using IPs to block at the MTA level, it may
prevent anyone from accepting anybody else's mail once a virus hits.  While
I'm not necessarily against quarantining a host machine this way, it could
possibly be very detrimental to a company.  Also, the quarantine would only
happen on a case by case basis after everyone else was already sent the
virus, thus defeating any benefit if they actually opened it.
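
If someone does feed logged IPs to an MTA block list anyway, a guard
along these lines would at least refuse to list local or in-house
addresses.  This is only a sketch: safe_to_block and the own_networks
list are hypothetical names, and the networks shown are placeholders.

```python
import ipaddress


def safe_to_block(ip_text, own_networks):
    """Return True only for a valid, public IP outside our own networks."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return False  # not an IP address at all
    if not ip.is_global:
        return False  # loopback, RFC 1918, link-local, and so on
    # Never block an address that belongs to us or to our colleagues.
    return not any(ip in ipaddress.ip_network(net) for net in own_networks)


# Placeholder: pretend 5.6.7.0/24 is the company's public range.
OWN = ["5.6.7.0/24"]
print(safe_to_block("5.6.7.8", OWN))    # prints False
print(safe_to_block("8.7.6.5", OWN))    # prints True
```

It does nothing about the timing problem described above; it only keeps
a forged or infected in-house address out of the block list.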

Finally, the first received line may just be a proxy, or even a machine in
your own mail setup.  To get to the culprit, you need to follow the chain of
received lines to the first public IP address you see for the external
proxy, and the last public IP address you see for the originator ISP.  Going
all the way to the very last received line won't help if the spammer had
some local machines in the chain.  Moreover, at any point in the chain of
received lines, the spammer may have inserted a chain of bogus ones.  For
example, someone may have a spam that looks like this:


Can you tell which IP is the originator of the spam?  Which lines are
forged?  Clearly, using 127.0.0.1 is not going to be useful.  Maybe you can
trust the next line then since it's your ISP.  Ok, does that mean that
mail1.your.isp.com is the spammer?  No, of course not, it's in the same
subnet.  What about some.server.com?  Well, it's suspicious, and may be an
open proxy, or maybe an innocent messenger, or it could be the spammer.
From that point on, it's hard to tell which lines are real or forged.  You
could assume that if the chain is unbroken (from and by match up), then the
line is not forged, but a clever spammer could make this work anyway.
Assume that some.server.com above is an open proxy, and another.server.com
is the actual spammer.  The spammer can make himself look like just another
proxy in an unbroken chain which eventually leads back to innocent
addresses, nonsense addresses, even your own mail server.
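
The chain walk described above can be sketched in Python as follows.
It is a rough illustration, not spamitarium's actual code: the
"from HELO (RDNS [IP]) by HOST" layout is an assumption, all function
names are made up for the example, and the "unbroken" test is exactly
the naive from/by match that a clever spammer can satisfy.

```python
import ipaddress
import re

# Assumed layout: "from HELO (RDNS [IP]) by HOST ..."; RDNS is optional.
HOP_RE = re.compile(
    r"from\s+(?P<helo>\S+)\s+"
    r"\((?:(?P<rdns>[^\s\[\]()]+)\s+)?\[(?P<ip>[0-9.]+)\]\)"
    r"\s+by\s+(?P<by>\S+)"
)


def parse_hop(line):
    """Return a dict with helo, rdns, ip and by, or None if unparseable."""
    match = HOP_RE.search(line)
    return match.groupdict() if match else None


def is_public(ip_text):
    """True for a valid, globally routable address."""
    try:
        return ipaddress.ip_address(ip_text).is_global
    except ValueError:
        return False


def first_public_hop(received_lines):
    """Walk from the newest line down; return the first hop with a public IP."""
    for line in received_lines:
        hop = parse_hop(line)
        if hop is None:
            return None  # stop rather than guess past a line we cannot read
        if is_public(hop["ip"]):
            return hop
    return None


def unbroken_prefix(received_lines):
    """Return the newest hops whose 'from' name matches the next line's 'by'."""
    hops = []
    for line in received_lines:
        hop = parse_hop(line)
        if hop is None:
            break
        if hops:
            newer = hops[-1]
            claimed = (newer["rdns"] or newer["helo"]).lower()
            if claimed != hop["by"].lower():
                break  # chain is broken here; everything older is suspect
        hops.append(hop)
    return hops
```

Here received_lines is expected newest first, which is the order in which
the headers appear in the message.  An unbroken prefix only shows that
the names are consistent, not that any of the lines are genuine.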

In "spamitarium", I assume that an unbroken chain contains valid
addresses.  Therefore, it is possible that some forged addresses
become spammy.  Overall, this is no big deal since the point is for
bogofilter to consider this as only a small part of the whole message.  And
if any of those forged addresses are actually hammy to you, then your hams
will cause them to be neutral anyway.  But if you're going to base an MTA
rejection on an IP in one of your spam's received lines, then you may cause
yourself lots of grief when you run into a clever enough spammer.

As I said before, I don't think it's a very good idea to filter at the MTA.
Use extreme caution when doing so.

Tom