spam addrs

Mon Jun 28 11:05:08 CEST 2004

On Mon, 14 Jun 2004, David Relson wrote:

> The second version is a bit more complex.  Save the last IP address of
> the first Received: statement containing an IP address.  That will give
> the correct answer for:

This should be restricted to the value following "from" and subsequent
comments. It is possible (although quite unlikely) to encounter an IP
address in other fields of a Received header, esp. "by" and "for" (see RFC
2821, section 4.4 for details). E.g.
   Received: from 1.2.3.4 by 5.6.7.8 for xyz@[9.10.11.12]; ....

The bad news is that the tokenizer does not care about parentheses, so the
following two lines, which have completely different meanings,
   Received: from word1 (by word2) by word3....
   Received: from word1 by word2 (by word3)....
are indistinguishable after tokenization.
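
A minimal sketch of the problem, assuming a simplified tokenizer that treats
whitespace and parentheses as plain delimiters (naive_tokens is a
hypothetical stand-in, not the real lexer):

```python
import re

def naive_tokens(line):
    # Hypothetical tokenizer: whitespace and parentheses only separate
    # tokens, so comment boundaries are lost.
    return re.findall(r"[^\s()]+", line)

line_a = "Received: from word1 (by word2) by word3"
line_b = "Received: from word1 by word2 (by word3)"

# Both headers yield the same token sequence although they differ.
same_tokens = naive_tokens(line_a) == naive_tokens(line_b)
```

Under that assumption same_tokens is True, so nothing downstream can tell
which "by" was inside a comment.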

Personally, I'd restrict it further to IP addresses
1. enclosed by brackets i.e. "[1.2.3.4]" (Sendmail/Postfix style), or
2. enclosed by parentheses and optionally prefixed by "user@"
   i.e. "(1.2.3.4)" or "(user@1.2.3.4)" (qmail style), or
3. (only if nothing matching (1) or (2) is found) immediately following
   "from", i.e. "from 1.2.3.4"

But again, this cannot be done after tokenization.
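
The three rules above could be applied to the raw header text, before
tokenization, roughly as follows. This is only a sketch: from_clause and
relay_ip are hypothetical names, rule (1) is tried before rule (2) as an
arbitrary choice, only IPv4 is handled, and octets are not range-checked.

```python
import re

IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"          # no 0-255 range check
BRACKETED = re.compile(r"\[(" + IPV4 + r")\]")                    # rule 1
PARENTHESIZED = re.compile(r"\((?:[^\s()@]+@)?(" + IPV4 + r")\)")  # rule 2
BARE_FROM = re.compile(r"^from\s+(" + IPV4 + r")(?=\s|$)", re.I)   # rule 3

# Keywords that start the clauses following "from" in a Received header.
CLAUSE_KEYWORDS = {"by", "via", "with", "id", "for"}

def from_clause(header):
    """Return the "from" clause including the comments that follow it."""
    body = header.strip()
    if body.lower().startswith("received:"):
        body = body.split(":", 1)[1].strip()
    if not body.lower().startswith("from"):
        return ""
    depth = 0      # current nesting level of parenthesized comments
    words = []
    for word in body.split():
        # A clause keyword outside any comment ends the "from" clause.
        if depth == 0 and words and word.lower() in CLAUSE_KEYWORDS:
            break
        depth = max(0, depth + word.count("(") - word.count(")"))
        # A semicolon outside any comment starts the date part.
        if depth == 0 and word.endswith(";"):
            words.append(word[:-1])
            break
        words.append(word)
    return " ".join(words)

def relay_ip(header):
    """Apply rules 1, 2, 3 in order to the "from" clause only."""
    clause = from_clause(header)
    for pattern in (BRACKETED, PARENTHESIZED, BARE_FROM):
        match = pattern.search(clause)
        if match:
            return match.group(1)
    return None
```

With this sketch, relay_ip("Received: from 1.2.3.4 by 5.6.7.8 for
xyz@[9.10.11.12]; ....") picks 1.2.3.4 and ignores the addresses in the "by"
and "for" fields.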

> The third version excludes "but not 127.0.0.1".

It would be cool if the list of "trusted relays" was configurable.
The program would (try to) skip any Received headers indicating the mail 
arrived from one of the listed trusted relays. This would solve the
problem of mail received indirectly.

But there's a catch:
   Received: from trusted.relay ([1.2.3.4])...
   Received: from localhost ([127.0.0.1]) ...
would return the bogus 127.0.0.1 rather than trusted.relay's IP address.
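
To make the catch concrete, here is a sketch under simplifying assumptions:
only bracketed addresses are matched, headers are given newest first, and
both function names are hypothetical. The second function shows one possible
workaround (fall back to the last trusted relay when the next address is
loopback); it is an illustration, not something the program does today.

```python
import re

BRACKETED_IP = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

def first_untrusted_ip(received_headers, trusted):
    """Naive version: skip trusted relays, return the next address."""
    for header in received_headers:
        match = BRACKETED_IP.search(header)
        if match is None:
            continue
        if match.group(1) not in trusted:
            return match.group(1)
    return None

def sender_ip(received_headers, trusted):
    """Same, but never report loopback once a trusted relay was seen."""
    last_trusted = None
    for header in received_headers:
        match = BRACKETED_IP.search(header)
        if match is None:
            continue
        ip = match.group(1)
        if ip in trusted:
            last_trusted = ip
            continue
        # Assumption: a loopback hop behind a trusted relay means the
        # mail originated on that relay, so report the relay itself.
        if ip.startswith("127.") and last_trusted is not None:
            return last_trusted
        return ip
    return last_trusted

headers = [
    "Received: from trusted.relay ([1.2.3.4]) ...",
    "Received: from localhost ([127.0.0.1]) ...",
]
naive = first_untrusted_ip(headers, {"1.2.3.4"})
fixed = sender_ip(headers, {"1.2.3.4"})
```

Here naive is the bogus 127.0.0.1, while fixed is 1.2.3.4.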

--Pavel Kankovsky aka Peak  [ Boycott Microsoft--http://www.vcnet.com/bms ]
"Resistance is futile. Open your source code and prepare for assimilation."