message tags

David Relson relson at osagesoftware.com
Tue Aug 31 22:19:15 CEST 2004


Matthias,

This month I've received 18,000+ spam messages.  Out of curiosity I ran
a modified bogolexer and printed the 3 attributes that bogofilter
identifies - msg_addr, msg_id, and queue_id.  I then sorted and counted
them.  Here's a table of message counts vs number of copies received:

  messages copies
  15841 1
    537 2
    304 3
    256 4
     18 5
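
A minimal sketch of how a table like this could be computed, assuming
each message has been reduced to one key (for instance a tuple of
msg_addr, msg_id, and queue_id).  The function name and the sample keys
are hypothetical, and this is not the modified bogolexer itself:

```python
from collections import Counter


def repeat_histogram(keys):
    """Map 'copies received' -> number of distinct messages seen that often."""
    copies_per_message = Counter(keys)           # key -> how many copies arrived
    return Counter(copies_per_message.values())  # copies -> how many such keys


# Made-up keys for illustration: "a" seen once, "b" twice, "c" three times.
sample_keys = ["a", "b", "b", "c", "c", "c"]

for copies, messages in sorted(repeat_histogram(sample_keys).items()):
    # Same column order as the table above: message count, then copies.
    print(f"{messages:7d} {copies}")
```

The two-stage count (first per key, then per count value) plays the
same role as sorting and counting twice.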

For example, 15,841 messages were unique (one copy each), and there were
18 messages for which 5 copies were received.  Checking further, each of
the 18 message sets contains 5 addresses for my domain, e.g.:

  To: linda at example.com
  Cc: relson at example.com, eric at example.com,
	mark at example.com, webmaster at example.com

The actual messages differ only in the "X-Original-To:" and
"Delivered-To:" attributes:

--- /home/relson/Mail/2004-08-Spam/26425  2004-08-23 07:52:50.000000000 -0400
+++ /home/relson/Mail/2004-08-Spam/26426  2004-08-23 07:52:50.000000000 -0400
@@ -1,6 +1,6 @@
 Return-Path: <wrzjgv.gstinxa at smithvilledsl.net>
-X-Original-To: mark at example.com
-Delivered-To: mark at examplesoftware.com
+X-Original-To: relson at example.com
+Delivered-To: relson at example.com
 Received: from x.y.z.w (unknown [203.229.101.148])
 	by mail.example.com (Postfix) with SMTP id 312712FF2B;
 	Mon, 23 Aug 2004 01:36:35 -0400 (EDT)

In some of the sets of 5, the X-Bogosity line also varies, since
different recipients have different spamicity scores.

As you said, if bogofilter is going to have a unique message ID, it
needs to be something we generate.  It can't be msg_id or queue_id.
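
One way a generated ID might work, purely as an illustration: hash the
raw delivered message, so that copies sharing a Message-ID still get
distinct tags because their delivery headers differ.  The function name
delivery_tag and the sample messages are hypothetical; this is a sketch
under that assumption, not what bogofilter does:

```python
import hashlib
from email import message_from_bytes


def delivery_tag(raw_message: bytes) -> str:
    """Hypothetical per-copy tag: hex SHA-256 digest of the raw message."""
    return hashlib.sha256(raw_message).hexdigest()


# Two made-up copies of one message, differing only in a delivery header.
COMMON = b"Message-ID: <abc123@example.net>\r\nSubject: test\r\n\r\nhello\r\n"
copy_a = b"Delivered-To: mark@example.com\r\n" + COMMON
copy_b = b"Delivered-To: relson@example.com\r\n" + COMMON

same_msg_id = (message_from_bytes(copy_a)["Message-ID"]
               == message_from_bytes(copy_b)["Message-ID"])
print(same_msg_id)                                   # True
print(delivery_tag(copy_a) == delivery_tag(copy_b))  # False
```

A content digest like this only separates copies that actually differ
somewhere, which the diff above suggests is the case for the
"Delivered-To:" line.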

David

-- 
David Relson                   Osage Software Systems, Inc.
relson at osagesoftware.com       Ann Arbor, MI 48103
www.osagesoftware.com          tel:  734.821.8800


