message tags
David Relson
relson at osagesoftware.com
Tue Aug 31 22:19:15 CEST 2004
Matthias,
This month I've received 18,000+ spam messages. Out of curiosity I ran a
modified bogolexer and printed the 3 attributes that bogofilter
identifies - msg_addr, msg_id, and queue_id. I then sorted and counted
them. Here's a table of counts vs repeats:
  Messages  Repeats
     15841        1
       537        2
       304        3
       256        4
        18        5
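A tally like the one above can be sketched in a few lines of Python. This is only an illustration, not the modified bogolexer: the function name `repeat_histogram` and the sample triples are hypothetical, and it assumes each message has already been reduced to a (msg_addr, msg_id, queue_id) triple.

```python
from collections import Counter

def repeat_histogram(keys):
    """Return a mapping: number of copies -> how many distinct keys had that many."""
    copies_per_key = Counter(keys)            # key -> number of messages sharing it
    return Counter(copies_per_key.values())   # copies -> number of distinct keys

# Hypothetical (msg_addr, msg_id, queue_id) triples, one per received message.
sample = [
    ("addr-1", "id-1", "queue-1"),
    ("addr-1", "id-1", "queue-1"),
    ("addr-2", "id-2", "queue-2"),
]

# Print in the same layout as the table: messages first, then repeats.
for repeats, messages in sorted(repeat_histogram(sample).items()):
    print(messages, repeats)
```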
For example, 15,841 messages were unique (repeats = 1), and 18 messages were
each received 5 times. Checking further, each of those 18 message sets was
sent to 5 addresses at my domain, e.g.:
To: linda at example.com
Cc: relson at example.com, eric at example.com,
mark at example.com, webmaster at example.com
The actual messages differ only in the "X-Original-To:" and
"Delivered-To:" attributes:
--- /home/relson/Mail/2004-08-Spam/26425 2004-08-23 07:52:50.000000000 -0400
+++ /home/relson/Mail/2004-08-Spam/26426 2004-08-23 07:52:50.000000000 -0400
@@ -1,6 +1,6 @@
Return-Path: <wrzjgv.gstinxa at smithvilledsl.net>
-X-Original-To: mark at example.com
-Delivered-To: mark at examplesoftware.com
+X-Original-To: relson at example.com
+Delivered-To: relson at example.com
Received: from x.y.z.w (unknown [203.229.101.148])
	by mail.example.com (Postfix) with SMTP id 312712FF2B;
	Mon, 23 Aug 2004 01:36:35 -0400 (EDT)
In some of the sets of 5, the X-Bogosity line also varies, since
different recipients have different spamicity scores.
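The comparison above could also be checked programmatically. The following is a rough sketch, assuming Python's standard `email` module: it drops the headers reported here as varying between copies and hashes what remains, so copies of one message should map to the same digest. The name `content_digest`, the choice of SHA-1, and the sample messages are assumptions for illustration, not anything bogofilter does.

```python
import hashlib
from email import message_from_string

# Headers reported in this post as differing between copies of one message.
PER_RECIPIENT = ("X-Original-To", "Delivered-To", "X-Bogosity")

def content_digest(raw_message):
    """Hash a message after removing the per-recipient headers."""
    msg = message_from_string(raw_message)
    for name in PER_RECIPIENT:
        del msg[name]  # removes every occurrence; no error if the header is absent
    return hashlib.sha1(msg.as_string().encode("utf-8", "replace")).hexdigest()

# Hypothetical copies that differ only in the per-recipient headers.
copy_a = (
    "Return-Path: <sender@example.net>\n"
    "X-Original-To: mark@example.com\n"
    "Delivered-To: mark@example.com\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
copy_b = copy_a.replace("mark@", "relson@")

print(content_digest(copy_a) == content_digest(copy_b))
```

Such a digest groups the copies together, which is the opposite of a unique per-delivery identifier.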
As you said, if bogofilter is going to have a unique message ID, it
needs to be something we generate. It can't be msg_id or queue_id.
David
--
David Relson Osage Software Systems, Inc.
relson at osagesoftware.com Ann Arbor, MI 48103
www.osagesoftware.com tel: 734.821.8800