mass processing with mutt and Fcc
David Relson
relson at osagesoftware.com
Tue Apr 1 15:23:34 CEST 2003
At 08:01 AM 4/1/03, Boris 'pi' Piwinger wrote:
>David Relson wrote:
>
> > Bogofilter looks at nearly all the tokens of a message. Some stuff is
> > ignored - for example message IDs, because they tend to be unique, and
>
>All of them? The local part could be interesting.
>
> > innards of html tags
>
>What exactly?
>
>pi
At the present time, when processing html, bogofilter discards html
comments, valid html tags (and their innards), and invalid html tags (and
their innards). Basically, everything between angle brackets is ignored
at this time.
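To make "everything between angle brackets is ignored" concrete, here is a rough sketch in Python. It is not bogofilter's lexer (that is a generated C lexer), and the names ANGLE_BRACKETED, WORD and tokens_outside_tags are made up for illustration. It assumes a simple regex is good enough to find tags and comments, so a comment that itself contains a '>' will not be handled the way a real lexer would handle it.

```python
import re

# Assumption: a plain regex stands in for the real lexer rules.
# It matches an opening '<', anything that is not '>', then '>'.
ANGLE_BRACKETED = re.compile(r"<[^>]*>")

# Hypothetical token pattern: runs of letters and digits only.
WORD = re.compile(r"[A-Za-z0-9]+")


def tokens_outside_tags(html):
    """Return lowercased tokens found outside angle brackets."""
    # Replace each bracketed span with a space so the words on
    # either side of a tag do not run together.
    text = ANGLE_BRACKETED.sub(" ", html)
    return [token.lower() for token in WORD.findall(text)]


sample = '<font color="#3D3D3D">Buy <!-- xyzzy -->now</font>'
print(tokens_outside_tags(sample))
```

With this approach the tag name, the color attribute and the comment text never reach the scoring stage; only the visible words do.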
The rationale is that many tokens within html tags are not worth
scoring as spam indicators. The html keywords themselves are very common,
hence have little diagnostic value. Named colors (black, white, etc.)
are also common, while other colors (3D3D3D, 11FFAF, etc.) are hex values and
have too many possible values to be useful. Html comments can include any
kind of random garbage.
Plans include options so a user can specify whether bogofilter uses any (or
all) of these tokens for scoring.
I can't say when this will be implemented. The place for the additional
code is in the lexer and I'm not good at modifying the grammar. A
volunteer would be very helpful!
David
Return-Path: <>
Delivered-To: relson at osagesoftware.com
Received: by osagesoftware.com (Postfix) via BOUNCE
id 3071327ED2; Tue, 1 Apr 2003 08:26:01 -0500 (EST)
Date: Tue, 1 Apr 2003 08:26:01 -0500 (EST)
From: MAILER-DAEMON at osagesoftware.com (Mail Delivery System)
Subject: Undelivered Mail Returned to Sender
To: relson at osagesoftware.com
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status;
boundary="B2CC727ECE.1049203561/osagesoftware.com"
Message-Id: <20030401132601.3071327ED2 at osagesoftware.com>
This is a MIME-encapsulated message.
--B2CC727ECE.1049203561/osagesoftware.com
Content-Description: Notification
Content-Type: text/plain
This is the Postfix program at host osagesoftware.com.
I'm sorry to have to inform you that the message returned
below could not be delivered to one or more destinations.
For further assistance, please send mail to <postmaster>
If you do so, please include this problem report. You can
delete your own text from the message returned below.
The Postfix program
<relson at nic.osagesoftware.com>: mail for nic.osagesoftware.com loops back to
myself
--B2CC727ECE.1049203561/osagesoftware.com
Content-Description: Delivery error report
Content-Type: message/delivery-status
Reporting-MTA: dns; osagesoftware.com
Arrival-Date: Tue, 1 Apr 2003 08:26:00 -0500 (EST)
Final-Recipient: rfc822; relson at nic.osagesoftware.com
Action: failed
Status: 5.0.0
Diagnostic-Code: X-Postfix; mail for nic.osagesoftware.com loops back to myself
--B2CC727ECE.1049203561/osagesoftware.com
Content-Description: Undelivered Message
Content-Type: message/rfc822
Received: from osage.osagesoftware.com (osage.osagesoftware.com [192.168.1.10])
by osagesoftware.com (Postfix) with ESMTP id B2CC727ECE
for <relson at nic.osagesoftware.com>; Tue, 1 Apr 2003 08:26:00 -0500 (EST)
Received: by osage.osagesoftware.com (Postfix, from userid 1000)
id 3C7A814494; Tue, 1 Apr 2003 08:26:00 -0500 (EST)
To: relson at mail.osagesoftware.com
Subject: test
Message-Id: <20030401132600.3C7A814494 at osage.osagesoftware.com>
Date: Tue, 1 Apr 2003 08:26:00 -0500 (EST)
From: relson at osagesoftware.com (David Relson)
Tue Apr 1 08:26:00 EST 2003
--B2CC727ECE.1049203561/osagesoftware.com--