Incorrigible spam

Tom Anderson tanderso at oac-design.com
Thu Apr 8 06:42:43 CEST 2004


On Wed, 2004-04-07 at 06:11, Richard Kimber wrote:
> What's the best strategy for these sort of messages? Should I just put
> up with this and just delete them, or is there an approved way of
> dealing with them (e.g. should I not train with them?)  I tried
> relearning the message four times, but the spamicity hardly changed
> (0.610188 to 0.617225).

To me, that's a pretty high score seeing as my ham never breaches 0.15. 
I'd suggest lowering your spam cutoff if none of your ham gets up that
high.  You could probably stand to lower both cutoffs if you don't get
any hams in your unsure range.
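
As a rough illustration of that reasoning, the sketch below picks a lower spam cutoff from a sample of observed ham scores. The function name, the `margin` parameter and the sample numbers are assumptions made for the example; none of this is part of bogofilter itself, and the result would still have to be set by hand in your own configuration.

```python
from typing import Iterable


def suggest_spam_cutoff(ham_scores: Iterable[float],
                        current_cutoff: float,
                        margin: float = 0.1) -> float:
    """Suggest a spam cutoff sitting `margin` above the highest ham score.

    Hypothetical helper: it never raises the cutoff, it only proposes
    lowering it when all observed ham scores leave room to do so.
    """
    highest_ham = max(ham_scores)
    candidate = highest_ham + margin
    # Keep the existing cutoff if the candidate would be higher.
    return min(current_cutoff, candidate)


# Illustrative numbers only: ham that never scores above 0.15.
print(suggest_spam_cutoff([0.02, 0.08, 0.15], current_cutoff=0.9))
```

The margin is a judgment call: the smaller it is, the more borderline spam gets caught, but the less room there is for an unusual ham message.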

Like David, I use -u to train on every email at initial classification
and then register any errors.  However, I also use bfproxy to
automatically retrain the tough ones, repeating either until the score
reaches the cutoff or until an arbitrary maximum number of rounds is hit
(mine is set to 10).  I could probably stand to increase that maximum,
as there don't appear to be any ill side effects from multiple
trainings... the strong scorers simply become a little more neutral,
and their integrity isn't threatened at all.  Most emails don't need
anywhere near 10 rounds, though.
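
The retraining loop described above can be sketched roughly as follows. This is not bfproxy's actual code: `score` and `train_spam` are hypothetical callables standing in for whatever invokes the classifier and registers a message as spam, and the cutoff is passed in rather than read from any configuration.

```python
from typing import Callable, Tuple


def train_until_cutoff(message: bytes,
                       score: Callable[[bytes], float],
                       train_spam: Callable[[bytes], None],
                       spam_cutoff: float,
                       max_rounds: int = 10) -> Tuple[float, int]:
    """Retrain a misclassified spam until it scores as spam or we give up.

    Returns the final score and the number of training rounds used.
    """
    rounds = 0
    current = score(message)
    while current < spam_cutoff and rounds < max_rounds:
        train_spam(message)        # register the message as spam once more
        rounds += 1
        current = score(message)   # re-score after each training round
    return current, rounds


# Stand-in classifier for demonstration: each training nudges the score up.
state = {"trainings": 0}

def fake_score(_msg: bytes) -> float:
    return 0.40 + 0.10 * state["trainings"]

def fake_train(_msg: bytes) -> None:
    state["trainings"] += 1

final_score, used = train_until_cutoff(b"spam body", fake_score, fake_train,
                                       spam_cutoff=0.65)
print(final_score, used)
```

Capping the loop matters because some messages never reach the cutoff no matter how often they are trained.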

In addition, I've found that many of the "incorrigible" emails depend on
a lot of noisy tokens... things like "X-mailer: outlook express", etc.,
that are hammy only incidentally and not because of either the content
of the message or the source address.  Others are the result of
intentional diversion, such as spammers using your own server name in
the HELO string.  I now strip these out so that messages are classified
only according to the standardized header tags and the message content,
with Received lines verified.  Adding in the ASN also gives bogofilter a
good token to classify on.  This is all done with my "spamitarium"
filter, which I recently presented to the list.  My results have been
very good so far.
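
To give a feel for the header-stripping idea, here is a minimal sketch using Python's standard `email` module. It is not spamitarium and does not reproduce its behaviour: the `KEEP` list is an assumed whitelist chosen for the example, Received lines are simply dropped rather than verified, and no ASN lookup is attempted.

```python
from email import message_from_string

# Assumed whitelist of standard headers to keep (illustrative only).
KEEP = {
    "from", "to", "cc", "subject", "date", "message-id", "reply-to",
    "mime-version", "content-type", "content-transfer-encoding",
}


def strip_noisy_headers(raw: str) -> str:
    """Return the message with every header outside KEEP removed."""
    msg = message_from_string(raw)
    # Copy the names first, since headers are deleted while looping.
    for name in {key.lower() for key in msg.keys()}:
        if name not in KEEP:
            del msg[name]   # removes all occurrences, case-insensitively
    return msg.as_string()


raw = (
    "From: someone@example.com\n"
    "X-Mailer: outlook express\n"
    "Received: from your.own.server.example\n"
    "Subject: hello\n"
    "\n"
    "message body\n"
)
print(strip_noisy_headers(raw))
```

The cleaned text would then be handed to the classifier in place of the original message, so that incidental tokens such as the mailer name never enter the wordlist.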

Hope that helps.

Tom
