A Suggestion [was: multipart spam]

David Relson relson at osagesoftware.com
Sun Nov 14 14:33:11 CET 2004


Greetings,

I noticed multipart spam quite a while ago.  'Tis common for it to
include a random article in the text/plain part.  Twice, I happened to
notice that the random bit was an article on the history of archery,
which made for interesting reading.  On my small site, such spam
doesn't get by bogofilter.  I can't speak for other sites.

Various ways of dealing with this have been suggested.  One would be
to score each part separately, then "combine" the scores in some way.
Simple averaging would take a low ham score and a high spam score and
give an "Unsure" result.  Averaging might be helpful, or it might not.
Another way would be to use the score of the spammiest section.  I'm
sure y'all get the idea -- there are many possible ways.
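
To make the two combining rules concrete, here is a rough Python sketch.
The cutoff values and the function names are made up for illustration
(they are not taken from any bogofilter configuration), and the per-part
scores are invented numbers rather than real bogofilter output.

```python
from statistics import mean

# Hypothetical cutoffs, for illustration only; use whatever your own
# configuration defines.
SPAM_CUTOFF = 0.99
HAM_CUTOFF = 0.45


def combine_average(scores):
    """Combine per-part scores by simple averaging."""
    return mean(scores)


def combine_spammiest(scores):
    """Combine per-part scores by taking the spammiest (highest) one."""
    return max(scores)


def classify(score, spam_cutoff=SPAM_CUTOFF, ham_cutoff=HAM_CUTOFF):
    """Map a combined score onto a three-way Spam/Ham/Unsure verdict."""
    if score >= spam_cutoff:
        return "Spam"
    if score <= ham_cutoff:
        return "Ham"
    return "Unsure"


# Invented scores: a hammy text/plain part and a spammy text/html part.
part_scores = [0.02, 0.999]
print(classify(combine_average(part_scores)))
print(classify(combine_spammiest(part_scores)))
```

With these invented numbers, averaging lands between the two cutoffs,
while the spammiest-part rule follows the text/html part.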

A reasonable approach for testing such ideas would be a Perl script (or
Python program) to separate a message into its parts, score them
separately, and see how the results compare.  I'd suggest having the
header be one part (scored using your usual bogofilter flags) and having
each MIME part be scored separately (using your usual flags plus '-H').
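
As a starting point, such a script might look roughly like the Python
sketch below.  It uses the standard library's email package to split the
message.  The names split_message, bogofilter_score and score_parts are
made up here.  Two assumptions to check before relying on it: that
"bogofilter -t" prints a terse one-letter status followed by the
spamicity on standard output (see your bogofilter man page), and that
only text/* parts are worth scoring, which is a simplification.

```python
import email
import subprocess
from email import policy


def split_message(raw_bytes):
    """Return (header_text, [body_text, ...]) for a raw message."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    # Rebuild the top-level header as one block of text, ending with the
    # blank line that normally separates header from body.
    header_text = "".join(f"{name}: {value}\n" for name, value in msg.items())
    header_text += "\n"
    bodies = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no text of their own
        if part.get_content_maintype() == "text":
            bodies.append(part.get_content())  # decoded to str
    return header_text, bodies


def bogofilter_score(text, extra_flags=()):
    """Score one chunk of text by piping it to bogofilter.

    Assumption: '-t' gives terse output of the form '<letter> <score>'.
    bogofilter's exit status encodes the verdict, so a nonzero status is
    not treated as a failure here.
    """
    proc = subprocess.run(
        ["bogofilter", "-t", *extra_flags],
        input=text.encode("utf-8", "replace"),
        stdout=subprocess.PIPE,
    )
    return float(proc.stdout.split()[1])


def score_parts(raw_bytes, scorer=bogofilter_score):
    """Score the header with the usual flags and each part with '-H' added."""
    header_text, bodies = split_message(raw_bytes)
    scores = [scorer(header_text, ())]
    scores.extend(scorer(body, ("-H",)) for body in bodies)
    return scores
```

The scorer is passed in as a parameter so the splitting can be tried out
with a stand-in function before bogofilter itself is involved.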

So, who's going to do this???

Regards,

David


