ignore text/plain part of multipart/alternative messages?

Peter Bishop pgb at adelard.com
Wed Aug 13 08:28:00 CEST 2003


On 12 Aug 2003 at 10:13, David Flanagan wrote:

> Maybe 99% is an overstatement, but the vast majority (think of Windows
> users with Outlook Express).  Implicit in multipart/alternative messages
> is the assumption that the "richest" message format that can be
> displayed will be used.  The text/html portion is the default one.  The
> text/plain portion is the fallback for geeks like us.
> 
Perhaps we should score the text/plain and text/html parts separately (i.e.
score them like separate messages) then use the *highest* score to decide
if the overall message is spam.

This would prevent the padding ploy from working, as the spam payload would
be detected wherever it was located.
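
As a rough illustration of the idea (not bogofilter code), here is a minimal
Python sketch that scores each text part of a message on its own and keeps
the highest score. The names toy_score, SPAMMY_WORDS and score_parts are
made up for this example, and the word-list scorer is only a stand-in for a
real token-database lookup.

```python
from email import message_from_string

# Hypothetical stand-in for a real spam scorer: the fraction of words that
# appear in a tiny illustrative word list. A real filter would consult its
# token database instead.
SPAMMY_WORDS = {"viagra", "winner", "free"}


def toy_score(text):
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in SPAMMY_WORDS for w in words) / len(words)


def score_parts(raw_message, score=toy_score):
    """Score every text/* part separately.

    Returns (highest_score, [(content_type, score), ...]).
    """
    msg = message_from_string(raw_message)
    scores = []
    for part in msg.walk():
        # Skip multipart containers and non-text attachments.
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        scores.append((part.get_content_type(), score(text)))
    # The message is judged by its worst (highest-scoring) part.
    highest = max((s for _, s in scores), default=0.0)
    return highest, scores
```

With this approach an innocent-looking text/plain part cannot pull down the
verdict for a spammy text/html part, since the two are never averaged
together.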

There is also a problem if a padded message is used to update the database,
as the padding tokens could dilute it. So here again it would be necessary
to score the parts to decide which tokens go in the database (header tokens
plus the tokens in the highest-scoring part).
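
The token selection for training could look something like the sketch
below. Again this is only an illustration in Python under my own
assumptions: tokenize and training_tokens are hypothetical names, the
tokenizer is a naive whitespace split, and the score function is whatever
per-part scorer the filter would supply.

```python
from email import message_from_string


def tokenize(text):
    # Hypothetical tokenizer: lowercase, whitespace-separated words.
    return text.lower().split()


def training_tokens(raw_message, score):
    """Return header tokens plus the tokens of the highest-scoring text part.

    Tokens from lower-scoring parts (e.g. padding) are left out so they do
    not dilute the database.
    """
    msg = message_from_string(raw_message)

    # Tokens from the top-level message headers.
    header_tokens = []
    for name, value in msg.items():
        header_tokens.extend(tokenize(f"{name}: {value}"))

    # Find the text part with the highest score.
    best_text, best_score = "", float("-inf")
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        part_score = score(text)
        if part_score > best_score:
            best_text, best_score = text, part_score

    return header_tokens + tokenize(best_text)
```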

-- 
Peter Bishop 
pgb at adelard.com
pgb at csr.city.ac.uk

More information about the Bogofilter mailing list