ignore text/plain part of multipart/alternative messages?

David Relson relson at osagesoftware.com
Wed Aug 13 01:30:22 CEST 2003


At 06:41 PM 8/12/03, David Flanagan wrote:

> > Would you care to send me one of those "text/plain is book excerpt,
> > text/html is UCE" mails so I can have a look? If so, please save the
> > whole mail to a file ("export") and zip it before you attach, so it
> > doesn't get filtered out here. You can omit non-MIME headers if you want
> > to protect your privacy, all I need are MIME-Version: and Content-*:
> > headers.
>
>I use Emacs RMAIL, which is not MIME-aware, so attaching zipped files is
>a pain.  Instead, I've posted a sample spam here:
>
>     http://www.djf.net/spam.txt
>
>Note that line endings and other parts look like they have been mangled a
>bit.  I always get this for messages using a quoted-printable
>encoding. I don't know if that is supposed to happen or whether
>something in my email toolchain (like my antique mail reader) is
>responsible for the corruption.  In any case, you'll still be able to
>get the basic idea, even though the HTML part is trashed beyond repair.
>
>     David Flanagan

Greetings,

Ignoring the text/plain part could be done.

As bogofilter processes a multipart MIME message, it could keep a separate
token list for each part.  Given both text/html and text/plain parts, it
would then be possible to use the tokens from one and discard the tokens
from the other.
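
To make the idea concrete, here is a rough sketch in Python.  It is not
bogofilter's actual code (bogofilter is written in C), and the names
tokens_by_part and select_tokens, the token pattern, and the sample message
are all made up for illustration.  It keeps one token list per content type
and then uses only the preferred part's tokens when that part is present.

```python
import re
from email import message_from_string

# Hypothetical token pattern; bogofilter's real lexer is more elaborate.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'!-]+")


def tokens_by_part(raw_message):
    """Return {content_type: [tokens]}, one list per leaf MIME part."""
    msg = message_from_string(raw_message)
    lists = {}
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no text of their own
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "ascii"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:  # unknown charset name in the header
            text = payload.decode("latin-1")
        tokens = [t.lower() for t in TOKEN_RE.findall(text)]
        lists.setdefault(part.get_content_type(), []).extend(tokens)
    return lists


def select_tokens(lists, prefer="text/html"):
    """Use only the preferred part's tokens; otherwise fall back to all."""
    if prefer in lists:
        return lists[prefer]
    return [t for tokens in lists.values() for t in tokens]


# Made-up message: harmless text/plain part, spammy text/html part.
SAMPLE = """\
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset="us-ascii"

It was a bright cold day in April.
--XYZ
Content-Type: text/html; charset="us-ascii"

<html><body>Buy cheap pills now</body></html>
--XYZ--
"""

lists = tokens_by_part(SAMPLE)
scored = select_tokens(lists)  # tokens from the text/html part only
```

Note that this sketch does not strip HTML tags, so tag names end up as
tokens too.  Which part to prefer is exactly the open question; the prefer
argument is only there so both choices can be tried.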

As a research project, someone could make the changes and test to see how 
much difference the changes make.

Ciao,

David
