multipart spam

Matthias Andree matthias.andree at gmx.de
Sun Nov 14 12:33:48 CET 2004


On Sun, 14 Nov 2004, Chris Fortune wrote:

> >The payload of the mail is in the HTML section (consisting of images
> >and URLs), but the text section is filled with either conversational
> >text taken from books, etc., or (even worse) authentic "ham" e-mails,
> ---What is that? The spammer cannot know what ham means for me.
> 
> That is untrue for a site-wide (shared) wordlist.

Well, installing software that needs to be trained individually for a
whole site _has_ this downside. Given the multiple-wordlist feature,
a user can keep the site-wide database as a second wordlist and add his
own, individually trained one.

> >Other than registering each and every one of these mails, then
> >retraining the wordlist, any suggestions?
> ---Training to exhaustion :-))
> 
> That's the problem.   I am training bogofilter to recognize hammy
> tokens as spam, then training it again to do the opposite, several
> times a day.

So the idea is to train only on the spam or ham containers of mutt?

We _can_ skip fixed types in multipart/alternative with a little effort,
but it must be a fixed list such as "ignore text/plain in
multipart/alternative": the parser is single-pass and is not supposed to
store several MB of mail in RAM.
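
To illustrate the fixed-list idea, here is a minimal sketch in Python. It
is not bogofilter's parser (which is a single-pass lexer written in C);
it uses the standard library's email package, which loads the whole
message into memory, so it only shows the selection rule, not the memory
behaviour. The names SKIP_IN_ALTERNATIVE and parts_to_score are made up
for this example.

```python
from email import message_from_string

# Assumed fixed list: types to ignore when they sit directly inside
# multipart/alternative. This is illustrative, not a bogofilter option.
SKIP_IN_ALTERNATIVE = {"text/plain"}


def parts_to_score(msg, parent_type=None):
    """Yield leaf parts, skipping listed types under multipart/alternative."""
    ctype = msg.get_content_type()
    if msg.is_multipart():
        for sub in msg.get_payload():
            yield from parts_to_score(sub, ctype)
    elif not (parent_type == "multipart/alternative"
              and ctype in SKIP_IN_ALTERNATIVE):
        yield msg


# A tiny made-up message: harmless text/plain, payload in text/html.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="b"\n'
    "\n"
    "--b\n"
    "Content-Type: text/plain\n"
    "\n"
    "Innocent text lifted from a book.\n"
    "--b\n"
    "Content-Type: text/html\n"
    "\n"
    '<a href="http://example.invalid/">buy now</a>\n'
    "--b--\n"
)

for part in parts_to_score(message_from_string(RAW)):
    print(part.get_content_type())
```

A text/plain mail that is not part of a multipart/alternative is still
kept, since the rule only applies inside that container.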

> >I guess this could also be the beginning of a thread about
> >de-obfuscation.
> ---What should that do?
> 
> Reformat the email so that only the parts intended to be displayed to
> the recipient are included, for example.  The resulting email would
> then be used for classification.

That won't work site-wide. I am using mutt to display the "plain" part
of multipart mail. If it's bogus, well, it's F9 (train as spam, move
to the spam-ma (for manual) folder). I know other mailers that can be
configured as to which part to prefer in multipart/alternative. One user
would prefer text/html for the fluff (or have no choice at all with a
sufficiently asinine mailer), another text/plain to avoid all the
blinking junk, images and animations in the mail.
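
To make the point concrete, here is a small Python sketch of how the
"displayed" part changes with the reader's preference. It relies on
EmailMessage.get_body() from the standard library; the function name
displayed_part and the sample message are made up for this example, and
nothing here reflects how mutt or bogofilter work internally.

```python
from email import message_from_string, policy


def displayed_part(raw, prefer=("plain", "html")):
    """Return the content type a mailer with this preference would show."""
    msg = message_from_string(raw, policy=policy.default)
    body = msg.get_body(preferencelist=prefer)
    return body.get_content_type() if body is not None else None


# Made-up multipart/alternative mail with both variants.
SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="b"\n'
    "\n"
    "--b\n"
    "Content-Type: text/plain\n"
    "\n"
    "Innocent text lifted from a book.\n"
    "--b\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>buy now</p>\n"
    "--b--\n"
)

print(displayed_part(SAMPLE, ("plain", "html")))  # a plain-first reader
print(displayed_part(SAMPLE, ("html", "plain")))  # an html-first reader
```

The same mail yields a different "displayed" part per user, which is why
a single site-wide reformatting step cannot know which part to keep.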

> --- Outgoing mail is certified Virus Free.  Checked by AVG anti-virus
> system (http://www.grisoft.com).  Version: 6.0.796 / Virus Database:
> 540 - Release Date: 11/13/2004

Is this advertising necessary or can it be disabled?
Please don't pollute the mails with such ads.

-- 
Matthias Andree


