multipart spam
Matthias Andree
matthias.andree at gmx.de
Sun Nov 14 12:33:48 CET 2004
On Sun, 14 Nov 2004, Chris Fortune wrote:
> >The payload of the mail is in the HTML section (consisting of images
> >and URLs), but the text section is filled with either conversational
> >text taken from books, etc., or, even worse, authentic "ham" e-mails,
> ---What is that? The spammer cannot know what ham means for me.
>
> That is untrue for a site-wide (shared) wordlist.
Well, installing software that needs to be trained individually for a
whole site _has_ this downside. Given the multiple-wordlist feature,
the user can use the site-wide database as a second database and add his
own, individually trained one.
> >Other than registering each and every one of these mails, then
> >retraining the wordlist, any suggestions?
> ---Training to exhaustion :-))
>
> That's the problem. I am training bogofilter to recognize hammy
> tokens as spam, then training it again to do the opposite, several
> times a day.
So the idea is to train only on the spam or ham containers of mutt?
We _can_ skip fixed types in multipart/alternative with a little effort
- but it must be a fixed list such as "ignore text/plain in
multipart/alternative" - the parser is single-pass and not supposed to
store several MB of mail in RAM.
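
To make the "fixed list" idea concrete, here is a rough Python sketch. It
is not bogofilter code: it uses Python's standard email package, which
parses the whole message into memory, so it only illustrates the skipping
rule, not the single-pass constraint. The names SKIP_IN_ALTERNATIVE and
scorable_parts are made up for this example.

```python
from email import message_from_string

# Assumption: a fixed, site-wide list of types to ignore inside
# multipart/alternative (hypothetical name, not a bogofilter option).
SKIP_IN_ALTERNATIVE = frozenset({"text/plain"})


def scorable_parts(msg, skip=SKIP_IN_ALTERNATIVE):
    """Yield leaf parts, skipping listed types directly inside
    multipart/alternative containers."""
    if not msg.is_multipart():
        yield msg
        return
    in_alternative = msg.get_content_type() == "multipart/alternative"
    for part in msg.get_payload():
        if in_alternative and part.get_content_type() in skip:
            continue  # e.g. the book-text filler in text/plain
        yield from scorable_parts(part, skip)
```

Note that a multipart/alternative containing nothing but text/plain
would yield no parts at all under this rule, which is one reason a fixed
list is a blunt instrument.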
> >I guess this could also be the beginning of a thread about
> >de-obfuscation.
> ---What should that do?
>
> Reformat the email so that only the parts intended to be displayed to
> the recipient are included, for example. The resulting email would
> then be used for classification.
That won't work site-wide. I am using mutt to display the "plain" part
of multipart mail. If it's bogus, well, it's F9 (train as spam, move
to the spam-ma (for manual) folder). I know other mailers that can be
configured as to which parts to prefer in multipart/alternative. One
user would prefer /html for the fluff (or have no choice at all with a
sufficiently asinine mailer), another /plain to avoid all the blinking
junk and images and animations in the mail.
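
As an illustration of why a single reformatting rule cannot fit a whole
site, here is a small Python sketch that picks the alternative a given
reader would see from a per-user preference list, roughly in the spirit
of mutt's alternative_order. The helper name displayed_part is
hypothetical, and the sketch handles only a top-level
multipart/alternative.

```python
from email import message_from_string


def displayed_part(msg, preference):
    """Return the alternative a reader with this preference order
    would see; non-alternative messages are returned unchanged."""
    if not (msg.is_multipart()
            and msg.get_content_type() == "multipart/alternative"):
        return msg
    candidates = msg.get_payload()
    for wanted in preference:
        for part in candidates:
            if part.get_content_type() == wanted:
                return part
    # Assumption: fall back to the last alternative, which the sender
    # lists as the richest form.
    return candidates[-1] if candidates else msg
```

Two users with the preferences ("text/plain", "text/html") and
("text/html", "text/plain") get different parts from the same mail, so
a classifier fed "what the recipient sees" would need to know the
recipient.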
> --- Outgoing mail is certified Virus Free. Checked by AVG anti-virus
> system (http://www.grisoft.com). Version: 6.0.796 / Virus Database:
> 540 - Release Date: 11/13/2004
Is this advertising necessary or can it be disabled?
Please don't pollute the mails with such ads.
--
Matthias Andree