Inline image based spam

Thomas Anderson tanderso at oac-design.com
Sun Oct 8 18:44:44 CEST 2006


My wordlist is at least two to three years old and holds a lot of tokens,
but I don't find inertia to be a problem because I train to exhaustion
(that is, I keep retraining on an email until it classifies correctly).
Doing this pushes random tokens toward neutral and makes the true ham and
spam tokens stand out.  Bfproxy makes exhaustive training simple, so there
is no need to modify your wordlist or start from scratch.

Also, in my config a score of 0.5 is not unsure; it is squarely spam:
robx=0.41, robs=0.2, min_dev=0.2, spam_cutoff=0.42, ham_cutoff=0.1.  If
you get lots of spam unsures, consider changing your values.  I've had
maybe one or two image spams classify as unsure in the past month, but
after I trained on them again, it stopped.
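
For anyone who wants to see what training to exhaustion amounts to, here
is a minimal Python sketch.  It is not how bfproxy does it.  The function
names (label, train_to_exhaustion, bogofilter_classify, bogofilter_train)
and the max_passes limit are made up for illustration.  It assumes
bogofilter's documented exit codes (0 spam, 1 ham, 2 unsure, 3 error) and
the -s / -n registration flags; how a score exactly on a cutoff is treated
is my assumption.

```python
import subprocess
from typing import Callable, List, Tuple

# Cutoffs quoted in this message; anything between them is "unsure".
SPAM_CUTOFF = 0.42
HAM_CUTOFF = 0.1


def label(score: float) -> str:
    """Map a spamicity score to a label (boundary handling is assumed)."""
    if score >= SPAM_CUTOFF:
        return "spam"
    if score <= HAM_CUTOFF:
        return "ham"
    return "unsure"


def train_to_exhaustion(
    corpus: List[Tuple[str, str]],
    classify: Callable[[str], str],
    train: Callable[[str, str], None],
    max_passes: int = 20,  # hypothetical safety limit
) -> int:
    """Retrain misclassified messages until a full pass has no errors.

    corpus holds (message, expected) pairs, expected being "spam" or "ham".
    Returns the number of the first error-free pass.
    """
    for pass_no in range(1, max_passes + 1):
        errors = 0
        for message, expected in corpus:
            # "unsure" counts as an error too, so such messages get retrained.
            if classify(message) != expected:
                train(message, expected)
                errors += 1
        if errors == 0:
            return pass_no
    raise RuntimeError("corpus did not converge within max_passes")


EXIT_TO_LABEL = {0: "spam", 1: "ham", 2: "unsure"}


def bogofilter_classify(path: str) -> str:
    """Classify one message file through bogofilter's exit code."""
    rc = subprocess.run(["bogofilter", "-I", path]).returncode
    return EXIT_TO_LABEL.get(rc, "error")


def bogofilter_train(path: str, expected: str) -> None:
    """Register one message file as spam (-s) or ham (-n)."""
    flag = "-s" if expected == "spam" else "-n"
    rc = subprocess.run(["bogofilter", flag, "-I", path]).returncode
    if rc == 3:
        raise RuntimeError("bogofilter reported an error for " + path)
```

With real mail you would call train_to_exhaustion(pairs, bogofilter_classify,
bogofilter_train), where pairs is a list of (path, "spam") or (path, "ham")
entries.  Note that label(0.5) comes out as "spam" with the cutoffs above,
which is the point about 0.5 not being unsure.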

Tom


On Fri, 2006-10-06 at 13:50 -0700, John Villalovos wrote:
> I am noticing that I am getting a lot of inline image based spam.
> It typically has a lot of nonsense text and then an image (contained in
> the email) that carries the spam.
> 
> I keep training on the message but they still keep landing in my
> "Unsure" folder.  Has anyone figured out some good ways to help
> bogofilter deal with this better?
> 
> I just added this to my .procmailrc:
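> # If it has an inline image, put in a header to indicate so.  Hopefully this
> # will help with the image spam
> :0 HB
> * src=\"cid:.*@.*\"
> {
>     # formail -I with a bare field name deletes the header, so give it
>     # a value (the value "yes" is arbitrary)
>     :0 fwh
>     | formail -I"X-Inline-Image: yes"
> }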
> 
> I'm hoping that this will help with the issue, though I was thinking
> maybe bogofilter could add the presence of inline images as a keyword
> that it stores.  That way it could be added to the word database.
> 
> Thanks,
> John