scaling and learning [was Re: Inline image based spam]

Tony L. Svanstrom tony at moon.pp.se
Sat Oct 7 10:52:46 CEST 2006


On Fri, 6 Oct 2006 the voices made David Relson write:

DR> The messages commonly have a passage from a book (or some such) in hopes of
DR> fooling filters.  Since those passages rarely match my ham email, I
DR> anticipate that bogofilter will eventually come to recognize the new words
DR> as spammish.

 I've gotten some spam that seems to almost target just me, with text that is
way too spot on for me to really want to train the message as spam (I still do
it though, fighting fear with "it's all about the headers" logic). In theory it
is of course possible that they use different randomish text based on
information like where they got the e-mail address from; but most likely it's
just a random thing, or someone using computer-geek speak just because he's got
a bunch of addresses from a recent whois scraping.

 I used to think that this image spam might be about poisoning our spam/ham
token databases, and I still think that a very few people _might_ be working on
just that; but the sheer amount of this type of spam we're getting simply
drowns out the few evil ones, because it retrains our filters to better handle
this new phenomenon.

 But as you all can see, it's not working perfectly yet, and it might not be a
bad idea to beef up our defences a bit; one way, like I said in another posting
a few minutes ago, is to add more headers with easily available information.
 One simple thing is to filter outgoing e-mails so that you can store the
e-mail addresses of everyone you e-mail in a database (or a simple flatfile
which you can grep in a procmail recipe); then you can automatically add a
"this came from a good e-mail address" header to incoming mail from those
addresses (or do like I do, and automatically train such e-mails as ham; maybe
one spam every other year has managed to get by by faking a "good" e-mail
address, but as long as you don't whitelist addresses at your own domain you're
pretty safe).
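
 To make the idea above concrete, here is a rough sketch in Python rather than
procmail. It is only an illustration under stated assumptions: the function
names, the X-Known-Correspondent header name and the in-memory set are all made
up for this example (a real setup would keep the addresses in the flatfile or
database mentioned above and hook into the mail delivery chain).

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical header name, chosen for this sketch only.
GOOD_HEADER = "X-Known-Correspondent"


def addresses(msg, header_names):
    """Return the lower-cased addresses found in the given headers."""
    fields = []
    for name in header_names:
        fields.extend(msg.get_all(name, []))
    return {addr.lower() for _, addr in getaddresses(fields) if addr}


def record_outgoing(raw, known, own_domain):
    """Remember everyone an outgoing message is addressed to.

    Addresses at our own domain are skipped, since those are the ones
    a spammer is most likely to forge.
    """
    msg = message_from_string(raw)
    suffix = "@" + own_domain.lower()
    for addr in addresses(msg, ("To", "Cc", "Bcc")):
        if not addr.endswith(suffix):
            known.add(addr)
    return known


def tag_incoming(raw, known):
    """Mark an incoming message whose From address we have written to."""
    msg = message_from_string(raw)
    # Always drop a header the sender may have forged before deciding.
    del msg[GOOD_HEADER]
    if addresses(msg, ("From",)) & known:
        msg[GOOD_HEADER] = "yes"
    return msg.as_string()
```

 The filter then sees the extra header as one more token; training such
messages as ham automatically, as described above, would be a separate step
after the tagging.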


	/Tony
-- 
        /\___/\                                          /\___/\
        \_@ @_/                                          \_@ @_/
   .--oOO-(_)-OOo--------------------------------------oOO-(_)-OOo--.
   |  perl -e'print$_{$_} for sort%_=`lynx -dump svanstrom.org/t`'  |
   `---ôôô---ôôô----------------------------------------ôôô---ôôô---´
       \O/   \O/        ©1998-2006 svanstrom.org        \O/   \O/



