Spam in images
Tom Anderson
tanderso at oac-design.com
Tue Aug 1 22:13:05 CEST 2006
Tony L. Svanstrom wrote:
> On Mon, 31 Jul 2006 the voices made Bill Wohler write:
> BW> What's the current best practice with these? Classify as spam, or just
> BW> delete?
>
> If it's spam, then it's spam... My view is that if you've got a "learning"
> filter then just hand it all spam and ham, and let it sort it out; if it can't,
> then it's either broken by design or outdated.
I tend to agree with that view. However, while I don't think that
Bogofilter is "broken" or "outdated", I do like to give it a little more
info in cases like this where it's just a big image and relatively
neutral headers and perhaps a paragraph of random text. Except perhaps
for the religious or political variety, there's one feature that all
spams share... a profit motive. That means enticing you or tricking you
into clicking on a link. And that's a very, very powerful giveaway.
Bogofilter already matches domain names as a part of filtering, but
spammers notoriously move around from server to server, thus defeating
the implicit domain blacklist that builds up in the course of training
on errors. However, through the power of sheer numbers, URIBLs are able
to list many of these URLs, which get added via reports from early
victims or honeypots. In order to provide Bogofilter with this extra
level of research on each email, I built a pre-filter called
"stripsearch" which parses the email body and looks up all URLs to see
if they are listed on URIBLs, and if so, it inserts the token
SPAM-ADDRESS into the email, providing both a visual cue to the reader
and also giving Bogofilter some extra info with which to make a
spamicity decision.
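
For anyone curious about the general idea, here is a minimal sketch in Python. It is not the actual stripsearch code, just an illustration under stated assumptions: the URIBL zone (multi.surbl.org), the simple URL regex, querying the full hostname rather than the registered domain, and prepending the token to the body are all choices made for the example. The names dns_listed and tag_body are hypothetical.

```python
import re
import socket

# Assumption: a simple pattern that captures the hostname of http(s) URLs.
URL_RE = re.compile(r"https?://([A-Za-z0-9.-]+)", re.IGNORECASE)
MARKER = "SPAM-ADDRESS"          # token named in the post
URIBL_ZONE = "multi.surbl.org"   # assumption: which URIBL zone to query


def dns_listed(host, zone=URIBL_ZONE):
    """Return True if host.zone resolves to a 127.x.x.x address.

    DNS-based lists conventionally answer with a loopback-range address
    for listed names and with no record at all for unlisted ones.
    """
    try:
        answer = socket.gethostbyname(f"{host}.{zone}")
    except (OSError, UnicodeError):
        # No record, resolver failure, or malformed name: treat as unlisted.
        return False
    return answer.startswith("127.")


def tag_body(body, is_listed=dns_listed):
    """Prepend MARKER to body if any URL host in it is listed."""
    hosts = {h.lower().rstrip(".") for h in URL_RE.findall(body)}
    hosts.discard("")
    if any(is_listed(h) for h in sorted(hosts)):
        return f"{MARKER}\n{body}"
    return body
```

A filter built on this would read the message body from standard input and write tag_body(body) to standard output. The lookup is passed in as a parameter so the tagging logic can be tried without touching DNS, for example tag_body(text, lambda h: h == "bad.example").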
If you use procmail, you would add stripsearch as a filtering step
ahead of Bogofilter:

    :0
    {
        :0 fbw
        | stripsearch

        :0 fw
        | bogofilter -uep
    }

When there are few other tokens, as is the case when it's just a big
image, this extra token can tip the message over to the spam side. Since
I started using it over a year ago, I have received no more of those
image spams. They're all classified correctly as spam.
You should have stripsearch in your Bogofilter /contrib/ directory, or
you can download it here: http://orderamidchaos.com/bogofilter/stripsearch
Tom