Stripsearch
Tom Anderson
tanderso at oac-design.com
Thu Jun 9 15:38:55 CEST 2005
----- Original Message -----
From: "Tom Anderson" <tanderso at oac-design.com>
> For anyone not fundamentally opposed to using DNS block lists who is
> receiving false negatives due to "thesaurus attacks" or textless image
> links, I've written a pre-filter which parses the email body and looks up
> any URLs in various DNSBLs or URIBLs and then replaces offending addresses
> with a SPAM-ADDRESS token as well as a link to look up the URL in the
> block lists which listed it. Here's the source with documentation inside:
> http://orderamidchaos.com/bogofilter/stripsearch
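For readers who want the gist without opening the source, here is a rough Python sketch of the idea described above. It is not the stripsearch code itself. The function names, the URL pattern, and the choice of multi.surbl.org as the lookup zone are assumptions made for illustration, and the link back to the block list that stripsearch also adds is left out.

```python
import re
import socket
from urllib.parse import urlsplit

# Simplified URL pattern (an assumption; real mail needs more careful parsing).
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def dns_listed(host, zone="multi.surbl.org"):
    """Return True if `host` resolves under the block-list zone.

    DNS block lists answer with an A record for listed names and with
    NXDOMAIN otherwise. The zone used here is only an example, and a
    lookup failure of any kind is treated as "not listed".
    """
    try:
        socket.gethostbyname(f"{host}.{zone}")
        return True
    except OSError:
        return False


def replace_listed_urls(body, is_listed=dns_listed):
    """Replace every URL whose host is listed with a SPAM-ADDRESS token."""

    def substitute(match):
        url = match.group(0)
        try:
            host = urlsplit(url).hostname
        except ValueError:
            # Malformed URL: leave it untouched.
            return url
        # Simplification: the full host name is queried, whereas URIBLs
        # usually expect the registered domain.
        if host and is_listed(host):
            return "SPAM-ADDRESS"
        return url

    return URL_RE.sub(substitute, body)
```

The lookup is passed in as a parameter so the substitution can be tried without touching DNS, e.g. `replace_listed_urls(text, lambda host: host == "bad.example")`.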
I'm pleased to report that in the three days since I started using
"stripsearch", my unsure rate has plummeted. I had been receiving several
unsures per day; I've received only two in the past three days. Almost all
of the emails which used to be classified as unsure (mortgage offers,
wholesale software, image-only Viagra ads, etc.) link to websites listed
in URIBLs. Image-only and dictionary spams are finally defeated. My
wordlist shows "SPAM-ADDRESS" registered 29 times as spam and 0 as ham,
giving it a 0.999706 score, and now those spam emails which used to be
unsure aren't even being registered anymore since they're over my
thresh-update. Similarly, the token "SCAM-ADDRESS" -- for those deceptive
links not in a URIBL, but pointing other than to where they say they do --
was registered 4 times as spam, 0 as ham, and has a 0.997873 score, and now
contributes to regular spam classifications. Moreover, the site I link to
for looking up the address (rulesemporium.com) was registered in 60 spams
for a score of 0.999858, and further contributes to spam classifications.
The only unsures that still got through contained links not hidden in HTML
and not contained in a block list (yet). Overall, I'd call this an
overwhelming success so far. I'd like to hear the results of anyone else
who tries it.
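
The three token scores quoted above can be checked by hand. The sketch below assumes Robinson-style smoothing, f(w) = (s*x + n*p) / (s + n), with s = 0.0178 and x = 0.52. Those two values are an assumption (they look like bogofilter's robs and robx settings), chosen because they reproduce the reported numbers. Here n is the number of times the token was registered and p is 1.0, since the token was never seen in ham.

```python
def robinson_score(n, p, robs=0.0178, robx=0.52):
    """Smoothed token score: pulls p toward robx when n is small.

    robs and robx are assumed values, not read from any configuration.
    """
    return (robs * robx + n * p) / (robs + n)


# Tokens seen only in spam, so p = 1.0 in every case.
for token, count in [("SPAM-ADDRESS", 29),
                     ("SCAM-ADDRESS", 4),
                     ("rulesemporium.com", 60)]:
    print(f"{token}: {robinson_score(count, 1.0):.6f}")
# SPAM-ADDRESS: 0.999706
# SCAM-ADDRESS: 0.997873
# rulesemporium.com: 0.999858
```

The smoothing is why SCAM-ADDRESS, with only 4 registrations, scores slightly lower than the other two even though none of them has ever appeared in ham.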
Tom