image-only spam -- ideas, what to do?

Tue Dec 12 19:06:55 CET 2006

Bill McClain wrote:
> Bogofilter relies more on header information in these cases. I found it
> useful to set "block_on_subnets=yes", which adds IP address information to
> the database (and expands the database token count by about 20%). IP address
> ranges can be very good discriminators.
> 
> These past few months I've been getting bursts of spam into my "unsure"
> category. Training handles them; just keep at it. Of my last 64,000 spams,
> only 0.12% were uncaught (although I count my large "unsure" zone as
> "caught").

I'm of the same opinion.  Although I've consistently had a few more 
false negatives and unsures per day over the past few months, persistent 
training continues to keep them under control.  And since my overall spam 
volume has also increased lately, my accuracy is still above 99%.  
Training on 3-5 emails per day (out of the hundreds directed at my 
address) is not a big chore, so I remain unconcerned at present.

I agree that the headers are where it's at.  Using "block_on_subnets" is 
vital.  I also developed my "spamitarium" script for validating header 
info and adding ASNs, which also helps control the inline image spam:
http://www.orderamidchaos.com/bogofilter/spamitarium
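
To illustrate why subnet information discriminates well, here is a minimal
Python sketch of the idea behind "block_on_subnets": one address is expanded
into a token for the full address plus one for each shorter prefix, so a new
spam source in an already-seen range still matches something in the
database.  The "ip:" prefix and the function name are assumptions made for
this example, not bogofilter's documented internals.

```python
def subnet_tokens(ip, prefix="ip:"):
    """Return a token for the full IPv4 address and one per shorter prefix."""
    octets = ip.split(".")
    valid = len(octets) == 4 and all(
        o.isdigit() and 0 <= int(o) <= 255 for o in octets
    )
    if not valid:
        raise ValueError(f"not a dotted-quad IPv4 address: {ip!r}")
    # Longest prefix first: a.b.c.d, a.b.c, a.b, a
    return [prefix + ".".join(octets[:n]) for n in range(4, 0, -1)]


print(subnet_tokens("192.0.2.17"))
# ['ip:192.0.2.17', 'ip:192.0.2', 'ip:192.0', 'ip:192']
```

Each message then contributes several address tokens instead of one, which
fits the larger token count mentioned in the quoted text.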

Scanning the body for URIBL-listed links also helps with the 
penis-enlargement and other spams that contain URLs, but not so much with 
the pump-and-dump scams:
http://www.orderamidchaos.com/bogofilter/stripsearch
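
The general shape of a body URI check can be sketched in Python as below.
This is not how stripsearch itself works; it only shows the usual DNS
blocklist pattern of extracting hostnames from the body and looking each one
up under a list zone.  The zone "multi.uribl.com", the regular expression,
and all function names are assumptions for the example.

```python
import re
import socket
from urllib.parse import urlsplit

# Deliberately simple URL pattern; real mail needs MIME and HTML decoding first.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def body_hosts(body):
    """Return the distinct lowercase hostnames of http(s) URLs, in order."""
    hosts = []
    for url in URL_RE.findall(body):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, skip it
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def uribl_query_name(host, zone="multi.uribl.com"):
    """Build the DNS name to look up for a host under the list zone."""
    return f"{host.rstrip('.')}.{zone}"


def is_listed(host, zone="multi.uribl.com", resolve=socket.gethostbyname):
    """True if the query name resolves; a failed lookup counts as not listed."""
    try:
        resolve(uribl_query_name(host, zone))
    except socket.gaierror:
        return False
    return True
```

A production check would also reduce each hostname to its registered domain
and inspect the returned address rather than treating any answer as a hit.
Image-only stock spam usually carries no URL at all, which is why this
approach does little against it.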

And recursive training always helps:
http://www.orderamidchaos.com/bogofilter/bfproxy
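
One possible reading of "recursive training" is train-until-correct: keep
registering a misclassified message until the filter classifies it as
intended.  Whether bfproxy does exactly this is an assumption; the Python
sketch below only shows the loop, with hypothetical "classify" and "train"
callables standing in for calls to the filter and a toy counter in place of
a real database.

```python
def train_until_correct(message, want, classify, train, max_rounds=5):
    """Train `message` as `want` until classify() agrees.

    Returns the number of training rounds used, or None if the
    classification still disagrees after max_rounds rounds.
    """
    rounds = 0
    while classify(message) != want:
        if rounds == max_rounds:
            return None
        train(message, want)
        rounds += 1
    return rounds


# Toy stand-in for a filter: a message counts as spam once it has
# been trained as spam at least twice.
spam_votes = {}

def toy_classify(msg):
    return "spam" if spam_votes.get(msg, 0) >= 2 else "unsure"

def toy_train(msg, label):
    if label == "spam":
        spam_votes[msg] = spam_votes.get(msg, 0) + 1

print(train_until_correct("stock tip image", "spam", toy_classify, toy_train))
# 2
```

The cap on rounds matters: training the same message many times can
overweight its tokens, so the loop gives up rather than forcing the result.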

Tom