Spam in images

David Relson relson at osagesoftware.com
Thu Aug 10 02:34:00 CEST 2006


On Wed, 09 Aug 2006 20:27:49 -0400
Tom Anderson wrote:

> .rp wrote:
> >>These spams don't have a single URL in them, so I suppose
> >>stripsearch wouldn't help.
> >>
> >>Or can stripsearch read the URLs in the GIF?
> >>
> > 
> > 
> > hmm, I wonder if it would be worth hooking in an OCR program to
> > read the image and what the min hardware requirements would be to
> > scan them without bringing a system to a crawl.
> 
> Feel free to test, but my feeling is that OCR would be infeasible at
> even a moderate rate of spam.  And you could easily DoS yourself given
> just a slight uptick over the expected volume.  You would have to
> build in limits so that OCR is skipped on new emails when a few other
> OCR processes are already running.  You could also perform OCR only on
> emails in the unsure range, allowing the text alone to damn or bless
> the fairly certain ones.  But in the end, spammers would simply add
> characteristics to baffle the OCR program, like CAPTCHAs do.  At least
> they couldn't do those phishing emails where they make it look like a
> regular text email though.
> 
> Tom

I'd suggest that OCR be used only when the message scores "unsure".
With procmail or maildrop it'd be easy enough to test for an unsure
result, then run a script to check whether there's an image attachment,
and only then run OCR followed by bogofilter.
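
A rough sketch of the gating logic in Python, in case it helps. It
assumes bogofilter has already run in passthrough mode and added an
X-Bogosity header whose value begins with "Unsure"; adjust the check
if your header looks different. The function names are made up for
illustration, and `ocr` is a placeholder for whatever OCR program you
wrap; no particular OCR tool is implied.

```python
import email
from email import policy


def is_unsure(msg):
    """True if the X-Bogosity header reports an 'Unsure' classification.

    Assumption: the header value starts with the classification word.
    """
    header = msg.get("X-Bogosity", "")
    return header.strip().lower().startswith("unsure")


def image_parts(msg):
    """Return the decoded bytes of every image/* MIME part."""
    return [
        part.get_payload(decode=True)
        for part in msg.walk()
        if part.get_content_maintype() == "image"
    ]


def text_for_rescoring(raw, ocr):
    """Return OCR text to hand back to bogofilter, or None to skip OCR.

    raw is the full message as bytes; ocr is a hypothetical callable
    that takes image bytes and returns the recognized text.
    """
    msg = email.message_from_bytes(raw, policy=policy.default)
    if not is_unsure(msg):
        return None          # the text alone already decided this one
    images = image_parts(msg)
    if not images:
        return None          # nothing to OCR
    return "\n".join(ocr(data) for data in images)
```

A None result means the message is delivered on its first score. Any
other result is text you could pipe into a second bogofilter run, so
the OCR cost is paid only for unsure messages that carry an image.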


