Levenshtein distance as a useful pattern matching algorithm to decipher scrabble spam

Edvard Majakari edvard.majakari at staselog.com
Fri Feb 25 09:31:10 CET 2005


"Lee Dowthwaite" <lee at dowthwaite.net> writes:

[...]
> for spam: indeed, almost all such juxtapositions were the result of spam.
> Another thing it was very good at spotting - again, with minimal DB usage -
> was foreign content. On these grounds it may well have caught the "Jmaes"
> example also.

What about code? Wouldn't procmail recipes, Perl code, sendmail
configuration files, etc. in e-mail look like spam then?
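
To make the concern concrete, here is a minimal sketch of the Levenshtein
distance named in the subject line. It is not bogofilter code, and the
function name and the sample words are only illustrative assumptions. It
shows that a scrambled word such as "Jmaes" sits close to "James", but so
can a short code token sit close to an ordinary word.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


# Hypothetical comparisons: a scrambled name versus a code token.
print(levenshtein("Jmaes", "James"))
print(levenshtein("chr", "car"))
```

A plain Levenshtein distance treats a swapped pair of letters as two
substitutions, so a filter built on it would need some threshold, and any
such threshold would presumably also match identifiers in pasted code.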

-- 
# Edvard Majakari		Software Engineer
# PGP PUBLIC KEY available    	Soli Deo Gloria!

$_ = '456476617264204d616a616b6172692c20612043687269737469616e20'; print
join('',map{chr hex}(split/(\w{2})/)),uc substr(crypt(60281449,'es'),2,4),"\n";
_______________________________________________
Bogofilter mailing list
Bogofilter at bogofilter.org
http://www.bogofilter.org/mailman/listinfo/bogofilter


