What to do with this kind of Spam?

Seth de l'Isle szoth at ubertechnique.com
Mon Jul 14 23:23:06 CEST 2003


I've been working with a small business that receives about 200 emails a day,
using bogofilter to filter incoming mail at the MTA.  I've seen a lot of
"good word poisoning", as you call it, and so far it seems that the spammers
are neither reducing the ability of the database to detect spam in general
nor sneaking their messages through.

Note that bogofilter (I'm using 0.13.7 to test) strips the HTML comments out
of the message anyway.

So "sp<!-- egad -->am" is treated like "spam".
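
To make the idea concrete, here is a rough Python sketch of comment stripping
before tokenizing.  This is not bogofilter's actual lexer; the names
strip_html_comments and tokenize, and the simple word pattern, are my own
assumptions for illustration only.

```python
import re

# Matches an HTML comment, including ones that span several lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_html_comments(text):
    """Remove HTML comments so that words split by them are rejoined."""
    return HTML_COMMENT.sub("", text)


def tokenize(text):
    """Return lowercase word tokens taken after comment stripping.

    Hypothetical tokenizer for illustration; a real filter's rules
    for what counts as a token are more involved.
    """
    return re.findall(r"[a-z0-9']+", strip_html_comments(text).lower())


if __name__ == "__main__":
    print(tokenize("Buy sp<!-- egad -->am now"))
```

Because the comment is removed before the text is split into tokens, the
filter sees the single token "spam" and not the two fragments "sp" and "am".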



On Mon, Jul 14, 2003 at 03:49:46PM -0500, John McCain wrote:
> We talked about this a while back.  The effect will be that your database will 
> develop a very large population of unique tokens.  Since a unique token will 
> return a score of .415 (I believe), it will not affect the score of a message 
> at all.  It will, however, fill your database up with trash information over 
> time, but no one has really experienced a bloat or performance impact related 
> to this.
> 
> My concern has recently become "good word poisoning", where they are including 
> hammy words at the end of the message to defeat bayesian analysis.  My 
> bogofilter performance has suffered a great deal lately due to these bayesian 
> evasion tactics, so I've added Spamassassin.  
> 
> I think we're nearing the end of the phase where Bayesian analysis alone can 
> win the spam battle.  The good thing is that Bayesian methods are forcing 
> spammers to use easily identifiable evasion techniques.  Now it's our turn to 
> adapt.
> 

-- 
pgp public key http://ubertechnique.com/seth/pgp_key.txt ID: 0x60F9B67A



