Problems with Asian Spam

Tom Anderson tanderso at oac-design.com
Wed Nov 22 02:22:14 CET 2006


I run my mail through "spamitarium" before bogofilter. It tags the
headers with the ASN (autonomous system number) of the sending IP, which
gives bogofilter an extra token for telling spammy networks and regions
of the planet from hammy ones.  That token may help your filtering.

http://www.orderamidchaos.com/bogofilter/spamitarium
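
To give a rough idea of what ASN tagging means, here is a minimal Python
sketch. It is not spamitarium's code: the "X-ASN" header name, the
lookup table, and both function names are made up for illustration, the
prefixes are documentation ranges, and the ASNs are private-use numbers.
A real setup would take its prefix-to-ASN data from a routing or whois
source.

```python
import ipaddress
from email import message_from_string

# Hypothetical prefix-to-ASN table (documentation prefixes, private-use ASNs).
PREFIX_TO_ASN = {
    ipaddress.ip_network("192.0.2.0/24"): 64512,
    ipaddress.ip_network("198.51.100.0/24"): 64513,
}


def lookup_asn(ip):
    """Return the ASN of the longest matching prefix, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in PREFIX_TO_ASN if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return PREFIX_TO_ASN[best]


def tag_with_asn(raw_message, sending_ip):
    """Add a hypothetical X-ASN header so the filter sees the ASN as a token."""
    msg = message_from_string(raw_message)
    asn = lookup_asn(sending_ip)
    if asn is not None:
        msg["X-ASN"] = "AS%d" % asn
    return msg.as_string()
```

Once a header like this is present on every message, the filter can
learn from training that some ASN tokens lean spammy and others hammy,
like any other token.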

Tom



David Relson wrote:
> On Tue, 21 Nov 2006 19:33:05 -0500
> dhottinger at harrisonburg.k12.va.us wrote:
> 
> 
>>I keep following the list to see if anyone else has been having
>>issues with Asian spam.  I'm pulling some of it out with procmail,
>>but it keeps flooding in.  If I understand this thread
>>correctly, if I have my encoding set to utf-8 (which I do) I should
>>be catching it.  Is this correct?
>>
>>thanks,
>>ddh
> 
> 
> Unicode (that is, UTF-8 encoding) is the best setting for recognizing
> it.  If you want to see how bogofilter is parsing the message and the
> scores for the individual tokens, you can do so using bogofilter's "-v"
> flag.  See the FAQ for the discussion of "-vv" and "-vvv".
> 
> As with any new foreign language, it takes training before bogofilter
> starts recognizing tokens as spammish.  Also, since bogofilter's
> parsing is based on alphabetic languages (think "English" and
> "European" and "abc...z" and "01...9", etc.), the parsing may produce
> gibberish when applied to Asian languages.
> 
> On the other hand, most Asian-language messages arriving on my mail
> server are classified as spam.  Another group is classified as "unsure"
> because they come through a mailing list to which I'm subscribed (and
> the mailing list headers provide a lot of hammish tokens).
> 
> An alternative approach is to use a procmail (or maildrop) rule that
> says "if Asian charset, redirect to /dev/null".
> 
> Regards,
> 
> David
> _______________________________________________
> Bogofilter mailing list
> Bogofilter at bogofilter.org
> http://www.bogofilter.org/mailman/listinfo/bogofilter
> 
> 
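
David's point about parsing built around alphabetic languages can be
illustrated with a toy tokenizer. The sketch below is an assumption for
illustration only, not bogofilter's actual lexer: it splits on
whitespace and ASCII punctuation, which works for English but leaves an
unspaced Chinese phrase as one long token.

```python
import re

# Simplified stand-in for a tokenizer tuned to alphabetic text: a token is
# a run of characters that contains no whitespace or ASCII punctuation.
TOKEN_RE = re.compile(r"[^\s.,;:!?\"()]+")


def tokenize(text):
    """Split text into tokens on whitespace and ASCII punctuation."""
    return TOKEN_RE.findall(text)


english = "Buy cheap watches now!"
chinese = "立即购买便宜手表"  # roughly the same phrase, written without spaces

print(tokenize(english))  # four separate word tokens
print(tokenize(chinese))  # the whole phrase comes back as a single token
```

A token that is really a whole phrase rarely shows up twice, so under
this assumption the filter needs more training before it collects
useful counts for such messages.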

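The "if Asian charset" test that David describes for procmail could be
prototyped in Python along these lines. The charset list and the
function name are assumptions; adjust the list to the mail you actually
receive. Note that a check like this does not catch Asian-language
messages sent as UTF-8, since their declared charset is not in the list.

```python
import re
from email import message_from_string

# Assumed set of charsets to treat as "Asian"; extend or trim as needed.
ASIAN_CHARSETS = {
    "big5", "gb2312", "gbk", "gb18030",
    "euc-kr", "iso-2022-kr", "ks_c_5601-1987",
    "euc-jp", "iso-2022-jp", "shift_jis",
}

# Matches the charset part of an RFC 2047 encoded word, e.g. =?GB2312?B?...?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[bBqQ]\?")


def has_asian_charset(raw_message):
    """Return True if the Subject or any MIME part declares a listed charset."""
    msg = message_from_string(raw_message)

    # Check encoded words in the Subject header.
    subject = str(msg.get("Subject", ""))
    for match in ENCODED_WORD.finditer(subject):
        if match.group(1).lower() in ASIAN_CHARSETS:
            return True

    # Check the charset parameter of every MIME part.
    for part in msg.walk():
        charset = part.get_content_charset()
        if charset and charset.lower() in ASIAN_CHARSETS:
            return True
    return False
```

A delivery script could call has_asian_charset() on each incoming
message and discard or quarantine the ones that match, which has the
same effect as the procmail rule.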