chinese-korean-non_latin spam

Tim Freeman tim at fungible.com
Sat Mar 8 01:32:33 CET 2003


>But unfortunately I get many spam mails from Asia,
>encoded in some eastern language.

spamoracle can mark attachments.  If I filter an email through
spamoracle, it adds an X-Attachments header that summarizes them.
Then I use these procmail rules to filter out emails with attachments
I'm sure I don't want:

   :0fw
   | /home/tim/spamoracle/bin/spamoracle mark
   # I omit a few recipes that implement whitelists.
   # I don't want Windows executables
   :0
   * ^X-Attachments:.*name=".*\.(pif|scr|exe|bat)"
   presumed_spam
   # I don't want music in email.
   :0
   * ^X-Attachments:.*type="audio/(x-wav|x-midi)
   presumed_spam
   # I don't want Korean, Chinese, or Japanese (or Cyrillic windows-1251)
   :0
   * ^(X-Coding-System:.*|Content-type:.*|X-Attachments:.*cset="|Subject:.*=\?)(ks_c|gb2312|iso-2|euc-|big5|windows-1251)
   presumed_spam
   # Use bogofilter to sort the rest.
   :0HB:
   * ? bogofilter -2 -o 0.44
   presumed_spam
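
As a rough illustration of what the charset recipe catches, here is a
minimal Python sketch.  It is an approximate translation of the pattern
into Python's re module, not procmail itself; the names CHARSET_RULE and
looks_unwanted and the sample header lines are made up for the example.

```python
import re

# Approximate translation of the charset recipe.  procmail conditions are
# case-insensitive by default, hence re.IGNORECASE; re.MULTILINE lets ^
# anchor at the start of every header line.
CHARSET_RULE = re.compile(
    r'^(X-Coding-System:.*|Content-type:.*|X-Attachments:.*cset="|Subject:.*=\?)'
    r'(ks_c|gb2312|iso-2|euc-|big5|windows-1251)',
    re.IGNORECASE | re.MULTILINE,
)


def looks_unwanted(headers):
    """Return True if any header line matches the charset rule."""
    return CHARSET_RULE.search(headers) is not None


# Hypothetical header lines, one per case.
print(looks_unwanted('Content-Type: text/plain; charset="gb2312"'))
print(looks_unwanted('Subject: =?ISO-2022-JP?B?GyRCJDMkcxsoQg==?='))
print(looks_unwanted('Subject: Re: filtering big5 mail'))
print(looks_unwanted('Content-Type: text/plain; charset="iso-8859-1"'))
```

The first two samples match and the last two do not.  A plain Subject
that merely mentions big5 is left alone because the Subject branch
requires the =? that starts an encoded word.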

If bogofilter marked attachments, I could rely on one filtering tool
rather than two.  I no longer use spamoracle's scores because
bogofilter seems to do slightly better.

I want real attachment parsing, rather than just telling procmail to
search the body of the message, because I might get an email about
viruses or spam that mentions "Content-type: big5" in some quoted
text, as this email does.  :-)
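
To make the distinction concrete, here is a minimal sketch in Python,
assuming the standard email package as a stand-in for spamoracle's
attachment parsing (which it is not).  The function name
unwanted_charsets and the sample message are hypothetical.  Only
charsets declared in real MIME part headers count, so a quoted
"Content-type: big5" line in the body is ignored.

```python
import email
import re

# Same charset prefixes as the procmail recipe, anchored at the start of
# the declared charset name.
UNWANTED = re.compile(r"^(ks_c|gb2312|iso-2|euc-|big5|windows-1251)", re.IGNORECASE)


def unwanted_charsets(raw):
    """Return the unwanted charsets declared by the MIME parts of a message."""
    msg = email.message_from_string(raw)
    found = []
    for part in msg.walk():
        # get_content_charset() reads the charset parameter of the part's
        # Content-Type header (lowercased), or returns None if absent.
        charset = part.get_content_charset()
        if charset and UNWANTED.match(charset):
            found.append(charset)
    return found


# Hypothetical message that only quotes the header text in its body.
QUOTED = (
    "From: a@example.com\n"
    "Subject: spam question\n"
    'Content-Type: text/plain; charset="us-ascii"\n'
    "\n"
    "> Content-type: big5\n"
    "Why does this header show up?\n"
)

print(unwanted_charsets(QUOTED))
```

A body grep would flag this message; the parsed view returns an empty
list because the only declared charset is us-ascii.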

-- 
Tim Freeman                                                  tim at fungible.com
Which is worse: ignorance or apathy? Who knows? Who cares?
GPG public key fingerprint ECDF 46F8 3B80 BB9E 575D  7180 76DF FE00 34B1 5C78 




More information about the Bogofilter mailing list