chinese-korean-non_latin spam
Tim Freeman
tim at fungible.com
Sat Mar 8 01:32:33 CET 2003
>But unfortunately I get many spam mails from Asia,
>encoded in some eastern language.
spamoracle can mark attachments. When I filter an email through
spamoracle, it adds a header that summarizes the attachments. I then
use these procmail rules to filter out emails with attachments that
I'm sure I don't want:
:0fw
| /home/tim/spamoracle/bin/spamoracle mark
# I omit a few recipes that implement whitelists.
# I don't want Windows executables
:0
* ^X-Attachments:.*name=".*\.(pif|scr|exe|bat)"
presumed_spam
# I don't want music in email.
:0
* ^X-Attachments:.*type="audio/(x-wav|x-midi)
presumed_spam
# I don't want Korean, Chinese, Japanese, or Cyrillic (windows-1251)
:0
* ^(X-Coding-System:.*|Content-type:.*|X-Attachments:.*cset="|Subject:.*=\?)(ks_c|gb2312|iso-2|euc-|big5|windows-1251)
presumed_spam
# Use bogofilter to sort the rest.
:0HB:
* ? bogofilter -2 -o 0.44
presumed_spam
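A rough way to see what the three header recipes above would catch is
to translate their conditions into Python regular expressions. This is
only a sketch: procmail's matching is egrep-like and case-insensitive
by default, which the flags below approximate, and the sample
X-Attachments values are hypothetical illustrations of the fields the
recipes look for (name=, type=, cset=), not output copied from
spamoracle.

```python
import re

# procmail conditions are case-insensitive by default; MULTILINE lets ^
# anchor at the start of each header line.
FLAGS = re.IGNORECASE | re.MULTILINE

EXECUTABLE = re.compile(r'^X-Attachments:.*name=".*\.(pif|scr|exe|bat)"', FLAGS)
AUDIO = re.compile(r'^X-Attachments:.*type="audio/(x-wav|x-midi)', FLAGS)
CHARSET = re.compile(
    r'^(X-Coding-System:.*|Content-type:.*|X-Attachments:.*cset="|Subject:.*=\?)'
    r'(ks_c|gb2312|iso-2|euc-|big5|windows-1251)',
    FLAGS,
)


def presumed_spam(header: str) -> bool:
    """Return True if any of the three header conditions matches."""
    return any(p.search(header) for p in (EXECUTABLE, AUDIO, CHARSET))


# Hypothetical header lines, made up for illustration.
samples = [
    'X-Attachments: type="application/octet-stream" name="readme.pif"',
    'X-Attachments: type="audio/x-wav" name="song.wav"',
    'Subject: =?big5?B?pl7C0A==?=',
    'Content-Type: text/plain; charset="gb2312"',
    'Content-Type: text/plain; charset="us-ascii"',
    'Subject: lunch on Friday?',
]

for line in samples:
    print(presumed_spam(line), line)
```

Note that "iso-2" is a prefix match aimed at the iso-2022-* family; it
does not match iso-8859-2.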
If bogofilter marked attachments, I could rely on one filtering tool
rather than two. I no longer use spamoracle's scores because
bogofilter seems to do slightly better.
I want real attachment parsing rather than just telling procmail to
search the body of the message, because I might get an email about
viruses or spam that mentions "Content-type: big5" in some quoted
text, as this email does. :-)
--
Tim Freeman tim at fungible.com
Which is worse: ignorance or apathy? Who knows? Who cares?
GPG public key fingerprint ECDF 46F8 3B80 BB9E 575D 7180 76DF FE00 34B1 5C78