korean spam

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Thu Oct 10 18:07:13 CEST 2002


Boris 'pi' Piwinger wrote:

> Q: There is lots of spam which is in charsets which I cannot read or
> not even display. Should I let bogofilter work with it? What else can
> I do?
> 
> A: As it stands now these messages don't work properly with bogofilter
> due to a limitation with 8bit characters which are used heavily in
> those languages. A solution using Procmail would be to drop that spam
> before bogofilter is called. The following does the trick:
> 
> ## Silently drop all completely unreadable spam
> :0E
> * 1^0 ^\/Subject:.*=\?(.*big5|iso-2022-jp|ISO-2022-KR|euc-kr|gb2312|ks_c_5601-1987|windows-1251|windows-1256)\?
> * 1^0 ^Content-Type:.*charset="?(.*big5|iso-2022-jp|ISO-2022-KR|euc-kr|gb2312|ks_c_5601-1987|windows-1251|windows-1256)
> /dev/null
> 
> :0HB:
> * ? bogofilter -u
> spam-bogofilter

David pointed out a mistake to me, and I have to apologize.

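Drop the E from the first recipe. In my .procmailrc the E (run only if
the preceding recipe did not match) makes sure that good mail will not
be deleted (say, someone from Japan sends a mail in his charset, using
ASCII only), but it depends on the recipe that precedes it there.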

Secondly, the \/ in the first condition (it is missing from the second
one anyway) has no effect here, though it does no harm: all it does is
put the match into $MATCH.
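
With both changes applied, the first recipe would look roughly like
this. It is only a sketch: the charset list is copied unchanged from
the quoted text above, and the bogofilter recipe that follows it stays
as it was.

## Silently drop all completely unreadable spam
:0
* 1^0 ^Subject:.*=\?(.*big5|iso-2022-jp|ISO-2022-KR|euc-kr|gb2312|ks_c_5601-1987|windows-1251|windows-1256)\?
* 1^0 ^Content-Type:.*charset="?(.*big5|iso-2022-jp|ISO-2022-KR|euc-kr|gb2312|ks_c_5601-1987|windows-1251|windows-1256)
/dev/null

The two 1^0 conditions are scored, so the mail is dropped as soon as
either the Subject or the Content-Type header names one of the
charsets.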

pi


For summary digest subscription: bogofilter-digest-subscribe at aotto.com


