Problems with Asian Spam

'Stefan Geißler' stefan.geissler at theimagingsource.com
Tue Nov 21 09:46:17 CET 2006


Hi,

I have Bogofilter successfully integrated into a procmail script.
European and Russian spam is detected well, but Asian spam, especially
Chinese spam, is not detected. Japanese spam is detected more often,
but still not reliably. Training with Asian spam results in "unsure",
and spam mails similar to the ones used for training are still not
detected as spam. The problem is that I may receive Asian ham mails,
so I cannot simply delete them through procmail as suggested in the
FAQ. In addition, many of the Asian spam mails state that they use the
US-ASCII charset.
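
To illustrate the charset mismatch, here is a minimal Python sketch that
flags a mail declaring US-ASCII (or no charset at all) while its body
contains 8-bit bytes. It is only an illustration and is not part of the
Bogofilter or procmail setup; the function name mislabeled_ascii is
hypothetical, and it assumes the raw mail is available as bytes.

```python
from email import message_from_bytes


def mislabeled_ascii(raw: bytes) -> bool:
    """Return True if any part claims US-ASCII but contains bytes >= 0x80."""
    msg = message_from_bytes(raw)
    for part in msg.walk():
        # Multipart containers have no body of their own; inspect the leaves.
        if part.is_multipart():
            continue
        # A missing charset parameter is treated as US-ASCII here.
        charset = part.get_content_charset("us-ascii")
        # decode=True undoes base64/quoted-printable and returns bytes.
        body = part.get_payload(decode=True) or b""
        if charset == "us-ascii" and any(b >= 0x80 for b in body):
            return True
    return False


if __name__ == "__main__":
    sample = b"Content-Type: text/plain; charset=US-ASCII\r\n\r\n\xc4\xe3\xba\xc3\r\n"
    print(mislabeled_ascii(sample))
```

Such a check only identifies mislabeled mails. It does not decide whether
they are spam or ham, so it would not be a safe basis for deleting them.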

My Bogofilter setup uses the default configuration, i.e. unicode=no and
charset_default=iso-8859-1. I wonder what would happen if I changed to
unicode=yes and charset_default=yes. How would the existing wordlist
database cope with this?

Any suggestions and help are welcome.
Thank you

Stefan


