further advice for Asian spam and SpamAssassin text

David Relson relson at osagesoftware.com
Wed Sep 24 17:31:43 CEST 2003


Pete,

I used the procmail Asian charset recipes for quite a while and found
them effective.  Then someone asked, "Why not let bogofilter do it?"

With "replace_non_ascii=yes" in bogofilter.cf, Asian-language text
becomes tokens like "?'?d", "?'?u?g?8", "?'?z?", "?'?~?????", etc.
Bogofilter handles these without trouble, and I haven't seen any Asian
spam in my inbox in a long time.

This option works by converting non-alphabetic characters above 0x80 to
'?'.  It will also affect some accented characters, such as those used
in French, German, Spanish, etc.
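
To make the effect concrete, here is a minimal Python sketch of that kind
of replacement.  It is not bogofilter's code: the function name
replace_high_bytes and the sample byte strings are made up for
illustration, and for simplicity it replaces every byte above 0x80,
ignoring the "non-alphabetic" qualifier.

```python
def replace_high_bytes(raw: bytes, threshold: int = 0x80) -> str:
    """Return the message bytes as text, showing each byte above
    `threshold` as '?'.  Hypothetical helper, not bogofilter's API."""
    return "".join("?" if b > threshold else chr(b) for b in raw)


# An accented Latin-1 word: only the accented letter is replaced.
print(replace_high_bytes(b"caf\xe9"))       # prints: caf?

# A made-up double-byte sequence: the high lead bytes become '?',
# while the ASCII-range trail bytes survive as ordinary characters.
print(replace_high_bytes(b"\x83X\x83p"))    # prints: ?X?p
```

The second result has the same shape as the tokens quoted above: a mix
of '?' and leftover ASCII characters that can still be counted and
scored like any other token.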

The option is available and you may find it useful.

Also, I wouldn't worry too much about SpamAssassin's message mark-up. 
Bogofilter recognizes whether tokens are important or not.  FWIW, "spam"
is probably already in your wordlist, especially if you train on this
mailing list which talks about spam a whole lot.

Hope this helps!

David



