Problems with default charset and map_xlate_characters

David Relson relson at osagesoftware.com
Thu Sep 25 14:02:29 CEST 2003


On Thu, 25 Sep 2003 15:10:16 +0400
Evgeny Kotsuba <evgen at shatura.laser.ru> wrote:

> Hi,
> 
> It seems that with the default charset, the wrong thing is done for any
> mail charset other than those well known to bogofilter, even if
> allow_nonascii_replacement = 0. The problem is with map_xlate_characters,
> which has nothing in common with ASCII. Say I have a letter in the Russian
> KOI8-R encoding, which should be the standard for Russian and is used on
> Unix and in "right" mailers. There may also be a number of encodings for
> other ex-USSR republics, such as Ukrainian, and we now have encodings for
> Russia's national republics (something like states in the US or provinces
> in Canada).
> 
> My next comment is about map_nonascii_characters: this is a very bad
> function for any statistics, etc. I have written a Russian codepage
> decoder for mail with wrong double and triple recodings, and I named
> such an encoding "Debillnaia" (de-billy's), because if you have a
> message like ???? ??? ?? ????, any decoding will fail.
> So if you have a lot of spam messages in a foreign encoding that map
> to ???? ????? etc., and then receive a short letter with some foreign
> words (say a signature, a user's name, etc.), what will happen?
> 
> SY,
> EK

Evgeny,

Charset support in bogofilter is known to be incomplete.  It was written
by an English-centric coder, namely me.  It works well for people whose
ham is all ISO-8859-1.  Similarly, the replace_nonascii_characters option
was included as a way to deal with Asian spam.  Again, it works fine in
an English-centric environment.
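
To make the concern about replacement concrete, here is a minimal Python
sketch.  It is not bogofilter's code: replace_nonascii is a hypothetical
helper, the sample phrases are made up, and it simply assumes that every byte
with the high bit set becomes '?'.  Under that assumption, two unrelated
KOI8-R phrases produce identical tokens.

```python
def replace_nonascii(raw: bytes, replacement: str = "?") -> bytes:
    """Hypothetical helper: swap every byte with the high bit set for `replacement`."""
    sub = ord(replacement)
    return bytes(b if b < 0x80 else sub for b in raw)


# Two unrelated Russian phrases encoded as KOI8-R (illustrative sample text).
spam = "купить дешево".encode("koi8_r")
ham = "привет Сергей".encode("koi8_r")

spam_tokens = replace_nonascii(spam).split()
ham_tokens = replace_nonascii(ham).split()

# The raw bytes differ, but after replacement the token lists are the same,
# so a statistical classifier has nothing left to distinguish them by.
print(spam_tokens)
print(ham_tokens)
print(spam_tokens == ham_tokens)
```

In an English-only mailbox this collapse is harmless, since nearly all such
tokens come from spam.  For a user whose ham is in KOI8-R, it erases the very
words the statistics would need.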

Recognizing that there are other needs, a basic framework was created to
support other character sets.  Support for different charsets and
languages is a task (set of tasks) waiting for an interested person (or
persons) to fill in the details.
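
As a rough idea of what charset-aware handling could look like, the Python
sketch below decodes a message body using its declared charset before
tokenizing.  It is only an illustration under assumptions: decode_body and
tokenize are hypothetical names, the fallback policy is one choice among
several, and none of this reflects how bogofilter's framework is actually
structured.

```python
def decode_body(raw: bytes, charset: str) -> str:
    """Hypothetical helper: decode `raw` using the declared charset.

    Falls back to ASCII with U+FFFD substitution when the charset is
    unknown to Python or the bytes are invalid for it.
    """
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        return raw.decode("ascii", errors="replace")


def tokenize(raw: bytes, charset: str) -> list[str]:
    """Hypothetical helper: lowercase the decoded text and split on whitespace."""
    return decode_body(raw, charset).lower().split()


# Illustrative KOI8-R input; a real message would declare the charset
# in its Content-Type header.
body = "привет Сергей".encode("koi8_r")
print(tokenize(body, "koi8-r"))
```

Decoding first keeps distinct words distinct, at the cost of needing a
decoder for each charset a user's mail actually arrives in.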

If you'd care to take on the task of supporting Russian KOI8-R (or other
languages), we'd be glad to include it in bogofilter.

Peace,

David




More information about the bogofilter-dev mailing list