Russian charsets and functions

Evgeny Kotsuba evgen at shatura.laser.ru
Tue Jan 4 18:11:41 CET 2005


Clint Adams wrote:

>>Evgeny Kotsuba <evgen at shatura.laser.ru> writes:
>>
>>    
>>
>>>One problem is that the charset may be set improperly, by the mail client
>>>and/or the spammer; the second problem is a doubling of the database. Really,
>>>English/American users don't need Russian or Asian spam or mail, Russians
>>>don't need Asian spam/mail, and all English letters are placed at 0-127
>>>and Russian ones at 128-255. All the truly multi-language documents I have
>>>seen were sent as .doc or .pdf and so on.
>>>      
>>>
>>Some mails earlier you documented how the same Cyrillic characters were
>>encoded differently in the different character sets, so I presume some
>>spammer actually exploiting this (we saw a time when spammers massively
>>used ISO-8859-* accented Latin characters) will have to specify the
>>proper character set unless he wants to produce garbage.
>>    
>>
>
>Plus, if someone sends Evgeny a message like this one, bogofilter
>probably will not behave in the manner he wants.
>
>шапку с дурака не снимают
>  
>
;-))
Well, I know about UTF-8 and am ready to handle it. Moreover, I am ready
to autodetect the codepage of Russian text ;-) All I need for that is a
piece of pure text. But it seems this is still not a very pressing issue.
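
A minimal sketch of what such autodetection could look like, written in
Python for illustration and not taken from bogofilter. The candidate list,
the scoring rule, and the names score and guess_charset are all assumptions:
the idea is only that the right 8-bit codepage turns the high bytes of
ordinary Russian text into lowercase Cyrillic letters, while a wrong one
yields uppercase letters, box-drawing characters, or other symbols.

```python
# Hypothetical candidate list: the common 8-bit Cyrillic codepages.
CANDIDATES = ("koi8_r", "cp1251", "cp866", "iso8859_5")


def score(text):
    """+1 per lowercase Russian letter, -1 per any other non-ASCII character."""
    total = 0
    for ch in text:
        if "а" <= ch <= "я":
            total += 1
        elif ord(ch) > 127:
            total -= 1
    return total


def guess_charset(raw):
    """Guess the charset of a piece of pure Russian text given as bytes."""
    # Strictly valid UTF-8 is rarely produced by accident, so try it first.
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    best, best_score = None, None
    for name in CANDIDATES:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            continue  # this codepage has no mapping for some byte
        s = score(text)
        if best_score is None or s > best_score:
            best, best_score = name, s
    return best


if __name__ == "__main__":
    phrase = "шапку с дурака не снимают"
    for enc in CANDIDATES + ("utf-8",):
        print(enc, "->", guess_charset(phrase.encode(enc)))
```

A heuristic this crude needs a reasonable amount of mostly lowercase text.
An all-caps subject line, for example, would mislead it, because KOI8-R and
CP1251 keep uppercase and lowercase letters in swapped byte ranges.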

>人文
>
Hmm... In the inbox Mozilla shows two hieroglyphs; in the mail compose
window it turns them into two question marks.
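
One plausible explanation, assuming the compose window was set to an 8-bit
Cyrillic charset such as KOI8-R (the message does not say which): that
charset has no code for these characters, so each is replaced with "?".
The helper name to_charset below is made up for this sketch.

```python
def to_charset(text, charset):
    """Encode text, substituting '?' for characters the charset lacks."""
    return text.encode(charset, errors="replace")


if __name__ == "__main__":
    print(to_charset("人文", "koi8_r"))  # b'??'
    print(to_charset("人文", "utf-8"))   # b'\xe4\xba\xba\xe6\x96\x87'
```

UTF-8 can carry both characters, three bytes each, which is why the
received message still displays them.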

SY,
EK



