Russian charsets and functions
Evgeny Kotsuba
evgen at shatura.laser.ru
Tue Jan 4 18:11:41 CET 2005
Clint Adams wrote:
>>Evgeny Kotsuba <evgen at shatura.laser.ru> writes:
>>
>>
>>
>>>One problem is that the charset may be set improperly, by the mail client
>>>and/or the spammer; the second problem is that the database will double.
>>>Really, English speakers don't need Russian or Asian spam or mail, Russians
>>>don't need Asian spam/mail, and all English letters are placed in 0-127
>>>and Russian ones in 128-255. All the truly multi-language documents I have
>>>seen were sent as .doc or .pdf and so on.
>>>
>>>
>>Some mails earlier you documented how the same Cyrillic characters were
>>encoded differently in the different character sets, so I presume some
>>spammer actually exploiting this (we saw a time when spammers massively
>>used ISO-8859-* accented Latin characters) will have to specify the
>>proper character set unless he wants to produce garbage.
>>
>>
>
>Plus, if someone sends Evgeny a message like this one, bogofilter
>probably will not behave in the manner he wants.
>
>шапку с дурака не снимают
>
>
;-))
Well, I know about UTF-8 and am ready to handle it. Moreover, I am ready
to autodetect the codepage of Russian text ;-) All I need for that is a
piece of pure text. But it seems this is still not very pressing.
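A minimal sketch of what such autodetection could look like, in Python. It assumes the text is in one of four 8-bit Cyrillic codepages (KOI8-R, CP1251, CP866, ISO-8859-5), decodes the bytes with each one, and picks the decoding that contains the most frequent lowercase Russian letters. The names CANDIDATES, COMMON, score and guess_codepage, and the choice of letters, are illustrative assumptions and not anything taken from bogofilter; a real detector would need longer samples and better statistics.

```python
# Hypothetical sketch: guess the 8-bit Cyrillic codepage of a byte string.
# Candidate list and letter set are assumptions made for illustration.
CANDIDATES = ("koi8_r", "cp1251", "cp866", "iso8859_5")

# Lowercase Russian letters assumed to be the most frequent in ordinary text.
COMMON = set("оеаинтсрвлкмдпу")


def score(text):
    """Count how many characters of the decoded text are common letters."""
    return sum(1 for ch in text if ch in COMMON)


def guess_codepage(raw):
    """Return the candidate whose decoding of raw looks most like Russian."""
    return max(
        CANDIDATES,
        key=lambda name: score(raw.decode(name, errors="replace")),
    )


sample = "шапку с дурака не снимают"
for name in CANDIDATES:
    # Encode the sample in each codepage and check that it is recognised.
    print(name, "->", guess_codepage(sample.encode(name)))
```

The heuristic works because a wrong codepage mostly turns lowercase text into uppercase letters, box-drawing characters or rare letters, so the correct decoding scores clearly higher even on a short phrase.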
>人文
>
Hmm... In the inbox Mozilla shows two hieroglyphs; in the compose window
it turns them into two question marks.
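One possible explanation, and it is only an assumption about what the mail client does here: the reply is converted to an 8-bit Cyrillic charset that has no code for these characters, so each one is replaced by "?". The Python sketch below shows that effect; the helper name to_charset is made up for illustration.

```python
# Hypothetical sketch: characters missing from the target charset become "?".
def to_charset(text, charset):
    """Encode text, substituting '?' for anything the charset cannot hold."""
    return text.encode(charset, errors="replace")


han = "人文"
for charset in ("koi8_r", "cp1251"):
    # Neither Cyrillic codepage contains these two characters.
    print(charset, to_charset(han, charset))

# UTF-8 can represent them, so the text survives a round trip.
print(to_charset(han, "utf-8").decode("utf-8") == han)
```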
SY,
EK