... convert_unicode.c ...

Matthias Andree matthias.andree at gmx.de
Thu Jun 23 14:56:35 CEST 2005


On Thu, 23 Jun 2005, David Relson wrote:

>    --default-from-charset=US-ASCII
>    --default-to-charset=UTF-8

--default-to-charset is bogus.

Either we run iconv, in which case the output charset is always UTF-8
and, for consistency, not user-configurable; or we don't, in which case
a configuration option makes no sense because we're storing raw data.
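
To illustrate the first case, here is a rough sketch using the standard
iconv(3) interface, assuming it is available on the platform. The helper
name to_utf8() and its signature are made up for this mail and are not
taken from convert_unicode.c. The only thing that varies is the source
charset; the target is hard-wired to UTF-8, and a NULL source charset
falls back to US-ASCII.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper, NOT bogofilter's actual code.
 * Converts inlen bytes at `in` from `from_charset` to UTF-8 and stores a
 * NUL-terminated result in `out` (capacity outsize bytes).
 * A NULL from_charset means "use the default", assumed here to be US-ASCII.
 * Returns the number of bytes written (excluding the NUL), or (size_t)-1
 * if the charset is unknown, the input is invalid, or `out` is too small. */
size_t to_utf8(const char *from_charset, const char *in, size_t inlen,
               char *out, size_t outsize)
{
    assert(in != NULL && out != NULL);

    if (outsize == 0)
        return (size_t)-1;
    if (from_charset == NULL)
        from_charset = "US-ASCII";

    /* The target charset is fixed: there is no "to" option to configure. */
    iconv_t cd = iconv_open("UTF-8", from_charset);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *inp = (char *)in;       /* iconv() wants a non-const char ** */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outsize - 1; /* keep one byte for the trailing NUL */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)
        /* Flush any pending shift state of a stateful source encoding. */
        rc = iconv(cd, NULL, NULL, &outp, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;

    *outp = '\0';
    return (size_t)(outp - out);
}
```

A user whose mail is mostly KOI8-R would pass "KOI8-R" as from_charset
(provided the local iconv knows that name); everybody else gets the
US-ASCII default, where any byte above 0x7F makes the conversion fail
rather than being guessed at.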

I found out today that "Big5" is usually used for Traditional Chinese
(Taiwan, perhaps south Fujian, I don't know for sure) and GB2312 is used
mainly for Simplified Chinese (mainland China). UTF-8 can represent
both.

> to allow full flexibility.  I suggest US-ASCII because it's consistent
> with the RFC's.  Realistically, you're probably right.  With most of
> the world's computers running Windoze, windoze-1252 is reasonable.
> However it bothers me to have a windoze charset be _our_ default.

Then let's use US-ASCII; I don't mind.

> Having both options also allows personal preferences.  I know some of
> our cyrillic users have CP866 and KOI8-R as their default charsets.
> Perhaps those would be useful as their default-to-charset.  Maybe so,

Those would be their default-from-charset, not their default-to-charset.

-- 
Matthias Andree


