... convert_unicode.c ...
David Relson
relson at osagesoftware.com
Thu Jun 23 21:12:51 CEST 2005
On Thu, 23 Jun 2005 20:49:13 +0400
Yar Tikhiy wrote:
> On Thu, Jun 23, 2005 at 02:56:35PM +0200, Matthias Andree wrote:
> > On Thu, 23 Jun 2005, David Relson wrote:
> >
> > > Having both options also allows personal preferences. I know some of
> > > our Cyrillic users have CP866 and KOI8-R as their default charsets.
> > > Perhaps those would be useful as their default-to-charset. Maybe so,
> >
> > It is their default-from-charset, not "-to-charset".
>
> Hoping I may speak for Cyrillic users, they would rather choose
> between Windows-1251 and KOI8-R as their default-from-charset since
> literally nobody uses CP866 on the Net side. Interestingly, I
> receive most ham in KOI8-R and most spam in Windows-1251, and I've
> never seen an email in CP866. However, today most non-English
> spammers seem to specify charset right for their recipients to be
> able to read the junk in one click--who will ever spend two clicks
> to read spam? Therefore US-ASCII is a reasonable default-from-charset
> for Cyrillic users. I hope that it is for Chinese folks, too :-)
>
> And of course I vote for using UTF-8 as the default -to-charset
> without giving special support for national encodings like CP866.
> So bogofilter developers from all over the world won't have to fight
> over national issues, while people trying to localize software will
> have free time to spend on more fruitful projects than l10n :-)
Yar & Matthias,
It sounds like there's a consensus. "./configure --with-charset=name"
will set the "from" charset (with US-ASCII being used if the option
isn't specified) and the "to" charset will be "utf-8" (with no
./configure option). If configure's "--disable-unicode" option is used,
bogofilter will operate as it has done in the past.
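
To make the agreed behaviour concrete, here is a rough sketch in C using
POSIX iconv. It is not the code in convert_unicode.c. The macro
DEFAULT_FROM_CHARSET and the functions pick_from_charset() and to_utf8()
are hypothetical names invented for illustration; the macro stands in for
whatever "./configure --with-charset=name" would end up defining. The
sketch also assumes that a charset declared by the message takes
precedence over the configured default.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical stand-in for the value set by
 * "./configure --with-charset=name"; US-ASCII if the option is absent. */
#ifndef DEFAULT_FROM_CHARSET
#define DEFAULT_FROM_CHARSET "US-ASCII"
#endif

/* The "to" charset is fixed; there is no configure option for it. */
#define TO_CHARSET "UTF-8"

/* Use the charset the message declares, else the configured default. */
static const char *pick_from_charset(const char *declared)
{
    return (declared != NULL && declared[0] != '\0')
        ? declared : DEFAULT_FROM_CHARSET;
}

/* Convert inlen bytes of `in` to UTF-8, storing a NUL-terminated result
 * in `out` (capacity outsz). Returns the number of bytes written, not
 * counting the terminator, or (size_t)-1 on any failure. */
static size_t to_utf8(const char *declared, const char *in, size_t inlen,
                      char *out, size_t outsz)
{
    iconv_t cd;
    char *src = (char *)in;   /* iconv() wants char **; input is not modified */
    char *dst = out;
    size_t outleft, rc;

    assert(in != NULL && out != NULL);
    if (outsz == 0)
        return (size_t)-1;
    outleft = outsz - 1;      /* keep one byte for the terminator */

    cd = iconv_open(TO_CHARSET, pick_from_charset(declared));
    if (cd == (iconv_t)-1)
        return (size_t)-1;    /* charset name unknown to this iconv */

    rc = iconv(cd, &src, &inlen, &dst, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;    /* invalid input or output buffer too small */

    *dst = '\0';
    return (size_t)(dst - out);
}
```

A caller would pass the charset taken from the message headers, or NULL
when none is declared, for example to_utf8(NULL, text, len, buf, sizeof buf).
With the US-ASCII default, undeclared text containing 8-bit bytes is
expected to make iconv() fail rather than be guessed at, so a real
implementation would need to decide how to fall back in that case.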
David