What does charset do?

David Relson relson at osagesoftware.com
Fri Nov 11 00:54:33 CET 2005


On Thu, 10 Nov 2005 15:35:47 +0100
Torsten Veller wrote:

> Hi,
> 
> what is the use of the --with-charset option and what is
> charset_default used for? The documentation i found didn't help me to
> understand this features.
> 
> In my old ~/bogofilter.cf file there was the setting:
> | ##charset_default=unicode
> which i tried. But it made latest bogofilter doing strange things.
> Is unicode valid? (It's not in the latest bogofilter.cf.example.)
> Is it from/for iconv lib?
> 
> 
> charset_default=iso-8859-1 seems to be the default, so
> bogofilter.cf.example is not correct:
> | #charset_default=us-ascii               # default
> | ##charset_default=iso-8859-1            # (alternate)
> 
> -- 
> Regards Torsten

Hi Torsten,

As you rightly observe, "charset_default=unicode" is bogus.  A quick
check of my bogofilter archives doesn't find that construct in any
config files.  I wonder how it got on your machine.

charset_default was originally useful for people who wanted a default
character set other than "us-ascii".  It's still useful now for people
who don't want bogofilter to convert tokens to unicode (or, more
precisely, to utf-8).

The "charset_default" statement provides a value which is passed to
iconv_open(), which expects an official charset name (like iso-8859-1
or utf-8).  While "utf-8" and "unicode" are interchangeable in normal
conversation, the function wants "utf-8".
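
To check what your own iconv makes of a given name, something like the
following rough sketch may help.  It is not bogofilter's code:
charset_is_known() is a made-up helper, and converting to "UTF-8" is
only an assumption chosen for illustration.  Which names are accepted
(and what an alias such as "unicode" maps to, if anything) depends on
the iconv implementation installed.

```c
#include <iconv.h>

/* Hypothetical helper, not part of bogofilter: returns 1 if the local
 * iconv_open() accepts `from` as a source charset for conversion to
 * UTF-8, and 0 if it rejects the name. */
static int charset_is_known(const char *from)
{
    /* iconv_open(tocode, fromcode) returns (iconv_t)-1 on failure,
     * e.g. when it does not recognize a charset name. */
    iconv_t cd = iconv_open("UTF-8", from);

    if (cd == (iconv_t)-1)
        return 0;

    iconv_close(cd);    /* release the conversion descriptor */
    return 1;
}
```

Calling charset_is_known() with the value you plan to put in
charset_default should show whether iconv_open() will take it on your
system.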

I'll change bogofilter.cf.example so it shows iso-8859-1 as the
default.

Thanks for spotting that out-of-date value.

Regards,

David



