character sets

David Relson relson at osagesoftware.com
Sun Dec 8 21:51:42 CET 2002


At 03:34 PM 12/8/02, Matthias Andree wrote:

>On Sun, 08 Dec 2002, David Relson wrote:
>
> > At 01:46 PM 12/8/02, Matthias Andree wrote:
> >
> > >Looking at man iconv_open as of FreeBSD's iconv-1.8_2 port, it seems the
> > >character set canonicalization issue is solved; see man 3 iconv_open and
> > >man 3 iconv for details. iconv is GNU stuff.
> >
> > How well does it mesh with character sets as used in lexer.l?
>
>It's purely for conversion; you'd basically use it in place of your
>character set tables. However, care must be taken because the input can
>become longer, for example when converting to UTF-8 or UTF-7. We cannot
>do that in the lexer's buffers now; we'd need to reserve some space. We may
>need wchar_t.

That's what keeps the project interesting - problems to solve.
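
As a rough sketch of the buffer-growth issue (this is not bogofilter code;
to_utf8 is a hypothetical helper, and the 4x size bound is an assumption
that should be generous for single-byte source character sets), a conversion
through iconv into a separately allocated output buffer might look like
this. It assumes the glibc-style prototype where iconv() takes char **;
some libiconv versions declare the input argument as const char **.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>

/* Hypothetical helper: convert inlen bytes of `in`, encoded in charset
 * `from`, to UTF-8. Returns a malloc'ed, NUL-terminated buffer (caller
 * frees) or NULL on failure; *outlen receives the converted length. */
char *to_utf8(const char *from, const char *in, size_t inlen, size_t *outlen)
{
    iconv_t cd;
    char *out, *inp, *outp;
    size_t cap, inleft, outleft;

    assert(from != NULL && in != NULL && outlen != NULL);

    cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t)-1)
        return NULL;                 /* unknown or unsupported charset */

    /* The output can be longer than the input, so reserve extra space.
     * Assumption: 4 output bytes per input byte is enough here. */
    cap = inlen * 4 + 1;
    out = malloc(cap);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    inp = (char *)in;                /* iconv() does not modify the input */
    inleft = inlen;
    outp = out;
    outleft = cap - 1;               /* keep one byte for the NUL */

    if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1) {
        free(out);                   /* invalid input or buffer too small */
        iconv_close(cd);
        return NULL;
    }

    iconv_close(cd);
    *outlen = (size_t)(outp - out);
    out[*outlen] = '\0';
    return out;
}
```

A single ISO-8859-1 byte such as 0xE9 becomes two bytes in UTF-8, which is
why an in-place conversion in a fixed lexer buffer would not be safe. A
stateful target such as UTF-7 would additionally need a final iconv() call
with a NULL input to flush the shift state.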


>BTW, charset.c cannot be compiled on older compilers:
>
>../charset.c: In function `map_us_ascii':
>../charset.c:269: parse error before `static'
>../charset.c:276: `xlate_us' undeclared (first use this function)
>../charset.c:276: (Each undeclared identifier is reported only once
>../charset.c:276: for each function it appears in.)
>
>That means you mixed code and declarations, which is unsupported before
>C99. C89 is in wide use and will remain so for some time.

Ah, mixing declarations and code is a new feature.  I started my C coding
long before C89 and have assumed that if one compiler accepts the code,
then they all will.

The code has been patched.
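
For illustration only, here is a minimal sketch of the C89-friendly layout,
with every declaration ahead of the first statement in the block. The names
example_table, build_example_table and example_lookup are hypothetical and
the table contents are made up; this is not the actual charset.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical translation table, identity-mapped with a few overrides. */
static unsigned char example_table[256];

void build_example_table(void)
{
    /* C89: every declaration sits at the top of the block. Declaring
     * `pairs` after the first loop below would only compile under C99. */
    static const unsigned char pairs[] = {
        0xA0, ' ',   /* no-break space -> space  */
        0xAD, '-'    /* soft hyphen    -> hyphen */
    };
    size_t i;

    /* Statements start only after the last declaration. */
    assert(sizeof pairs % 2 == 0);
    for (i = 0; i < 256; i++)
        example_table[i] = (unsigned char)i;
    for (i = 0; i < sizeof pairs; i += 2)
        example_table[pairs[i]] = pairs[i + 1];
}

/* Look up the translated value of one byte. */
unsigned char example_lookup(unsigned char c)
{
    return example_table[c];
}
```

Compiling with something like gcc -std=c89 -pedantic should flag any
declaration that slips in after a statement.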




