[cvs] bogofilter wordlists.c,1.36.2.4,1.36.2.5

Matthias Andree matthias.andree at gmx.de
Wed Jan 8 17:28:03 CET 2003


David Relson <relson at osagesoftware.com> writes:

> Matthias,
>
> Since bogofilter doesn't want sign extension when token characters are
> converted to ints (or words or longs...), why not simply use "unsigned
> char" and "unsigned char *" throughout bogofilter?

You'll need (char *) casts for virtually any call to standard C
library or Unix library functions if you do that, to keep the compiler
from jumping in your face and prodding you backwards.

I'd think that casting the arguments of the ctype.h-style functions to
(unsigned char) will give us fewer casts than the other way round.
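
A minimal sketch of what I mean, not taken from the bogofilter
sources: lowercase_token() and count_alpha() are made-up helper names.
The buffers stay plain char, and the cast appears only where a value is
handed to a ctype.h function.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: lower-case a token in place.
 * Only the value passed to tolower() is cast, so a high-bit byte is
 * passed as a value in 0..UCHAR_MAX rather than as a negative int. */
void lowercase_token(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        *s = (char)tolower((unsigned char)*s);
}

/* Hypothetical helper: count the bytes that isalpha() accepts
 * in the current locale. */
size_t count_alpha(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (isalpha((unsigned char)*s))
            n++;
    return n;
}
```

Everything else that touches the token (strlen, strcmp, and so on)
keeps working on char * without any cast.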

> Undoubtedly there are a number of library functions that wouldn't like
> this, but the overall number of casts would lessen.

Hardly. We need casts only for the handful of is*()/to*() functions
from ctype.h. It's the (byte) or (unsigned char) stuff that makes us
cast for the str*() functions instead, which is not good and more work.
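
To illustrate the difference, here is a small sketch. The two function
names are invented for this example; only the "byte" typedef comes from
the discussion.

```c
#include <assert.h>
#include <string.h>

typedef unsigned char byte;   /* the typedef under discussion */

/* Tokens stored as byte *: every str*() call needs a cast, because
 * those functions are declared with char * / const char * parameters. */
size_t token_length_unsigned(const byte *token)
{
    assert(token != NULL);
    return strlen((const char *)token);
}

/* Tokens stored as plain char *: the same call needs no cast. */
size_t token_length_plain(const char *token)
{
    assert(token != NULL);
    return strlen(token);
}
```

The same pattern repeats for strcmp(), strcpy(), strchr() and friends,
which are called far more often than the ctype.h functions.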

> Alternate approaches would be to use gcc's "-funsigned-char" flag or
> define a "byte" typedef (denoting a value of 0..255).

I don't think this is helpful. It will introduce another dependency (on
gcc) that we don't want or need.

> would do all the work for us.  Unfortunately the flag is a gnu
> extension.  An advantage of using "byte" is that it is short (like
> "char").

typedef char byte; /* :-) */

> Having lots of casts makes the code harder to read. I feel strongly
> enough about this that I'm willing to do all the edits.

Yup. My vote: make byte a char and let's figure out which remaining
is*()/to*() calls lack the cast, if any.

Remember, some functions take int not to annoy the programmer but
because their range exceeds that of an unsigned char: many if not most
character-oriented functions have a domain of { EOF, 0, 1, 2, 3,
... 255 }. Without the cast, a plain char holding 0xFF is sign-extended
to a negative int where char is signed, and you might even end up with
0xFF being mistaken for EOF on some systems!
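
A small sketch of the effect, assuming 8-bit chars. The helper names
are made up for illustration.

```c
#include <assert.h>
#include <limits.h>  /* CHAR_MIN */
#include <stdio.h>   /* EOF */

/* Hypothetical helper: nonzero if plain char is signed on this platform. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Widens the way an uncast call would: where plain char is signed,
 * a byte with the high bit set becomes a negative int. */
int widen_plain(char c)
{
    return c;
}

/* Widens through unsigned char: the result is always 0..UCHAR_MAX,
 * so it can never collide with EOF, which is negative. */
int widen_cast(char c)
{
    return (unsigned char)c;
}
```

On a typical signed-char platform widen_plain() turns the byte 0xFF
into -1, which is the usual value of EOF. widen_cast() yields 255 for
the same byte.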

So my vote is: let's consider byte as "unsigned char" a failed
experiment and do it the conservative way. Explicitly cast all
"char"-typed data that is fed to "int" functions to unsigned char and
leave everything else alone.

-- 
Matthias Andree



