Problem compacting databases (again!)

Matthias Andree matthias.andree at gmx.de
Sun Jan 23 23:26:52 CET 2005


"Juan J. Martinez" <reidrac at blackshell.usebox.net> writes:

>> Can you compile and run this program?  The output I get is
>> 0xFFFFFFE0 0
>> I bet you get 0xFFFFFFE0 1 (or something similar).
>
> $ ./main
> 0xffffffe0 0

That has no relevance. isspace() is only valid for EOF and the values
that can be represented in an "unsigned char". 0xffffffe0 does not fall
into this category, hence the isspace() behavior is undefined.¹
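
To illustrate the point, here is a minimal sketch (not bogofilter's
actual code; count_space_bytes is a made-up helper name). It assumes a
platform where plain char is signed, so that a byte such as 0xE0 turns
into a negative int like 0xffffffe0 unless it is converted to unsigned
char before the call:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count whitespace bytes in a NUL-terminated string. */
size_t count_space_bytes(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* Convert to unsigned char first, so that bytes >= 0x80 stay in
         * the range 0..255 even where plain char is signed. Passing the
         * sign-extended value to isspace() would be undefined. */
        if (isspace((unsigned char)*s))
            n++;
    }
    return n;
}
```

Calling count_space_bytes("a b\tc\n") should count the space, the tab
and the newline, whatever the signedness of plain char.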

The bogoutil code does the right thing, however, and passes unsigned
char* (we call that "byte"), so that's not it. We just don't know yet
whether OpenBSD's isspace() is working, because the test was wrong.
Please see my other post from five minutes ago for a new test program
that produces relevant output.

> I'm not sure "d'informÃ" it's a real word but part of "d'informÃ
> tica".

Doesn't matter.

> Do you mean this is bogoutil bug?

No it isn't.

> May be it's handling in wrong way a unicode string? Seems isspace is
> working right...

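Bogofilter operates exclusively in the POSIX locale. It need not know
anything about UTF-8, because every byte of a multibyte UTF-8 sequence
has its high bit set, and in the POSIX locale isspace() is true only for
the ASCII whitespace characters, so no such byte can be misdetected as
a space this way.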
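
A rough way to check that property on a given system, as a sketch only
(high_bytes_seen_as_space is a hypothetical name, and the program is
assumed to run in the default "C"/POSIX locale, i.e. without calling
setlocale()):

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical check: count the bytes >= 0x80 in buf that isspace()
 * reports as whitespace. */
size_t high_bytes_seen_as_space(const unsigned char *buf, size_t len)
{
    size_t i, hits = 0;

    assert(buf != NULL);
    for (i = 0; i < len; i++) {
        /* buf[i] is an unsigned char, so it is a valid isspace() argument. */
        if (buf[i] >= 0x80 && isspace(buf[i]))
            hits++;
    }
    return hits;
}
```

On a conforming implementation this should return 0 for any input,
including the UTF-8 bytes 0xC3 0xA0; a nonzero result would point at
the C library rather than at the caller.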

__________
¹ Except if Theo decreed to use 0xffffffe0 for EOF yesterday, which I
  doubt he did.

-- 
Matthias Andree


