Problem compacting databases (again!)

David Relson relson at osagesoftware.com
Sun Jan 23 22:59:47 CET 2005


On Sun, 23 Jan 2005 22:20:29 +0100
Juan J. Martinez wrote:

> Hello,
> 
> It happened again:
> 
> # bogoutil -d wordlist.db | bogoutil -l wordlist.db.new
> # bogoutil: Unexpected input [d'informÃ] on line 25173. Expecting 
> whitespace before count.
> 
> It's the same bug as last time (the same word, too!).
> 
> I did as David suggested:
> 
> # bogoutil -d wordlist.db > wordlist.txt
> # head -25173 wordlist.txt | tail -1
> d'informà tica 0 1 20050122

It looks like there's an 0xE0 character in that position.

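#include <stdio.h>
#include <ctype.h>

int main(int argc, char **argv)
{
    /* Where char is signed, x is negative here and is sign-extended
     * when it is promoted to int for printf() and isspace(). */
    char x = 0xE0;
    printf ("0x%02X %d\n", x, isspace(x));
    return 0;
}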


Can you compile and run this program?  The output I get is

0xFFFFFFE0 0

I bet you get 0xFFFFFFE0 1 (or something similar).
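
If that turns out to be the case, it might also help to know which
other high-bit bytes are affected. The following is only a sketch:
count_high_space_bytes() is a name made up for illustration, not
anything in bogofilter. It prints every byte from 0x80 to 0xFF that
isspace() reports as whitespace in the current locale.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper, not part of bogofilter: prints each byte in
 * 0x80..0xFF that isspace() treats as whitespace in the current
 * locale and returns how many such bytes were found. */
int count_high_space_bytes(void)
{
    int count = 0;
    int c;

    for (c = 0x80; c <= 0xFF; c++) {
        /* c is within the unsigned char range, so this call is
         * well defined regardless of whether char is signed. */
        if (isspace(c)) {
            printf("0x%02X\n", (unsigned int)c);
            count++;
        }
    }
    return count;
}
```

Calling it from main() after setlocale(LC_CTYPE, "") would show the
result for the locale bogoutil actually runs under.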

If I'm right, then bogoutil needs a more thorough check than isspace()
because OpenBSD is doing something unusual.
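
One possible shape for such a check, assuming the parser only needs
to recognize plain ASCII whitespace, is sketched below.
is_ascii_space() is a hypothetical name and this is not bogoutil's
actual code; it simply avoids consulting the locale at all.

```c
#include <stdbool.h>

/* Hypothetical helper (not bogoutil's actual code): true only for
 * the six ASCII whitespace characters, whatever the locale says.
 * Any other value, including a negative char, yields false. */
bool is_ascii_space(int c)
{
    return c == ' '  || c == '\t' || c == '\n' ||
           c == '\v' || c == '\f' || c == '\r';
}
```

With a check like this, bytes such as 0xA0 or 0xE0 are never treated
as field separators, whether they arrive as signed or unsigned char.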


David



