Problem compacting databases (again!)
Juan J. Martinez
reidrac at blackshell.usebox.net
Sun Jan 23 22:20:29 CET 2005
Hello,
It happened again:
# bogoutil -d wordlist.db | bogoutil -l wordlist.db.new
# bogoutil: Unexpected input [d'informÃ] on line 25173. Expecting
whitespace before count.
It's the same bug as last time (the same word, too!).
I did as David suggested:
# bogoutil -d wordlist.db > wordlist.txt
# head -25173 wordlist.txt | tail -1
d'informà tica 0 1 20050122
Yeah... it's a space in a bad place.
I think the word is "d'informàtica", and it appears to be encoded in UTF-8. The
system is stock OpenBSD; I don't know if that is related. This is
bogofilter 0.92.8 (with BerkeleyDB 4.2.52).
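One possible explanation, sketched below in Python purely as an illustration: in UTF-8 the letter "à" is the two bytes 0xC3 0xA0, and 0xA0 read as Latin-1 is a no-break space. The sketch assumes (this is a guess, not something confirmed from the bogofilter source) that the dump or the loader treats that byte as whitespace, which would split the token exactly where the error message stops.

```python
# Illustration only: shows how a UTF-8 "à" can look like "Ã" plus a space
# when its bytes are interpreted one at a time as Latin-1.
# Assumption: the parser treats byte 0xA0 (no-break space) as whitespace.

token = "d'inform\u00e0tica"          # "d'informàtica"
raw = token.encode("utf-8")           # "à" becomes the bytes C3 A0

# Reinterpret the same bytes as Latin-1, one character per byte.
misread = raw.decode("latin-1")

# str.split() with no arguments splits on any Unicode whitespace,
# which includes U+00A0 (no-break space).
parts = misread.split()

print(raw.hex(" "))
print(parts)
```

If that guess is right, the first piece is the "d'informÃ" shown in the error message, and "tica" ends up where the parser expects a count.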
I remember a post on a mailing list about the charset being unset or set
incorrectly, but I cannot find that message in the archives.
You can download the wordlist.db and a gzipped wordlist.txt at:
http://blackshell.usebox.net/bogofilter/
Lemme know if you need more info.
Regards,
Juanjo
--
Development and Systems: http://usebox.net/
Personal page: http://usebox.net/jjm/