A weird wordlist.db problem

David Relson relson at osagesoftware.com
Sat Jun 11 03:48:16 CEST 2005


On Sat, 11 Jun 2005 13:23:35 +1200
Tom Eastman wrote:

> Tom Eastman wrote:
> 
> > Okay.. I'm calm again.  Is there really no way to repair the database?
> > Surely if only one branch is broken, the rest should be repairable?  on
> > *either* side of the corrupted area?  You say that 'bogoutil -d' dumps
> > things in order... maybe I can hack the source code to make it dump in
> > *reverse* order?
> 
> Regarding repairs:
> 
>   tom at luna test $ db4.1_verify wordlist.db
>   db_verify: Page 1071: last item on page sorted greater than parent entry
>   db_verify: Page 1071: incorrect prev_pgno 47 found in leaf chain (should be 742)
>   db_verify: Page 1071: linked twice
>   db_verify: Page 1071: incorrect next_pgno 670 found in leaf chain (should be 233)
>   db_verify: DB->verify: wordlist.db: DB_VERIFY_BAD: Database verification failed
> 
> Is there any way of fixing this?  It doesn't look like extensive damage 
> -- perhaps just somehow amputating Page 1071?  I don't mind losing SOME
> information... I just don't want to have to cut out everything *after*
> the bad page just because it has freakouts somewhere in the middle.
> 
> It looks like it starts to repeat at 'head:I412924' through to
> 'ip:219.240.142.59'
> 
> So if I were to just dump and reload I would lose half the lowercase 
> alphabet (possibly including all the 'subj:' and stuff, too?)
> 
> What should I try and do?
> 
>         Tom

Hi Tom,

Start by saving a copy of the bad wordlist.  At some point you _might_
need it (besides sending me the copy I want :-)

Then build a new wordlist with everything up through ip:219... and use
it.  Given the link error, 'j' through 'z' are inaccessible (in the bad
copy) and absent (in the new wordlist), and that makes bad and new
comparably useful.  What's better about "new" is that as you register
additional ham and spam, the new list will be able to absorb it all
(which the old broken one presumably can't do).
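
One possible way to cut the dump at the right place, as a rough sketch
rather than a tested recipe: since 'bogoutil -d' dumps tokens in sorted
order, the usable part is the stretch before the tokens first repeat or
go backwards. The script name, the function name sorted_prefix, and the
file names are made up for illustration, and it assumes each dump line
starts with the token followed by whitespace and that the database
orders tokens by plain byte comparison.

```python
import sys


def sorted_prefix(lines):
    """Yield dump lines until the leading token stops strictly increasing.

    Assumes each line looks like b"token spamcount hamcount date" and that
    tokens are ordered by plain byte comparison.
    """
    prev = None
    for line in lines:
        fields = line.split(None, 1)
        if not fields:
            continue  # skip blank lines
        token = fields[0]
        if prev is not None and token <= prev:
            break  # tokens repeat or go backwards: stop before the bad region
        prev = token
        yield line


# Hypothetical usage:
#   bogoutil -d wordlist.db > dump.txt
#   python3 sorted_prefix.py dump.txt good.txt
#   bogoutil -l new-wordlist.db < good.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], "rb") as src, open(sys.argv[2], "wb") as dst:
        for kept in sorted_prefix(src):
            dst.write(kept)
```

Reading and writing bytes avoids any character set guessing, since
tokens from mail can contain arbitrary 8-bit data. Check the tail of
the output by eye before loading it into the new wordlist.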

I know it's frustrating to lose a chunk of a valued database.  However
as you've seen, bogofilter is still functioning well and there _is_ a
path back to health :-)

Regards,

David


