hash table to replace Judy

Gyepi SAM gyepi at praxis-sw.com
Thu Sep 19 00:43:36 CEST 2002


On Wed, Sep 18, 2002 at 12:19:26PM +0200, Matthias Andree wrote:
> On Tue, 17 Sep 2002, Gyepi SAM wrote:
> > I have attached the diffs to bogofilter.[ch], the files wordhash.[ch], and xmalloc.[ch], which are pretty simple.
> > The corresponding changes to configure.in are pretty simple and have been omitted.
> 
> I'm not the release manager, but can we hold this until after 0.7.4? In
> that time, someone can do some benchmarking.

Absolutely.
 
> I'm not sure if I recall this correctly, but did perchance DB offer a
> "no-file" hash mode? If so, that might be far easier, because we depend
> on DB anyways.

It does, and that would be easier, but also slower. Incrementing a word's count requires three steps:

1. Retrieve the current value, if any
2. Set or change the value
3. Write it back to the database (which also involves another search)

With the hash implementation, step 1 also adds the word to the list if it does not already exist.
In either case, the lookup returns a pointer to the associated data (a struct, in this case), so we can
set or increment the value in place without having to write it back. Intuitively, fewer searches should be faster.
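
To make the single-search increment concrete, here is a minimal sketch of a chained hash table
whose lookup inserts on a miss and returns a pointer to the node. It is not the attached
wordhash.[ch] code: the names (wordnode, wordtable, table_lookup_or_insert, table_increment),
the bucket count, and the hash function are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024   /* arbitrary size, chosen only for this sketch */

/* Per-word data. The lookup hands back a pointer to this struct. */
typedef struct wordnode {
    char *key;
    unsigned long count;
    struct wordnode *next;   /* next node in the same bucket */
} wordnode;

/* Hypothetical table type; must be zero-initialized before use. */
typedef struct {
    wordnode *buckets[NBUCKETS];
} wordtable;

/* djb2-style string hash; any reasonable string hash would do here. */
static unsigned long hash_str(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Find `word`, inserting a zero-count node if it is absent.
 * Returns NULL only on allocation failure. */
static wordnode *table_lookup_or_insert(wordtable *t, const char *word)
{
    unsigned long b;
    size_t len;
    wordnode *n;

    assert(t != NULL && word != NULL);
    b = hash_str(word) % NBUCKETS;

    for (n = t->buckets[b]; n != NULL; n = n->next)
        if (strcmp(n->key, word) == 0)
            return n;                 /* hit: caller updates in place */

    n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    len = strlen(word) + 1;
    n->key = malloc(len);
    if (n->key == NULL) {
        free(n);
        return NULL;
    }
    memcpy(n->key, word, len);
    n->count = 0;
    n->next = t->buckets[b];          /* push onto the bucket's chain */
    t->buckets[b] = n;
    return n;
}

/* Increment a word's count: one search, then an in-place update.
 * Returns 0 on success, -1 on allocation failure. */
static int table_increment(wordtable *t, const char *word)
{
    wordnode *n = table_lookup_or_insert(t, word);
    if (n == NULL)
        return -1;
    n->count++;                       /* no write-back, no second search */
    return 0;
}
```

With a zero-initialized table (for example `wordtable t = {{0}};`), calling
table_increment(&t, "spam") twice leaves that node's count at 2 after only two searches in total.
The retrieve/modify/write-back sequence above would need a second search for each write.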

-Gyepi


