cdb preliminary results

Greg Louis glouis at dynamicro.on.ca
Mon Jul 14 18:55:58 CEST 2003


On 20030714 (Mon) at 1837:41 +0200, Matthias Andree wrote:
> Greg Louis <glouis at dynamicro.on.ca> writes:
> 
> > I'd been thinking in terms of building a hash in memory, then reading
> > the old .cdb file and merging the changes into a new .cdb, and finally
> > traversing the hash to add any new records.  Clearly not something
> 
> BTW, that's going to get difficult with bigger files: you need the whole
> file plus changes in core.

Only the changes.  I'd loop reading from the old, checking the hash and
writing to the new, then when the loop finished I'd append what's left
over (new records) in the hash.  But it amounts to the same thing,
really, because if I do a full rebuild of the training db, then I will
have the whole thing in core.  At present my training db seems to be
around 40 MB, which though bulky is manageable.  (I had lots of
distractions over the weekend so not much got done; with OLS coming up
it may be a while before I get time on this.)
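
A rough sketch of that loop, in Python for brevity.  This is only an
illustration under assumptions: merge_updates is a made-up name, records
are plain (key, value) pairs rather than anything read through a real cdb
library, and a change is taken to replace the old value outright (adding
counts instead would be a one-line difference).

```python
def merge_updates(old_records, changes):
    """Yield the records for the new file.

    old_records: iterable of (key, value) pairs streamed from the old file.
    changes:     mapping of key -> new value; the only thing held in memory.
    (Both names are hypothetical, not part of any cdb API.)
    """
    pending = dict(changes)  # copy, so the caller's mapping is left alone
    for key, value in old_records:
        if key in pending:
            # Key was changed: write the new value and mark it as handled.
            yield key, pending.pop(key)
        else:
            # Key untouched: copy the old record straight through.
            yield key, value
    # Whatever is left in the hash never appeared in the old file,
    # so these are the new records; append them at the end.
    for key, value in pending.items():
        yield key, value


old = [("spam", 3), ("ham", 1)]
new = list(merge_updates(old, {"ham": 4, "eggs": 2}))
```

Here new comes out as the old records in their original order, with "ham"
updated and "eggs" appended.  Memory use tracks the size of the change
set, not the old file, unless the change set is a full rebuild.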

> I hope to find time to try a transactional, native-locking mode of the
> database.  If it really does forward-logging, it might still be quite
> fast, because the synchronous writes are sequential, i.e. fast.

Hope you can find the time.  I wouldn't be surprised if it did make a
big difference.

-- 
| G r e g  L o u i s          | gpg public key: finger     |
|   http://www.bgl.nu/~glouis |   glouis at consultronics.com |
| http://wecanstopspam.org in signatures fights junk email |



