cdb preliminary results
Greg Louis
glouis at dynamicro.on.ca
Fri Jul 11 12:47:08 CEST 2003
On 20030711 (Fri) at 0245:21 +0200, Matthias Andree wrote:
> People would use batch mode updates for these data bases.
>
> If we were to support that, we'd rather write change instructions in
> append-mode to a file, and at some time, merge these change instructions
> into the source for cdb and recompile the cdb.
I'd been thinking in terms of building a hash in memory, then reading
the old .cdb file and merging the changes into a new .cdb, and finally
traversing the hash to add any new records. Clearly not something
you'd do for one message at a time -- batch mode only. That approach saves
having to maintain a separate source file, though you'd still need
double the disk space while building the new .cdb.
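
A rough sketch of that merge, in Python for brevity. It is only an
illustration under stated assumptions: it does not touch the real cdb
file format, merge_batch and its arguments are hypothetical names, the
(spam, good) count tuples are made-up sample data, and a pending change
is assumed to replace the old value outright rather than be added to it.

```python
def merge_batch(old_records, pending, write):
    """Stream old records into a new database, applying pending changes.

    old_records: iterable of (key, value) pairs read from the old .cdb
                 (assumed here; any reader that yields pairs would do)
    pending:     dict mapping key -> replacement value (the in-memory hash)
    write:       callable(key, value) that appends one record to the new .cdb
    """
    remaining = dict(pending)  # copy, so the caller's hash is left untouched
    for key, value in old_records:
        # A changed key gets its new value; pop() also marks it as handled.
        write(key, remaining.pop(key, value))
    # Anything still in the hash never appeared in the old file: new records.
    for key, value in remaining.items():
        write(key, value)


# Hypothetical sample data: token -> (spam count, good count).
old = [(b"viagra", (40, 1)), (b"meeting", (2, 55))]
updates = {b"meeting": (2, 56), b"lottery": (7, 0)}

new_db = []  # stands in for the new .cdb being built
merge_batch(old, updates, lambda k, v: new_db.append((k, v)))
```

The old file is read once, sequentially, and only the pending changes
have to fit in memory, which is why this suits batch mode and why both
the old and the new .cdb exist on disk until the build finishes.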
WRT dbt_v vs ASCII: for my experimental multiprocess version ASCII is
actually faster for reading, because the data get piped to the
classifier in msg-count format -- storing them in ASCII saves a
conversion. In monolithic bogofilter, it adds a conversion, so there
your point about speed is well taken.
--
| G r e g L o u i s | gpg public key: finger |
| http://www.bgl.nu/~glouis | glouis at consultronics.com |
| http://wecanstopspam.org in signatures fights junk email |