memory usage

David Relson relson at osagesoftware.com
Sat Mar 12 01:10:38 CET 2005


On Fri, 11 Mar 2005 14:34:02 +0100
Matthias Andree wrote:

> David Relson <relson at osagesoftware.com> writes:
> 
> > probability appear to be definable with a union (thus saving 8 bytes
> 
> Please don't. Unions are one of the ugliest warts in the C
> language. They are unchecked casts.
> 
> One thing that would work is registering messages or sets of messages as
> a "checkpoint" (but still the whole process should be one huge
> transaction because we have only one exit code even though we read the
> huge mailbox piece-wise).

At the moment, we have one structure filling three roles:
  1. token frequencies for registration
  2. ham & spam token and ham & spam message counts for token scoring
  3. token score (for message scoring)

To handle these 3 roles, it has 5 uint32 values and a float -- total
cost 28 bytes.  The most storage any of the 3 roles actually needs is
16 bytes (4 uint32's for role #2; or 2 uint32's and 1 float for role
#3).  Along similar lines, bogotune uses some of the same storage
structures and is less efficient now than it used to be.  I'd like to
see these inefficiencies cleaned up.
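
As a rough sketch of that arithmetic (not the actual bogofilter
declarations): the type names, field names, and macros below are all
hypothetical, and I am assuming the "float" is an 8-byte double, since
that is what makes 5 uint32 values plus one float add up to 28 bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical layouts; none of these names come from bogofilter.
 * Assumption: the score is stored as an 8-byte double. */

/* One record serving all three roles: 5 uint32 values and a score. */
struct combined_rec {
    uint32_t counts[5];
    double   score;
};

/* Role 2: ham & spam token counts plus ham & spam message counts. */
struct scoring_rec {
    uint32_t ham_tokens;
    uint32_t spam_tokens;
    uint32_t ham_msgs;
    uint32_t spam_msgs;
};

/* Role 3: two uint32 values and the token score. */
struct score_rec {
    uint32_t ham_tokens;
    uint32_t spam_tokens;
    double   score;
};

/* Payload bytes, i.e. the sum of the member sizes without padding. */
#define COMBINED_PAYLOAD (5 * sizeof(uint32_t) + sizeof(double))
#define SCORING_PAYLOAD  (4 * sizeof(uint32_t))
#define SCORE_PAYLOAD    (2 * sizeof(uint32_t) + sizeof(double))

/* Print payload size next to what the compiler actually allocates. */
void print_sizes(void)
{
    printf("combined: payload %zu, sizeof %zu\n",
           (size_t)COMBINED_PAYLOAD, sizeof(struct combined_rec));
    printf("role 2:   payload %zu, sizeof %zu\n",
           (size_t)SCORING_PAYLOAD, sizeof(struct scoring_rec));
    printf("role 3:   payload %zu, sizeof %zu\n",
           (size_t)SCORE_PAYLOAD, sizeof(struct score_rec));
}
```

Calling print_sizes() from a small main() shows the numbers for a given
platform.  The payload figures are 28, 16 and 16 bytes, but sizeof for
the combined record can come out larger than 28 where the compiler pads
the struct to the alignment of double, so the real per-token cost
depends on the ABI.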

In any case, I'm not yet ready to make changes.  I want to check for
further storage inefficiencies affecting mbox registration.

Ciao,

David
_______________________________________________
Bogofilter mailing list
Bogofilter at bogofilter.org
http://www.bogofilter.org/mailman/listinfo/bogofilter