wordhashes [was: time test]

Matthias Andree matthias.andree at gmx.de
Mon Nov 25 20:32:59 CET 2002


On Mon, 25 Nov 2002, Gyepi SAM wrote:

> > 1) collect_words and determine their counts (current).
> > 2) at the end of the message, make a pass over the list, adding the current
> > count to the cumulative count and clearing the current count.
> > ... repeat 1 & 2 for each message
> > 3) at the end, use the cumulative counts to update the database.
> 
> This is exactly what we do now. We need to avoid doing step 2 for
> every message, and instead, do it once, at the end of the message
> processing.

Not quite, and that difference is what gave me the count of 140 million
for wordhash_next in my profile: we keep the results of every earlier
step 1 around, and then, for each message, we iterate over all of them
without doing anything useful. That's quite expensive. See my patch.
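
To illustrate the cost (this is only a rough sketch, not the patch and not
bogofilter's actual code), the C fragment below compares two end-of-message
passes. All names here (struct entry, struct table, lookup, flush_shared,
flush_per_message, run_demo, the sample words) are hypothetical, and a linear
search stands in for real hashing to keep it short. It assumes one variant
keeps every word ever seen in a single list, and the other keeps a
per-message list that is merged and then emptied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORDS 64

/* Hypothetical stand-in for a word-hash node; not bogofilter's real struct. */
struct entry {
    const char *word;
    unsigned cur;    /* count within the current message */
    unsigned total;  /* cumulative count over all messages */
};

struct table {
    struct entry e[MAX_WORDS];
    size_t n;
};

/* Find or create the entry for a word (linear search, for brevity only). */
static struct entry *lookup(struct table *t, const char *word)
{
    for (size_t i = 0; i < t->n; i++)
        if (strcmp(t->e[i].word, word) == 0)
            return &t->e[i];
    assert(t->n < MAX_WORDS);
    t->e[t->n] = (struct entry){ word, 0, 0 };
    return &t->e[t->n++];
}

/* Variant A: one list for everything. The end-of-message pass walks all
 * entries, including words from earlier messages whose cur is already 0.
 * Returns the number of entries visited. */
static size_t flush_shared(struct table *all)
{
    size_t visited = 0;
    for (size_t i = 0; i < all->n; i++, visited++) {
        all->e[i].total += all->e[i].cur;
        all->e[i].cur = 0;
    }
    return visited;
}

/* Variant B: a per-message list is merged into the cumulative one and then
 * emptied, so the pass only touches words of the current message. */
static size_t flush_per_message(struct table *msg, struct table *all)
{
    size_t visited = 0;
    for (size_t i = 0; i < msg->n; i++, visited++)
        lookup(all, msg->e[i].word)->total += msg->e[i].cur;
    msg->n = 0;
    return visited;
}

/* Three made-up two-word messages. */
static const char *msgs[3][2] = {
    { "alpha", "beta" }, { "gamma", "delta" }, { "alpha", "omega" }
};

/* Run both variants; report entries visited and whether the totals agree. */
int run_demo(size_t *visits_shared, size_t *visits_per_msg)
{
    struct table shared = {0}, msg = {0}, all = {0};
    *visits_shared = *visits_per_msg = 0;
    for (int m = 0; m < 3; m++) {
        for (int w = 0; w < 2; w++) {
            lookup(&shared, msgs[m][w])->cur++;
            lookup(&msg, msgs[m][w])->cur++;
        }
        *visits_shared += flush_shared(&shared);
        *visits_per_msg += flush_per_message(&msg, &all);
    }
    for (size_t i = 0; i < all.n; i++)
        if (lookup(&shared, all.e[i].word)->total != all.e[i].total)
            return 0;
    return 1;
}
```

With these three sample messages the shared pass visits 2 + 4 + 5 = 11
entries, while the per-message pass visits 2 + 2 + 2 = 6, and both end up
with the same cumulative totals. The gap grows with the number of messages,
which is the kind of effect that could inflate an iterator count in a profile.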

-- 
Matthias Andree


