speed [was: token pairs]

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Wed Apr 14 08:51:19 CEST 2004


David Relson <relson at osagesoftware.com> wrote:

>> I am just running a test on this. It took about two and a
>> half hours to bulk verify (-Mv) 25k messages. Interestingly
>> enough, treating those same messages individually took only
>> about 800 seconds! This indicates that there is some bug;
>> the results are plausible, though.
>
>As a quick test, using a set of 112 messages (listed in file "l"), I ran
>bogofilter with and without the "-P" switch and used "time" to print the
>info.  Here are the results:
>
>[relson at osage src]$ time bogofilter -C -b < l
>Command exited with non-zero status 1
>1.93user 0.32system 0:02.43elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (349major+1506minor)pagefaults 0swaps
>
>[relson at osage src]$ time bogofilter -C -b < l -P
>Command exited with non-zero status 1
>2.52user 0.52system 0:03.21elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
>0inputs+0outputs (349major+2254minor)pagefaults 0swaps
>
>Using token pairs increased the time from 1.93 to 2.52 seconds (of user
>time).  Given the additional work of creating token pairs and looking
>them up, this seems reasonable.

I plan to do similar tests (with larger files). But the test
I started yesterday is still running :-( Right now it is
building the wordlist for full training. I don't know exactly
when that step started, but it must have been working on the
15k messages for around five hours already.
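
To put the numbers in this thread side by side, here is a rough
back-of-the-envelope sketch in Python. It only does arithmetic on the
figures quoted above. The helper name per_message is made up for
illustration, and "about two and a half hours" and "around five hours"
are simply taken as 9000 and 18000 seconds, so the results are
approximate.

```python
# Rough comparison of the timings quoted in this thread.
# All inputs are approximate figures from the messages above;
# per_message() is a hypothetical helper, not part of bogofilter.

def per_message(seconds, messages):
    """Average number of seconds spent per message."""
    return seconds / messages

# Bulk verify (-Mv) versus treating the same 25k messages individually.
bulk_s = 2.5 * 3600        # "about two and a half hours" (assumed 9000 s)
single_s = 800.0           # "about 800 seconds"
n_msgs = 25_000
bulk_slowdown = bulk_s / single_s

# David's 112-message test: user time without and with -P (token pairs).
user_plain = 1.93
user_pairs = 2.52
pair_overhead = user_pairs / user_plain - 1.0

# Full-training wordlist build still in progress.
train_s = 5 * 3600         # "around five hours" (assumed 18000 s)
train_msgs = 15_000

print(f"bulk verify:      {per_message(bulk_s, n_msgs):.3f} s/msg")
print(f"individual runs:  {per_message(single_s, n_msgs):.3f} s/msg")
print(f"bulk slowdown:    {bulk_slowdown:.2f}x")
print(f"token-pair cost:  {pair_overhead * 100:.0f}% more user time")
print(f"training so far:  {per_message(train_s, train_msgs):.2f} s/msg")
```

With these inputs the bulk run comes out roughly 11 times slower than
the individual runs, while the token-pair overhead in the small test is
only about 31% of user time.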

pi



