bogoutil -m STILL not removing "singletons"?

Benji Tittle benji at tittle.net
Tue Sep 9 19:14:25 CEST 2003


Armed with a better understanding of bogoutil (thanks, Chris Wilkes), I've
tried this again... but I STILL don't seem to be getting results.

Here's the new sequence of commands & output.  I started with a database
freshly rebuilt from my corpora.  Single wordlist.db file, 8224768 bytes.

$ bogoutil -d ./wordlist.db | wc -l
 224841
$ bogoutil -m ./wordlist.db -c1
(c_get): Successful return: 0
$ bogoutil -d ./wordlist.db | wc -l
 224840

I double-count my ham, so there should be no ham singletons, but I'm
having trouble believing that I had only ONE singleton in my entire spam
corpus!
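
One way to sanity-check that number would be to count the low-count tokens
directly in the dump. This is only a sketch: it assumes each line of
"bogoutil -d" output has the form "token spamcount goodcount date", and the
helper name count_low is made up for illustration.

```shell
# Count tokens whose combined count (spam + good) is at or below a threshold.
# ASSUMPTION: each input line looks like "token spamcount goodcount date".
# count_low is a hypothetical helper, not part of bogofilter.
count_low() {
    # $1 is the threshold; it defaults to 1, i.e. singletons.
    awk -v max="${1:-1}" '($2 + $3) <= max { n++ } END { print n + 0 }'
}
```

Piping the dump through it (bogoutil -d ./wordlist.db | count_low 1) should
give a figure to compare against the number of tokens -m actually removed.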

Size was unchanged at first.  I then compacted the database with:
$ bogoutil -d ./wordlist.db | bogoutil -l wordlist.db.new

After compaction the db was 8220672 bytes... a reduction of only 4k.  What does
the database compaction actually do, anyway?  Because I *did* compact the
database before doing any of this.  Is "singleton removal" an
undocumented feature of a "-d | -l" compaction?

I should mention that the bogoutil -m command returns the "(c_get)" line
instantly... i.e. it doesn't seem like it's actually doing anything.




