bogoutil -m STILL not removing "singletons"?
Benji Tittle
benji at tittle.net
Tue Sep 9 19:14:25 CEST 2003
Armed with a better understanding of bogoutil (thanks, Chris Wilkes), I've
tried this again... but I STILL don't seem to be getting results.
Here's the new sequence of commands & output. I started with a database
freshly rebuilt from my corpora. Single wordlist.db file, 8224768 bytes.
$ bogoutil -d ./wordlist.db | wc -l
224841
$ bogoutil -m ./wordlist.db -c1
(c_get): Successful return: 0
$ bogoutil -d ./wordlist.db | wc -l
224840
I double-count my ham, so there should be no ham singletons, but I'm
having trouble believing that I had only ONE singleton in my entire spam
corpus!
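One way to cross-check that number is to tally low-count tokens straight from the dump. This is only a minimal sketch: it assumes each dump line has the form "token spamcount goodcount date", and count_low_tokens is a hypothetical helper name, not part of bogoutil.

```shell
# Hypothetical helper: count tokens whose combined spam + ham count
# is at most the given threshold (default 1).
# ASSUMPTION: each input line looks like "token spamcount goodcount date".
count_low_tokens() {
    awk -v max="${1:-1}" 'NF >= 3 && ($2 + $3) <= max { n++ } END { print n + 0 }'
}

# Intended use (not run here):
#   bogoutil -d ./wordlist.db | count_low_tokens 1
```

If the tally is far larger than the number of lines that -m removed, the maintenance pass is probably not applying the threshold the way I expect.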
Size was unchanged at first. I then compacted the database with:
$ bogoutil -d ./wordlist.db | bogoutil -l wordlist.db.new
After compaction the db was 8220672 bytes... a reduction of only 4k. What does
the database compaction actually do, anyway? Because I *did* compact the
database before doing any of this. Is "singleton removal" an
undocumented feature of a "-d | -l" compaction?
I should mention that the bogoutil -m command returns the "(c_get)" line
instantly... i.e. it doesn't seem like it's actually doing anything.