merging databases & multiple wordlists

David Relson relson at osagesoftware.com
Wed Jan 26 00:57:44 CET 2005


Hi all,

There have been a couple of questions about merging databases.  As y'all
know, "bogoutil -d wordlist.db > wordlist.txt" writes out all the
tokens, their ham and spam counts, and their timestamps.   When
"bogoutil -l wordlist.db < wordlist.txt" is run, it's really _adding_
the incoming numbers to the wordlist.  If wordlist.db doesn't exist,
bogoutil creates it and loads the new tokens into it.  If wordlist.db
_does_ exist, bogoutil adds new tokens and increases counts of tokens
that are already in the wordlist.  So, merging two wordlists can be done
with:

   bogoutil -d first.db > first.txt
   bogoutil -d second.db > second.txt
   cat first.txt second.txt | bogoutil -l new.db
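
To make the "adding" behaviour concrete, here is a rough sketch that does the same kind of summing on the text dumps themselves, without touching a database. It is only an illustration, not what bogoutil does internally. The function name merge_dumps is made up, and the sketch assumes each dump line has the form "token count count timestamp" with no spaces inside tokens. Keeping the newest timestamp for a repeated token is also an assumption.

```shell
# Hypothetical helper: merge bogoutil text dumps read from stdin.
# Assumed line format: "token count count timestamp", tokens without spaces.
merge_dumps() {
    awk '
        {
            a[$1] += $2                      # sum the first count column per token
            b[$1] += $3                      # sum the second count column per token
            if ($4 > ts[$1]) ts[$1] = $4     # keep the newest timestamp (assumption)
        }
        END {
            for (t in a) print t, a[t], b[t], ts[t]
        }
    ' | LC_ALL=C sort                        # sort only to get a stable order
}
```

Something like "cat first.txt second.txt | merge_dumps > merged.txt" would then give one line per token, which can be inspected before loading it with "bogoutil -l".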

Bogofilter also supports message scoring using multiple wordlists.  The
same effect as the bogoutil commands above can be had, without building
a merged database, by putting the following in bogofilter's config file:

   wordlist r,first,/dir/first.db,1
   wordlist r,second,/dir/second.db,1

Because both wordlists have the same precedence number, i.e. "1",
bogofilter will check both of them.  If you also had:

   wordlist r,other,/dir/third.db,2

bogofilter would first look for the token in the "1" lists.  If found,
it wouldn't search any further.  If not found, it would try the "2"
list. 
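
The lookup order described above can be sketched roughly as follows. This is a toy model working on text dumps, not bogofilter's actual code. The function name lookup_token and its "precedence:dumpfile" arguments are invented for the example, the dump lines are assumed to look like "token count count timestamp", and adding up the counts from lists that share a precedence is an assumption based on the merging equivalence mentioned earlier.

```shell
# Hypothetical helper: lookup_token TOKEN PRECEDENCE:DUMPFILE ...
# Assumed dump line format: "token count count timestamp".
# File names must not contain spaces in this simple sketch.
lookup_token() {
    token=$1
    shift
    # Visit the distinct precedence numbers in ascending order.
    for prec in $(printf '%s\n' "$@" | cut -d: -f1 | sort -n -u); do
        files=
        for spec in "$@"; do
            if [ "${spec%%:*}" = "$prec" ]; then
                files="$files ${spec#*:}"
            fi
        done
        # Sum the token's counts over every list at this precedence.
        result=$(cat $files | awk -v t="$token" '
            $1 == t { a += $2; b += $3; found = 1 }
            END { if (found) print t, a, b }')
        if [ -n "$result" ]; then
            printf '%s\n' "$result"
            return 0        # found, so lower-priority lists are skipped
        fi
    done
    return 1                # token is in none of the lists
}
```

For instance, "lookup_token free 1:first.txt 1:second.txt 2:third.txt" would print the combined counts from first.txt and second.txt if either contains "free", and would only consult third.txt otherwise.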

HTH,

David


