wordlist.db problem

OTR Comm otrcomm at isp-systems.com
Fri Jun 18 06:45:37 CEST 2004


Hello,

David Relson wrote:

> You're the first to report such a problem with 0.90.  Are you sure
> you've got the proper path for your wordlist file and that it has proper
> permissions?

I don't think the problem is with bogofilter; I think the problem is
that I don't have a large enough corpus in my database.  That is, I
discovered that if I leave off the -u switch (which automatically
updates the database when I test for spamicity) and let bogofilter work
with what is already in the database, it makes a best guess based on
what it knows.  Then, after I correct the database following an error
and go back at it with:

bogofilter -e -p -d ./.bogofilter < spam/spam.31514

it works fine.

I don't want to automatically update the database anyway, so this
problem made me dig a little deeper into how bogofilter works.  Still a
long way to go, but this has been helpful.  I may be wrong about not
using the -u switch, but I don't see that it buys me much.  Or does it?

> 
> Evidently you're using .bogofilter as your directory name.  The usual
> directory is ~/.bogofilter, which can also be written as
> $HOME/.bogofilter.  However, .bogofilter is also acceptable, so long as
> you're running bogofilter from the proper directory.
> 
> What is the output of command "ls -l .bogofilter/wordlist.db"?

-rw-r--r--    1 otrcomm  otrcomm     16384 Jun 17 19:24 wordlist.db

> 
> Bogofilter has some debugging switches which will make it print info
> about database operations.  Try adding switches "-x d -vv" to your
> command, as in "bogofilter -d .bogofilter -s -x d -vv < spam/spam.31514"
> and post the output.

[pid 28247] DB->open(db=0x806c0e0, file=.bogofilter/wordlist.db,
database=NIL, type=1, flags=0=, mode=0664) -> 0 Successful return: 0
# 139 words, 1 message
db_close (.bogofilter/wordlist.db) sync

> 
> Also you can use bogoutil to display the contents of the wordlist, as in
> "bogoutil -d .bogofilter/wordlist.db".  If you're interested in the spam
> and ham counts for particular words, run "bogoutil -w
> .bogofilter/wordlist.db word1 word2 word3".
> 
> In your example, the score of 0.520000 is what I'd expect when
> bogofilter is running with an empty wordlist (or has been given the
> wrong location for the wordlist).

I think I figured something else out.  Not sure, but when I was
originally testing bogofilter, I was logged in as 'root', but when I
logged in as 'otrcomm' and ran the tests you suggested above, everything
worked fine.  However, I still don't understand the value of the -u
switch.

One other question, though: how does bogofilter ever come up with an
'Unsure' classification?  It always classifies mine as either Yes or
No.  I thought that it would have some bound around .5 probability that
would trigger an 'Unsure' classification.  Is this somewhere in
bogofilter.cf.example that I missed?
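
To make the question concrete, here is a rough sketch of the behaviour I
was expecting.  This is not bogofilter's code: the function name and the
two cutoff values (0.45 and 0.55) are made up purely for illustration,
and I don't know what bogofilter actually uses.

```python
def classify(score, ham_cutoff=0.45, spam_cutoff=0.55):
    """Three-way classification of a spamicity score in [0, 1].

    The cutoff defaults are hypothetical, chosen only to show a band
    around 0.5; they are not taken from bogofilter.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= spam_cutoff:
        return "Yes"      # confidently spam
    if score <= ham_cutoff:
        return "No"       # confidently not spam
    return "Unsure"       # falls inside the band around 0.5


if __name__ == "__main__":
    for s in (0.10, 0.52, 0.99):
        print(s, classify(s))
```

With a band like this, a score such as 0.52 would come back as 'Unsure'
rather than being forced to Yes or No.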

Thanks,
Murrah Boswell

> 
> HTH,
> 
> David
> _______________________________________________
> Bogofilter mailing list
> Bogofilter at bogofilter.org
> http://www.bogofilter.org/mailman/listinfo/bogofilter


