New bogofilter TXN snapshot available.

David Relson relson at osagesoftware.com
Sun Oct 31 01:10:08 CEST 2004


On Sat, 30 Oct 2004 23:35:15 +0200
Torsten Veller wrote:

...[snip]...

> I don't think so, David. I think the wordlist must be rebuild (?).

I'm not yet convinced.

> $ bogofilter -V
> bogofilter version 0.92.8
>     Database: BerkeleyDB (4.1.25)
> $ rm wordlist.db
> $ bogoutil -l wordlist.db < wordlist.txt
> $ db4.1_verify wordlist.db
> 
> (install bogofilter-0.92.99.cvs)
> 
> $ bogofilter -V
> bogofilter version 0.92.99.cvs
>     Database: BerkeleyDB (4.1.25)
> $ db4.1_verify wordlist.db
> 
> $ echo abc | bogofilter -v
> bogofilter: (db) DB->get(TXN=134789408,  '.MSG_COUNT' ), err: -30989,
> DB_PAGE_NOTFOUND: Requested page not found

In the command you've used, bogofilter will look for wordlist.db in the
normal place, i.e. ~/.bogofilter/wordlist.db, rather than the current
directory.  Instead of "bogofilter -v", try "bogofilter -v -x d -v".
With the extra flags, bogofilter will print debug information, like:

DB_ENV->open(home=test-0.92.99.d/)
db_version: Header version 4.2, library version 4.2
[pid 10683] DB->open(db=0x8072e70, file=wordlist.db, database=NIL,
  type=1, flags=0x1000000=DB_AUTO_COMMIT , mode=0664) -> 0 Successful
  return: 0

To force bogofilter to use the current directory for the wordlist, use
option "-d ."
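
A minimal sketch of the path check described above.  The helper name
wordlist_path is hypothetical, and the default location is simply the
one named in this message; any other way of overriding the directory is
not modelled.  The bogofilter commands run only if the binary is found.

```shell
# Print the wordlist path that applies for a given "-d" directory.
# With no argument, fall back to the default named in this message.
wordlist_path() {
    if [ -n "${1:-}" ]; then
        printf '%s/wordlist.db\n' "$1"
    else
        printf '%s/.bogofilter/wordlist.db\n' "$HOME"
    fi
}

echo "default wordlist: $(wordlist_path)"
echo "with '-d .':      $(wordlist_path .)"

# Run the two checks only if bogofilter is installed.
if command -v bogofilter >/dev/null 2>&1; then
    echo abc | bogofilter -x d -vv   # debug output names the file opened
    echo abc | bogofilter -d . -v    # force the current directory
else
    echo "bogofilter not found; skipping the live checks"
fi
```

Comparing the two printed paths with the file named in the debug output
should show which wordlist is actually being read.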

I wrote a test script to reproduce what you did.  It assumes that you
have 4 executables available:

  bogofilter-0.92.8  and bogoutil-0.92.8
  bogofilter-0.92.99 and bogoutil-0.92.99

It creates a directory "test-0.92.99.d" and uses it for all operations. 
I've attached it as file test-0.92.99.sh and its output as
test-0.92.99.out.

After testing "echo abc | bogofilter -x d -vv" to verify your wordlist
path, run "echo abc | bogofilter -d . -v" to show what happens when
using the wordlist you built.  Then run my script.  Hopefully this will
make clear exactly what is happening.

Also worth noting are the contents of directory test-0.92.99.d after the
test is run.  With my wordlist.txt file (which contains 100,000
entries), this is what I have:

[relson at osage src]$ ls -l test-0.92.99.d
total 28120
-rw-r--r--  1 relson relson     8192 Oct 30 18:42 __db.001
-rw-r--r--  1 relson relson  5251072 Oct 30 18:42 __db.002
-rw-r--r--  1 relson relson    98304 Oct 30 18:42 __db.003
-rw-r--r--  1 relson relson  4063232 Oct 30 18:42 __db.004
-rw-r--r--  1 relson relson    16384 Oct 30 18:42 __db.005
-rw-r--r--  1 relson relson        0 Oct 30 18:42 lockfile-d
-rw-r--r--  1 relson relson     1024 Oct 30 18:48 lockfile-p
-rw-r--r--  1 relson relson 10485729 Oct 30 18:42 log.0000000001
-rw-r--r--  1 relson relson  8332319 Oct 30 18:48 log.0000000002
-rw-r--r--  1 relson relson  3588096 Oct 30 18:42 wordlist.db

As described in doc/README.db, to perform transactions (thus ensuring
database integrity) Berkeley DB's Transactional Data Store capability
creates several extra files.  Some of them will always be present.
Others (like the log.0000* files) will grow in size and number, but can
be removed after running db_archive (as described in README.db).  If you
haven't read README.db, please do so.
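
The log cleanup mentioned above might be approached as in the sketch
below.  It assumes the Berkeley DB utility is installed as db_archive
(some systems use a versioned name, much like db4.1_verify earlier in
this thread); the DB_ARCHIVE variable and the function name are
hypothetical.  It only lists candidate files and deletes nothing;
README.db remains the authority on when removal is safe.

```shell
# Name of the Berkeley DB archive utility (assumption: may be versioned,
# e.g. db4.1_archive, depending on the distribution).
DB_ARCHIVE=${DB_ARCHIVE:-db_archive}

# List, as absolute paths (-a), the log files in environment directory $1
# that Berkeley DB reports as no longer needed.
list_unneeded_logs() {
    "$DB_ARCHIVE" -a -h "$1"
}

# Run only if the utility is installed and the directory exists.
if command -v "$DB_ARCHIVE" >/dev/null 2>&1 && [ -d test-0.92.99.d ]; then
    list_unneeded_logs test-0.92.99.d
else
    echo "$DB_ARCHIVE or test-0.92.99.d not found; nothing listed"
fi
```

Listing first and removing by hand keeps the step reviewable, which
seems prudent while the database format is still under test.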

As another warning, if you have a _large_ wordlist built by 0.92.8 (or
older) and run 0.92.99 you may run out of locks.  This produces a
message like:

Lock table is out of available locks
bogoutil: (db) DB->get(TXN=134571712,  'GQjZvFNTkabd' ), err: 12, Cannot
allocate memory

The cure is to add a DB_CONFIG file in the wordlist directory that sets
a larger number of locks for Berkeley DB.  This is described in section
"3. LOCK TABLE EXHAUSTION" of README.db.   I mention this because the
message appeared when I used my _complete_ wordlist.txt (which contains
approx 1,500,000 tokens, not the 100,000 used with the test output I
sent to you).
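
As a rough sketch of that cure: the function below writes a DB_CONFIG
with a larger lock table.  The function name and the default of 30000
are purely illustrative assumptions, not values taken from README.db;
set_lk_max_locks and set_lk_max_objects are standard Berkeley DB
DB_CONFIG directives.

```shell
# Write a DB_CONFIG into wordlist directory $1 that raises the lock limits.
# $2 is the limit to use; 30000 is an illustrative placeholder only.
# Note: this overwrites any existing DB_CONFIG in that directory.
write_db_config() {
    dir=$1
    locks=${2:-30000}
    printf 'set_lk_max_locks %s\nset_lk_max_objects %s\n' \
        "$locks" "$locks" > "$dir/DB_CONFIG"
}

# Example use (not run automatically):
#   write_db_config test-0.92.99.d 50000
```

Berkeley DB applies these limits when the environment is created, so an
environment that already exists may need to be recreated before the new
values take effect; README.db is the place to check the procedure.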

It _may_ be that you're having a Berkeley DB problem since we're using
different versions.  You've got 4.1.25 and I'm running 4.2.52.  However,
I don't really think that's what's happening.

I hope you're still online so we can determine what's really
happening!

Regards,

David
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-0.92.99.sh
Type: application/x-sh
Size: 442 bytes
Desc: not available
URL: <http://www.bogofilter.org/pipermail/bogofilter/attachments/20041030/eb9bcb3d/attachment.sh>

