t.bulkmode problem

David Relson relson at osagesoftware.com
Sat Nov 20 22:56:14 CET 2004


Matthias,

t.bulkmode seems to be running into a problem with pipes.  It uses
bogotune to convert a file to message count format and that result is
piped to bogofilter:

$ ../bogotune -M -c ./bulkmode.20041120/test.cf -I ./inputs/msg.8.txt | \
    ../bogofilter -c ./bulkmode.20041120/test.cf -t -v -D
PANIC: fatal region error detected; run recovery
bogotune(datastore_db.c:1045): DBE->txn_checkpoint returned
    DB_RUNRECOVERY: Fatal error, run database recovery
PANIC: fatal region error detected; run recovery


t.bulkmode can be made to work by using a temp file instead of the pipe,
i.e.

--- t.bulkmode	17 Nov 2004 15:18:29 -0000	1.13
+++ t.bulkmode	20 Nov 2004 21:52:45 -0000
@@ -104,7 +104,8 @@
 
 NAME="bogolex-single"
 for f in $pattern ; do 
-    map_rc "$BOGOLEX_SH -c $CFG -I $f | $BOGOFILTER >>${TMPDIR}/$NAME.out"
+    map_rc "$BOGOLEX_SH -c $CFG -I $f >${TMPDIR}/tmp"
+    map_rc "$BOGOFILTER < ${TMPDIR}/tmp >>${TMPDIR}/$NAME.out"
 done
 
 # test scoring of mbox batch of msg-count messages
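
As a general illustration of the workaround (not the actual t.bulkmode
code): a shell pipeline starts both commands at the same time, while
staging the output in a temp file makes the second command start only
after the first has exited.  In the sketch below, produce and consume are
hypothetical stand-ins for the two programs, and mktemp is assumed to be
available on the system.

```shell
#!/bin/sh
# Hypothetical stand-ins for the producer and consumer stages.
produce() { printf 'spam 3\nham 5\n'; }
consume() { tr 'a-z' 'A-Z'; }

# Run the two stages one after the other through a temp file,
# so they never execute concurrently as they would in a pipe.
run_staged() {
    tmp=$(mktemp) || return 1
    # Stage 1: the producer runs to completion before anything reads its output.
    produce >"$tmp" || { rm -f "$tmp"; return 1; }
    # Stage 2: the consumer starts only after the producer has exited.
    consume <"$tmp"
    rc=$?
    rm -f "$tmp"
    return $rc
}

run_staged
```

The exit status of each stage is checked separately here, which a plain
pipe does not do for its left-hand side.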


Two notes:

For debugging, I modified print_error() to display progname, file, and
lineno.

The point of a .mc file is to speed up scoring (for running tests).  A
.mc file gives bogofilter all the info it needs to score a message
without having to look up the tokens in a wordlist, so bogofilter
doesn't need to open wordlist.db (and there's no reason for it to do so).


