bogofilter over NFS

Piotr KUCHARSKI chopin at sgh.waw.pl
Fri Feb 7 18:35:43 CET 2003


On Fri, Feb 07, 2003 at 12:35:45PM +0100, Matthias Andree wrote:
> OK, can you give more detail what exactly fails, like running sotruss or
> truss on the original bogofilter code that loops unterminatedly? 

Sure!
 stat("./checks.16676.20030206T100351/", 0xEFFFF7B0) = 0
 getpid()                                        = 16698 [16697]
 open64("././checks.16676.20030206T100351/goodlist.db", O_RDONLY) = 3
 fcntl(3, F_SETFD, 0x00000001)                   = 0
 fstat64(3, 0xEFFFF8D8)                          = 0
 llseek(3, 0, SEEK_SET)                          = 0
 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 256)      = 256
 close(3)                                        = 0
 stat64("/var/tmp", 0xEFFFF0E0)                  = 0
 brk(0x00043640)                                 = 0
 brk(0x00047640)                                 = 0
 open64("./checks.16676.20030206T100351/goodlist.db", O_RDONLY) = 3
 fcntl(3, F_SETFD, 0x00000001)                   = 0
 fstat64(3, 0xEFFFF908)                          = 0
 mmap64(0x00000000, 16384, PROT_READ, MAP_PRIVATE, 3, 0) = 0xEF760000
*fcntl(3, F_SETLK, 0xEFFFFBA8)                   Err#11 EAGAIN
 munmap(0xEF760000, 16384)                       = 0
 close(3)                                        = 0
 poll(0xEFFFDBD0, 0, 1823)                       = 0
 getpid()                                        = 16698 [16697]
 [ and over again from getpid() ]

> I'd
> like to see what BerkeleyDB returns as error and possibly handle it by
> retrying without mmap.

BerkeleyDB doesn't return any error, it is bogofilter looping over fcntl(),
as I wrote in my first letter.

> What you can try and what should help is: tell BerkeleyDB not to mmap()
> the file. Edit datastore_db.c, and find the location shown below and
> insert the line
> opt_flags |= DB_NOMMAP;
> before the /* open data base */ comment, as shown below near line #128;

That did it! All tests passed (or were skipped), no looping. I guess that should
be somewhere in configure. --enable-olderthanSolaris8NFSfcntlhack :)

(Thanks for the detailed instructions, but I'm a coder myself, so there
was no need.)

A little fun: while make check is in progress, it runs
'rm -rf core checks.pid.blabla', and on NFS that may loop: when a
file that is still open gets deleted, the NFS client turns it into a
temporary .nfsXXXX file, and when rm deletes that one, it just gets
recreated. In fact, it always happens in the valgrind test. Killing
rm makes the valgrind test fail, and I don't quite see how to fix it.
Hmm. So I sent SIGSTOP to rm, and what does fuser tell me?
# fuser -u .nfs08B93
.nfs08B93:    27722o(chopin)   27712o(chopin)
# ps -fp 27722,27712
     UID   PID  PPID STIME    CMD
  chopin 27722 27712 17:34:16 rm -r -f core ./checks.27712.20030207T173416
  chopin 27712 27639 17:34:16 /bin/bash ./t.valgrind

Hm, why is bash keeping this file open? As long as it does that,
NFS will keep recreating .nfsXXXX...
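
The underlying behaviour can be seen on a local filesystem too: a
descriptor held open by the shell keeps a deleted file alive, only
without a visible name, whereas an NFS client has to fake this with the
.nfsXXXX rename. A minimal sketch, assuming a POSIX sh and mktemp -d;
demo_dir and held.out are made-up names, not part of the test suite:

```shell
# Sketch: a descriptor held by the shell outlives rm on the file.
# demo_dir and held.out are hypothetical names for this illustration.
demo_dir=$(mktemp -d) || exit 1

exec 3>"$demo_dir/held.out"   # the shell now holds fd 3 open
rm -f "$demo_dir/held.out"    # the name is gone (NFS would leave .nfsXXXX)

# Writing through fd 3 still works: the file itself is not released yet.
if echo "still writable" >&3; then
    held_status="open after rm"
else
    held_status="closed"
fi

exec 3>&-                     # close fd 3; only now is the file released

# With fd 3 closed, the same write fails.
if ( echo "again" >&3 ) 2>/dev/null; then
    closed_status="still open"
else
    closed_status="closed"
fi

rm -rf "$demo_dir"
echo "$held_status, then $closed_status"
```

So rm can only finish once whoever holds the descriptor lets go of it.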
After playing around a little I found that it's vg.out, so I changed
the script a little:

exec 3>"$TMPDIR/vg.out"
VALGRIND="valgrind -q --num-callers=20 --logfile-fd=3"
# skip (exit 77) if valgrind is unusable; point fd 3 away from vg.out first
( $VALGRIND true ) 2>/dev/null || { exec 3>/dev/null; exit 77; }
( $VALGRIND false ) 2>/dev/null && { exec 3>/dev/null; exit 77; }

and added (just in case, as I have no valgrind, so I cannot test)
"exec 3>/dev/null" at the end of t.valgrind
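
A side note on lists of the form A || B && C in a script like this: sh
gives || and && equal precedence and groups them left to right, so the
list is read as (A || B) && C, and C also runs when A succeeds. A brace
group binds the cleanup and the exit to one branch only. A small sketch
of the difference; reached_list and reached_group are made-up marker
variables:

```shell
# Sketch: how sh groups "A || B && C" versus a brace group.
# reached_list and reached_group are hypothetical marker variables.
reached_list=no
reached_group=no

# Parsed as (true || :) && reached_list=yes, so the assignment
# runs even though the first command succeeded.
true || : && reached_list=yes

# With braces, both commands belong to the failure branch only,
# so nothing inside the group runs when the first command succeeds.
true || { :; reached_group=yes; }

echo "list: $reached_list, group: $reached_group"
```

In a test script where the last command is exit 77, the first form
would skip the test even on a machine where valgrind works.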

p.

-- 
Beware of he who would deny you access to information, for in his
heart he dreams himself your master.   -- Commissioner Pravin Lal
http://nerdquiz.sgh.waw.pl/  -- polska wersja quizu dla nerdów ;)