bogofilter over NFS

Matthias Andree matthias.andree at gmx.de
Sun Feb 9 23:54:36 CET 2003


Piotr KUCHARSKI <chopin at sgh.waw.pl> writes:

>> OK, can you give more detail on what exactly fails, e.g. by running
>> sotruss or truss on the original bogofilter code that loops forever?
>
> Sure!
>  open64("./checks.16676.20030206T100351/goodlist.db", O_RDONLY) = 3
>  fcntl(3, F_SETFD, 0x00000001)                   = 0
>  fstat64(3, 0xEFFFF908)                          = 0
>  mmap64(0x00000000, 16384, PROT_READ, MAP_PRIVATE, 3, 0) = 0xEF760000
> *fcntl(3, F_SETLK, 0xEFFFFBA8)                   Err#11 EAGAIN

Yup, exactly as documented. I committed a workaround to the current CVS
code that is supposed to automatically retry with DB_NOMMAP should the
fcntl F_SETLK fail with EAGAIN.

Regrettably, EAGAIN isn't very distinctive; it can mean "the lock you
requested is held by another process" (which is temporary) or it can
mean "you must not lock this file because it's mmap()ed" (which is
permanent), so we have to try at run time and see what works. OTOH, I'd
rather not go without mmap() altogether because I fear that would be
very slow on slower CISC machines.

> That did it! All tests passed (or skipped), no looping. I guess that should
> be somewhere in configure. --enable-olderthanSolaris8NFSfcntlhack :)

I'd rather not bother the user with such portability junk. The user can
reasonably expect software that "just runs". Getting fcntl to fly is
no rocket science.

> (Thanks for detailed info, but I'm a coder myself, there was no need
> to)

Yes, but I think verbosity doesn't hurt on a users' list (we also have
the bogofilter-dev mailing list). Readers may not be aware of the
intricacies, and someone might look at the mail archive and feel
interested in trying some debugging on another OS.

> A little fun: when make check is in progress, it does
> 'rm -rf core check.pid.blabla' -- on NFS it may loop, because
> when we delete a file that is in use, NFS temporarily creates
> .nfsXXXX files, and when we delete those, NFS keeps recreating them.

> In fact, it happens always in valgrind, hm. Killing rm makes
> valgrind fail and I don't quite see the way to fix it. Hmm.
>
> Hm, why is bash keeping this file open? As long as it does that,
> NFS will keep recreating .nfsXXXX...
> After a little play I've found it's vg.out; I changed a little:
>
> exec 3>$TMPDIR/vg.out
> VALGRIND="valgrind -q --num-callers=20 --logfile-fd=3"
> ( $VALGRIND true ) 2>/dev/null || exec 3>/dev/null && exit 77
> ( $VALGRIND false ) 2>/dev/null && exec 3>/dev/null && exit 77
>
> and added (just in case, as I have no valgrind, so I cannot test)
> "exec 3>/dev/null" at the end of t.valgrind

Thanks for the detailed analysis.
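
A side note on the first of the two quoted probe lines, as a minimal
sketch: in POSIX sh, "||" and "&&" have equal precedence and associate
to the left, so "A || B && C" runs C whenever A succeeds. The stand-ins
below are assumptions for the demo and not part of t.valgrind: "true"
and "false" play a working and a broken valgrind, ": cleanup" plays
"exec 3>/dev/null", and "echo skip" plays "exit 77".

```shell
#!/bin/sh
# Stand-ins (hypothetical): "true"/"false" replace "( $VALGRIND true )",
# ": cleanup" replaces "exec 3>/dev/null", "echo skip" replaces "exit 77".

# Flat form, as quoted: parses as ( probe || cleanup ) && skip
flat=`true || : cleanup && echo skip`

# Grouped form: skip only when the probe fails
grouped=`true || { : cleanup; echo skip; }`

# With a failing probe, the grouped form still skips
failing=`false || { : cleanup; echo skip; }`

echo "flat=[$flat] grouped=[$grouped] failing=[$failing]"
# prints: flat=[skip] grouped=[] failing=[skip]
```

With the flat form the test would be skipped even when the probe
succeeds; braces around the cleanup-and-exit part avoid that.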

exec 3>&- would be the proper magic.  The change you suggest is not
robust enough IMO; I moved the redirection into the VALGRIND variable
itself:

[...]
| : ${srcdir=.}
| relpath="`pwd`/../.."
| NODB=1 . ${srcdir}/../t.frame
| 
| VALGRIND="valgrind 3>${TMPDIR}/vg.out -q --num-callers=20 --logfile-fd=3"
| ( eval $VALGRIND true  ) 2>/dev/null || exit 77
| if ( eval $VALGRIND false ) 2>/dev/null ; then exit 77 ; fi
[...]

This is safe because we check the vg.out file right after calling
VALGRIND. This leaves room for further changes
(--logfile=${TMPDIR}/vg.out), but I'll leave it this way for "maintainer
efficiency" (a.k.a. laziness); it's equivalent and works.
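
To illustrate the descriptor lifetimes involved, here is a minimal
sketch, assuming a POSIX sh. The scratch file name and the fd3_state
helper are made up for the demo, and "sh -c" stands in for valgrind.
It compares a shell-wide "exec 3>file", an explicit "exec 3>&-", and a
redirection attached to a single command:

```shell
#!/bin/sh
# Hypothetical scratch file; the real script uses ${TMPDIR}/vg.out.
log="${TMPDIR:-/tmp}/fd-demo.$$"

# Report whether fd 3 is open in the current shell (hypothetical helper).
fd3_state() {
    if ( : >&3 ) 2>/dev/null; then echo open; else echo closed; fi
}

exec 3>"$log"                      # shell-wide: fd 3 stays open from here on
after_exec=`fd3_state`

exec 3>&-                          # closes fd 3 instead of re-pointing it
after_close=`fd3_state`

sh -c 'echo logged >&3' 3>"$log"   # per-command: fd 3 exists only in the child
after_cmd=`fd3_state`

rm -f "$log"
echo "$after_exec $after_close $after_cmd"
# prints: open closed closed
```

With the per-command form the shell itself never holds the log file
open, so a later rm -rf on NFS should not run into a .nfsXXXX file on
the shell's account.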

BTW, you cannot have valgrind because it currently works only on
ix86-based Linux. You may have heard of the commercial Purify tool;
valgrind has similar goals.

-- 
Matthias Andree



