[henning at makholm.net: Bug#218638: bogofilter: deadlocks when controlling through pipes]

Clint Adams schizo at debian.org
Mon Nov 17 15:58:20 CET 2003


Forgot to forward this earlier.

----- Forwarded message from Henning Makholm <henning at makholm.net> -----

The documentation for bogofilter's "-b" option implies that it should
be possible for a script to start a "bogofilter -T -b" process in the
background and have it perform multiple classifications by feeding it
requests through a stdin pipe and reading verdicts from stdout.

However, when I tried this I found it hard to synchronize the
transactions between bogofilter and the parent/client.

First and most important: When bogofilter outputs to a pipe rather
than a terminal, it uses block-buffered output (that being libc's
default choice for a pipe).  That means that the parent/client cannot
expect to read the result of one analysis before requesting the next
one. I think that the output stream should be flushed explicitly after
each analysis.
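
The flush suggested above can be sketched in C as follows. This is a
minimal illustration, not bogofilter's actual code: emit_verdict() and
verdict_visible_after_emit() are hypothetical names, and the
"name class score" line format is simply copied from the example output
further down in this report. The second function checks the idea by
writing a verdict into a pipe and polling the read end without waiting.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for fdopen(), pipe(), poll() */
#endif
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write one verdict line and flush it right away,
 * so a reader on the other end of a pipe sees it without delay. */
static int emit_verdict(FILE *out, const char *name, char cls, double score)
{
    assert(out != NULL && name != NULL);
    if (fprintf(out, "%s %c %g\n", name, cls, score) < 0)
        return -1;
    return fflush(out) == 0 ? 0 : -1;
}

/* Returns 1 if the verdict is readable from the pipe immediately after
 * emit_verdict(), 0 if it is not, -1 if the pipe could not be set up. */
static int verdict_visible_after_emit(void)
{
    int fds[2];
    char buf[64] = "";
    ssize_t n = 0;
    struct pollfd p;
    FILE *out;

    if (pipe(fds) != 0)
        return -1;
    out = fdopen(fds[1], "w");      /* a pipe, so stdio buffers in blocks */
    if (out == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (emit_verdict(out, "named-pipe", 'S', 0.25318) != 0) {
        fclose(out);
        close(fds[0]);
        return -1;
    }
    p.fd = fds[0];
    p.events = POLLIN;
    p.revents = 0;
    if (poll(&p, 1, 0) > 0)         /* timeout 0: do not wait for data */
        n = read(fds[0], buf, sizeof buf - 1);
    fclose(out);
    close(fds[0]);
    return n > 0 && strcmp(buf, "named-pipe S 0.25318\n") == 0;
}
```

Without the fflush() call in emit_verdict(), the line would stay in the
stdio buffer and the poll would find nothing to read.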

Second: In my application, the parent/client process held the messages
to be analyzed in memory rather than on disk. So I tried using a named
pipe to feed the message into bogofilter, with client code reading
something like (in perl):

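   sub bogoclassify {
       # Send the FIFO name as one newline-terminated line.
       # Assumes BOGOFILTER_STDIN is unbuffered (e.g. autoflush enabled).
       print BOGOFILTER_STDIN "named-pipe\n" ;
       # Opening the FIFO for writing blocks until bogofilter opens it
       # for reading.
       open MSGPIPE, ">named-pipe" or die "open named-pipe: $!" ;
       print MSGPIPE @_ ;
       close MSGPIPE ;
       # Wait for the verdict line, e.g. "named-pipe S 0.25318".
       my $x = <BOGOFILTER_STDOUT> ;
       if( defined $x && $x =~ /named-pipe [HUS] ([-+0-9.e]+)/ ) {
           return $1 ;
       } else {
           return 'huh?' ;
       }
   }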

Running this subroutine twice in a row led to a race condition:

Parent/client                 bogofilter
  write "named-pipe\n"
                                gets "named-pipe\n"
  open "named-pipe" O_WRONLY    open "named-pipe" O_RDONLY (synchronized)
  pipe out first message
                                read in first message
  close "named-pipe"            (chew on the message)
                                write out "named-pipe S 0.25318\n"
  gets "named-pipe S 0.25318\n"
  write "named-pipe\n"
                                gets "named-pipe\n"
  open "named-pipe" O_WRONLY
  pipe out second message
                                close "named-pipe"
                                open "named-pipe" O_RDONLY
  <stuck waiting for answer>    <stuck waiting for message>

The problem is that bogofilter only closes the input file after having
read the name of the next file (at the beginning of open_mailstore()).
At that point the client may already have tried sending the next
message to be classified, and at least with some Linux kernels
(observed on 2.2.20), these data do not become available the next time
the pipe is opened.

In order to prevent such deadlocks, bogofilter must close its input
file before writing out the (last) answer for it. The following patch
makes it do that, and also adds the flush after a non-passthrough output.
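
The ordering this calls for (finish reading, close the input, and only
then answer) can be sketched as below. It is an illustration under
assumptions, not the patch itself: classify_one() and
classify_one_selfcheck() are hypothetical names, the byte count stands in
for a real score, "U" is used as a placeholder class, and the check uses a
regular temporary file rather than a FIFO so that it can run in a single
process.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for mkstemp() */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical per-message step: read the input to EOF, close it, and
 * only then write and flush the answer. */
static int classify_one(const char *path, FILE *out)
{
    FILE *in = fopen(path, "r");
    long nbytes = 0;
    int bad;

    if (in == NULL)
        return -1;
    while (fgetc(in) != EOF)        /* stand-in for real tokenizing */
        nbytes++;
    bad = ferror(in);
    if (fclose(in) != 0 || bad)     /* input is closed before answering */
        return -1;
    if (fprintf(out, "%s U %ld\n", path, nbytes) < 0)
        return -1;
    return fflush(out) == 0 ? 0 : -1;
}

/* Returns 1 if classify_one() reports the expected line for a six-byte
 * message, 0 if it does not, -1 if the temporary files cannot be made. */
static int classify_one_selfcheck(void)
{
    char path[] = "/tmp/bogodemoXXXXXX";
    char line[128] = "", expect[128];
    FILE *out;
    int ok = 0;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    if (write(fd, "hello\n", 6) != 6) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);
    out = tmpfile();
    if (out == NULL) {
        unlink(path);
        return -1;
    }
    if (classify_one(path, out) == 0) {
        rewind(out);
        snprintf(expect, sizeof expect, "%s U 6\n", path);
        ok = fgets(line, sizeof line, out) != NULL
             && strcmp(line, expect) == 0;
    }
    fclose(out);
    unlink(path);
    return ok;
}
```

With a FIFO as input, closing before answering means the reader has
already let go of the pipe by the time the client sees the verdict and
reopens it for the next message.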

I have tried hard to make the patch not break the other input modes
- but it took me several attempts to do that, so probably someone more
familiar with the code than me should look it over before applying it
to official versions.


diff -ur bogofilter-0.15.8.old/src/passthrough.c bogofilter-0.15.8/src/passthrough.c
--- bogofilter-0.15.8.old/src/passthrough.c	Tue Oct 28 23:48:11 2003
+++ bogofilter-0.15.8/src/passthrough.c	Sat Nov  1 19:43:18 2003
@@ -230,6 +230,8 @@
 	if (fflush(fpo) || ferror(fpo) || (fpo != stdout && fclose(fpo))) {
 	    cleanup_exit(2, 1);
 	}
+    } else {
+        fflush(fpo);
     }
 }
 
@@ -288,6 +290,11 @@
 	}
 	fprintf(fpo, "passthrough mode: %s\n", m);
     }
+}
+
+int passthrough_keepopen()
+{
+    return passthrough && passmode == PASS_SEEK ;
 }
 
 void passthrough_cleanup()
diff -ur bogofilter-0.15.8.old/src/bogofilter.c bogofilter-0.15.8/src/bogofilter.c
--- bogofilter-0.15.8.old/src/bogofilter.c	Sun Oct 12 22:30:10 2003
+++ bogofilter-0.15.8/src/bogofilter.c	Sat Nov  1 19:43:18 2003
@@ -99,6 +99,9 @@
 	collect_words(w);
 	msgcount += 1;
 
+        if( !passthrough_keepopen() )
+            bogoreader_closeifeof();
+        
 	if (register_opt && DEBUG_REGISTER(1))
 	    fprintf(dbgout, "Message #%ld\n", (long) msgcount);
 	if (register_bef)
diff -ur bogofilter-0.15.8.old/src/bogoreader.c bogofilter-0.15.8/src/bogoreader.c
--- bogofilter-0.15.8.old/src/bogoreader.c	Thu Oct 16 16:40:01 2003
+++ bogofilter-0.15.8/src/bogoreader.c	Sat Nov  1 20:21:46 2003
@@ -108,6 +108,9 @@
     size_t i;
     int c;
     reader_line_t *fcn = mailbox_getline;
+
+    if (fp==NULL)
+        return simple_getline ; /* which will return EOF immediately */
     
     c = fgetc(fp);
     ungetc(c, fp);
@@ -639,6 +642,15 @@
 	fprintf(stderr, "Can't read '%s'\n", name);
 	exit(EX_ERROR);
     }
+}
+
+/* cleanup after reading a message, exported. */
+/* Only called if the passthrough code says it is ok to close the file. */
+
+void bogoreader_closeifeof(void)
+{
+    if (fpin && feof(fpin))
+       bogoreader_close();
 }
 
 /* global cleanup, exported */
diff -ur bogofilter-0.15.8.old/src/fgetsl.c bogofilter-0.15.8/src/fgetsl.c
--- bogofilter-0.15.8.old/src/fgetsl.c	Sun Sep  7 02:40:17 2003
+++ bogofilter-0.15.8/src/fgetsl.c	Sat Nov  1 19:43:18 2003
@@ -29,7 +29,7 @@
 	abort();
     }
 
-    if (feof(in))
+    if (in == NULL || feof(in))
 	return(EOF);
 
     while ((cp < fin) && ((c = getc(in)) != EOF)) {
diff -ur bogofilter-0.15.8.old/src/bogoreader.h bogofilter-0.15.8/src/bogoreader.h
--- bogofilter-0.15.8.old/src/bogoreader.h	Tue Sep  2 01:10:39 2003
+++ bogofilter-0.15.8/src/bogoreader.h	Sat Nov  1 19:43:18 2003
@@ -18,6 +18,7 @@
 /* Function Prototypes */
 
 extern void bogoreader_init(int argc, char **argv);
+extern void bogoreader_closeifeof(void);
 extern void bogoreader_fini(void);
 void bogoreader_name(const char *name);
 
diff -ur bogofilter-0.15.8.old/src/passthrough.h bogofilter-0.15.8/src/passthrough.h
--- bogofilter-0.15.8.old/src/passthrough.h	Mon Sep 22 01:08:44 2003
+++ bogofilter-0.15.8/src/passthrough.h	Sat Nov  1 19:43:18 2003
@@ -14,6 +14,7 @@
 extern FILE *fpo;
 extern void passthrough_setup(void);
 extern void passthrough_cleanup(void);
+extern int  passthrough_keepopen(void);
 extern void write_message(rc_t status);
 extern void write_log_message(rc_t status);
 extern void output_setup(void);




