t.bulkmode problem

David Relson relson at osagesoftware.com
Tue Nov 23 01:12:33 CET 2004


On Tue, 23 Nov 2004 00:00:36 +0100
Matthias Andree wrote:

> David Relson <relson at osagesoftware.com> writes:
> 
> > t.bulkmode seems to be running into a problem with pipes.
> 
> bogotune opened the environment twice, jamming locks, causing
> bogofilter to trigger recovery, which didn't work as bogotune was
> still running.
> 
> I've needed to apply some minor fixes throughout bogotune.c,
> particularly to OPTIONS and dbgout default.
> 
> It passes make check now, but we're still not ready to release, the
> multiple-environment is still not backed by a multiple-lock scheme.

Multiple wordlists are not commonly used, AFAICT.  Why not document the
limitation and go ahead with the release?  One possibility is to
recommend, for the time being, disabling transactions (use olddb) when
multiple wordlists are configured.
 
> There's also a problem in the message-count parser. Somehow, flex can
> propagate junk that starts with a leading space through to collect.c,
> which causes a segfault in wordhash_insert because we're stuffing
> ULONG_MAX in the marked lines. For some reason, we cannot assume that
> we don't have a leading space.
> 
> .       if (cls == BOGO_LEX_LINE)
> .       {
> .           char *s = (char *)(yylval->text+1);
> >           char *f = strchr(s, ' ') - 1;
> .           token->text = (unsigned char *) s;
> >           token->leng = f - s;
> .       }

Message-count files are only used to speed up scoring when running
scoring tests, so weaknesses in their handling don't affect production
use.  Did you encounter the problem with a real file, or with something
you created another way?  Having spaces in tokens is bogus -- to the
parser, spaces are always delimiters.
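
For illustration only, here is a minimal sketch of extracting a token
while treating spaces purely as delimiters, so that neither a leading
space nor a missing delimiter can produce a bogus length.  The helper
name extract_token and the assumed line layout (optional leading
spaces, the token, a space, then the counts) are hypothetical and not
taken from collect.c; the sketch also leaves out the +1/-1 offset
adjustments in the quoted code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of bogofilter: locate the token in a
 * message-count style line.  Assumed layout: optional leading spaces,
 * the token, then a space followed by the counts.
 * Returns 0 and sets *start and *len on success, -1 otherwise. */
static int extract_token(const char *line, const char **start, size_t *len)
{
    const char *s = line;
    const char *end;

    while (*s == ' ')           /* tolerate leading spaces */
        s++;

    end = strchr(s, ' ');       /* the token ends at the next delimiter */
    if (end == NULL)
        return -1;              /* no delimiter: reject, don't do pointer math on NULL */

    *start = s;
    *len = (size_t)(end - s);
    return 0;
}
```

Checking the strchr() result before subtracting pointers is the main
point: with no delimiter present, the computed length would otherwise
be garbage.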

Regards,

David
