bogotune broken? Larger data* revision committed to CVS.

Matthias Andree matthias.andree at gmx.de
Mon Nov 22 00:00:50 CET 2004


On Sat, 20 Nov 2004, David Relson wrote:

> > We need to use a different means to access somebody else's database,
> > and inter-process communication comes to mind. For instance RPC, FIFO
> > or unix-domain or perhaps IP sockets.
> 
> Ah!  So the problem is an inherent limitation of locking.  Why not
> document it, so bogofilter admins will know it and can choose to live
> with the limitation.

Yes, it is a locking limitation, and a trust limitation: BerkeleyDB
has no access control. If you can access all the files, you can access
the database; if you don't have access to the locking region, then you
cannot access the file. The old fcntl() locking appears to be a bit
laxer, but I'm not sure that isn't a design flaw in POSIX file
locking.

> Bogofilter only writes to 1 wordlist in any given execution.  When
> dealing with multiple wordlists, only the first "regular", i.e.
> non-ignore, list is opened with write permission.  All others are always
> opened read-only.  I wonder if the code can take advantage of that
> knowledge.

We need write access to the environment even if we only read from the
database. That is the price we pay for concurrent access.

> Having pure functions is of value.  Putting lots of (formerly) global
> variables into a structure to purify the functions doesn't seem
> valuable.  I'd rather see accessor functions defined as inline
> functions.

That cannot work in portable C. The variable is in a different
compilation unit, but to inline the accessor function, the compiler
must see the function body, and with it the variable, in the
compilation unit of the caller (and C89 has no inline keyword at all).
Hence the variable must be visible to its users anyway, and in that
case the accessor function becomes redundant.

> At least, grep could then be used to determine whether the
> variable is being read or modified.

True.

> > I'm not in favor of accessor functions for static variables, we aren't
> > gaining anything through them except additional complexity. We may as
> > well do it right and allocate dynamically.
> 
> "right" and "dynamically" would be accurate if we were using an object
> oriented language with class variables.  As we're not doing that,
> whatever we have is a compromise -- and I can live with that fact.

We are effectively emulating OOP through abstraction layers.
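
To illustrate what emulating OOP through an abstraction layer can look
like in C, here is a minimal sketch. It is not bogofilter code: the
names backend, backend_ops, mem_state and the mem_* functions are all
hypothetical. State lives in a structure passed to every call instead
of in global variables, and a table of function pointers plays the role
of virtual methods.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct backend;

/* "Virtual method table": one entry per operation (hypothetical names). */
struct backend_ops {
    int  (*get)(struct backend *self, const char *word);
    void (*put)(struct backend *self, const char *word, int count);
};

/* "Object": a pointer to its methods plus per-instance state. */
struct backend {
    const struct backend_ops *ops;
    void *state;
};

/* A toy in-memory implementation that remembers a single word. */
struct mem_state {
    char word[32];
    int  count;
};

static int mem_get(struct backend *self, const char *word)
{
    struct mem_state *s = self->state;
    return strcmp(s->word, word) == 0 ? s->count : 0;
}

static void mem_put(struct backend *self, const char *word, int count)
{
    struct mem_state *s = self->state;
    strncpy(s->word, word, sizeof s->word - 1);
    s->word[sizeof s->word - 1] = '\0';   /* strncpy may not terminate */
    s->count = count;
}

static const struct backend_ops mem_ops = { mem_get, mem_put };

/* "Constructor": binds the state to the in-memory method table. */
void mem_backend_init(struct backend *b, struct mem_state *s)
{
    s->word[0] = '\0';
    s->count = 0;
    b->ops = &mem_ops;
    b->state = s;
}

/* Callers only use these wrappers and never see the implementation. */
int backend_get(struct backend *b, const char *word)
{
    assert(b != NULL && b->ops != NULL);
    return b->ops->get(b, word);
}

void backend_put(struct backend *b, const char *word, int count)
{
    assert(b != NULL && b->ops != NULL);
    b->ops->put(b, word, count);
}
```

After mem_backend_init(&b, &st) and backend_put(&b, "spam", 3), a call
to backend_get(&b, "spam") returns 3 and any other word returns 0.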

> > Presentation is a matter of the editor software. emacs's speedbar for
> > one. I rarely access functions by scrolling, I usually access via tags
> > - way faster regardless of where the function is, and the static
> > declarations confuse the tags generator.
> 
> The compiler imposes its own ordering requirements.  For example, that's
> why main() is usually at the end of a file even though it's the first
> function to execute.  Including forward declarations allows functions to
> be ordered as seems most logical or convenient.

True. We aren't getting anywhere as we have different opinions on that,
so I'm willing to go with whatever we have. I'll need to try what
happens if I take GNU ctags away from emacs and give it Darren
Hiebert's Exuberant Ctags instead.

> > > The handler was needed to ensure the database gets closed.  If that
> > > assurance is no longer necessary, it can be dropped.
> > 
> > We may rather need to fix the paths out of our software for those
> > error conditions that are recoverable and/or do not damage the
> > database.
> 
> atexit() provides a poor man's form of exceptions.  'Tis too bad that
> proper exception handling requires a language like C++ (which would cost
> us portability).

Poor man's exceptions are called setjmp ("try/catch" in one) and longjmp
("throw"), plus the sig* variants. setjmp and longjmp are even ANSI C,
which has been around for 15 years, so there'll be no excuse if some
system doesn't have these (the sig* variants come from POSIX). :-)

> > "make check" doesn't appear to run exit paths that fail to close the
> > databases, except perhaps t.bulkmode. You'll find out.
> 
> I'll tweak bogofilter to let us know when (if) it reaches the exit
> handler before closing the databases.

Good idea.
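
One way such a check could look is sketched below. This is only an
assumption about the approach, not the actual bogofilter change: the
counter and the functions db_mark_open, db_mark_closed, db_open_handles
and install_exit_check are hypothetical. The handler registered with
atexit() prints a warning if it runs while a database is still marked
open.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static int open_handles = 0;        /* hypothetical count of open databases */

/* To be called by the (hypothetical) open and close routines. */
void db_mark_open(void)
{
    open_handles++;
}

void db_mark_closed(void)
{
    assert(open_handles > 0);       /* closing more than was opened is a bug */
    open_handles--;
}

int db_open_handles(void)
{
    return open_handles;
}

/* Exit handler: report when exit() is reached with databases still open. */
static void exit_check(void)
{
    if (open_handles > 0)
        fprintf(stderr,
                "exit handler reached with %d database(s) still open\n",
                open_handles);
}

/* Register the handler; atexit() returns 0 on success. */
int install_exit_check(void)
{
    return atexit(exit_check);
}
```

Calling install_exit_check() once at startup is enough. Any exit path
that skips the close routine then shows up on stderr, for example
during "make check".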

-- 
Matthias Andree


