subtle "algorithm" definition issue.

David Relson relson at osagesoftware.com
Sun Nov 3 20:37:06 CET 2002


At 01:57 PM 11/3/02, Matthias Andree wrote:

>On Sun, 03 Nov 2002, David Relson wrote:
>
> > The statement in bogofilter.h wasn't intended to declare the variable.  It
> > probably should be the enum statement in your change.  However, when I do
> > that, I see that global variable "algorithm" is used in programs
> > bogofilter, bogoutil, configtest, and debugtest.  Bogofilter has at least
> > one other global variable that I see declared in the main routines of
> > several programs, i.e. run_type.
>
>Well, yes.
>
> > This, and the implementation of config.c, leads to the broader questions:
> >
> >         Where do global variables belong?
> >         Where do variables for configuration options belong?
>
>setup_lists does not need the algorithm. It's only ever called from
>main.c, and I have main.c now pass in the good/bad weights, with the
>latter fixed to 1.0 and the former to whatever GOOD_BIAS expands to. (We
>might consider making GOOD_BIAS a function though.)

bogofilter.c includes init_constants(), which is the function that knows 
the Robinson/Graham differences for constants used during calculating 
spamicity.  I think that good_bias should be set there.  If we want to be 
object oriented, let's add accessor functions get_spam_bias() and 
get_good_bias() to bogofilter.c and let setup_lists() use them.


>That way, the global variable which does definitely not belong into
>wordhash.? can be kept out of this.
>
> > The answer that comes to mind first is:
> >
> >         create a globals.c to hold run_type, algorithm, etc.
> >         use config.c as the home for configuration options.
> >
> > I suspect some variables fit both categories (config and global).  I doubt
> > there'll be any code in globals.c, so maybe it shouldn't even
> > exist.  Perhaps we should put them all in config.c.
>
>I believe we have a problem with the interfaces, not with where the
>variables are. We should review all interfaces and make sure they are
>complete -- global variables usually give the hint that some interface
>is not powerful enough.
>
>Passing in a handle for the configuration should be fair enough.

I'm reminded here of the singleton pattern, which I think applies well to 
configuration options.  One implementation would be to pass a 
"configuration *" parameter (aka, a handle) to all functions that need 
it.  Unfortunately, often the routines needing the information are 
low-level so the parameter needs to be passed to a high-level routine which 
never uses it except to pass it to another routine, etc, etc.  An alternate 
approach is for a central module to "own" the global structure and have a 
pointer to the structure be available to anyone who wants use the values in 
it.  With appropriate use of "const" we should be able to prevent the 
config data from being modified (except by read_config_file() in config.c 
and by process_args() in main.c).  Since these routines deal with 
configuration and execution parameters, it might be a good idea to have 
them in the same module.





More information about the bogofilter-dev mailing list