compile time options

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Tue Sep 30 14:51:39 CEST 2003


Boris 'pi' Piwinger wrote:

> OK, I'll start from the man page (1.15.4):

Next is bogofilter.cf.example:

> #### WORDLIST: define additional word lists
> #
> #       char type: 's','g','i' (denoting spam, good, or ignore)
> #       char *name: name of list, e.g. "good", "spam", "ignore"
> #       char *path: path to file
> #       double weight - probability BIAS for list
> #       int override - skip lower valued lists
> 
> ##wordlist i,ignore,.ignorelist.db,1,0,0

Has anybody used this?

> #wordlist_mode=combined
> ##wordlist_mode=separate

We should drop the second completely.

> #       terse_format - an abbreviated form of header_format;
> #               selected by command line option '-t'

Should go with -t.

> ##### TERSE
> #
> #       if enabled, format the X-Bogosity using the 'terse_format' specification.
> 
> #terse=no

Not needed; you can set that directly.

> ##### STRICT_CHECK
> #
> #       if enabled, html comments are delimited by "<!--" and "-->".
> #       if disabled, html comments are delimited by "<!" and ">".
> 
> #strict_check=no

I think this can go.
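
To make the two delimiter rules in the quoted comment concrete, here is a rough Python sketch. The regular expressions and the name strip_comments are assumptions for illustration only; they are not taken from bogofilter's lexer.

```python
import re

# Assumption: these patterns only illustrate the two delimiter rules quoted
# above; they are not bogofilter's actual implementation.
STRICT = re.compile(r"<!--.*?-->", re.DOTALL)  # comment runs from "<!--" to "-->"
LOOSE = re.compile(r"<![^>]*>")                # comment runs from "<!" to the next ">"


def strip_comments(html, strict_check=False):
    """Remove HTML comments using the strict or the loose delimiter rule."""
    pattern = STRICT if strict_check else LOOSE
    return pattern.sub("", html)


sample = "vi<!x>ag<!-- a > b -->ra"
print(strip_comments(sample, strict_check=True))   # vi<!x>agra
print(strip_comments(sample, strict_check=False))  # viag b -->ra
```

With the strict rule the short "<!x>" form survives; with the loose rule a ">" inside a real comment ends it early, so the two settings can yield different tokens for the same message.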

> #### BLOCK ON SUBNETS
> #
> #       convert IPADDRs into a special token, url:1.2.3.4,
> #       and also return url:1.2.3, url:1.2, and url:1
> #       to allow identifying spammers by ip address / subnets.
> 
> #block_on_subnets=no

Can go.
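
For reference, the behaviour described in the quoted comment amounts to something like the following Python sketch. The function name subnet_tokens is made up for illustration; it only reproduces the token forms listed in the comment and is not bogofilter's real code.

```python
def subnet_tokens(ip):
    """Return url: tokens for an IPv4 address and each of its leading prefixes.

    Hypothetical helper: mirrors the comment in bogofilter.cf.example,
    not bogofilter's actual tokenizer.
    """
    octets = ip.split(".")
    # Full address first, then drop one trailing octet at a time.
    return ["url:" + ".".join(octets[:n]) for n in range(len(octets), 0, -1)]


print(subnet_tokens("1.2.3.4"))
# ['url:1.2.3.4', 'url:1.2.3', 'url:1.2', 'url:1']
```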

> #### CHARSET handling
> #
> #       specify default charset
> 
> #charset_default=us-ascii
> ##charset_default=iso-8859-1

Where is that used?

> ##### HEADER_LINE_MARKUP
> #
> #       if enabled, add a "subj:" prefix to tokens in the Subject: line
> 
> #header_line_markup=yes
> 
> 
> ##### IGNORE_CASE
> #
> #       if enabled, all letters are converted to lower case
> 
> #ignore_case=no
> 
> 
> #### TOKENIZE_HTML_TAGS
> #
> #       when enabled, the innards of html tags are tokenized
> 
> #tokenize_html_tags=yes

Should all go.
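
As a reminder of what two of these switches do, here is a small Python sketch of header_line_markup and ignore_case applied to a Subject: line. The function subject_tokens and the whitespace splitting are assumptions for illustration; the real tokenizer is more involved.

```python
def subject_tokens(subject, header_line_markup=True, ignore_case=False):
    """Tokenize a Subject: line under the two options quoted above.

    Hypothetical helper; splitting on whitespace is a simplification.
    """
    words = subject.split()
    if ignore_case:
        # ignore_case: convert all letters to lower case
        words = [w.lower() for w in words]
    if header_line_markup:
        # header_line_markup: add a "subj:" prefix to each token
        words = ["subj:" + w for w in words]
    return words


print(subject_tokens("Cheap Meds"))                            # ['subj:Cheap', 'subj:Meds']
print(subject_tokens("Cheap Meds", ignore_case=True))          # ['subj:cheap', 'subj:meds']
print(subject_tokens("Cheap Meds", header_line_markup=False))  # ['Cheap', 'Meds']
```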

> #### TOKENIZE_HTML_SCRIPT ---   *** NOT YET WORKING ***
> #
> #       when enabled, the innards of html script blocks are tokenized
> 
> #tokenize_html_script=no

Whatever this is supposed to do, I think we should do it (or
not), but not make it optional.

> ##### THRESHOLD Values
> #
> #       used to determine if/when spamicity
> #       values are output by print_bogostats()

This should be explained in terms of an option, not an internal
function. In any case, I'm not sure we need it.

> #### ALGORITHM
> #
> #       specify scoring algorithm
> 
> ##algorithm=graham
> ##algorithm=robinson
> #algorithm=fisher

If the algorithms go, we don't need options;-)

pi




