Floating point errors?

Matthias Andree matthias.andree at gmx.de
Wed Jul 25 18:04:50 CEST 2007


Ingomar Wesp wrote:

> I'm using KMail 1.9.7, which is KDE's default mail user agent. KMail has a 
> built-in feature called "Anti-Spam Wizard" that automatically creates filters 
> for using external anti-spam software like bogofilter or SpamAssassin. 
> Unfortunately the filter setup that is created for bogofilter looks like this 
> (stuff that is irrelevant for bogofilter has been removed):
> 
> +----------------------+----------------------------------------+-------+
> | Filter name          | Action                                 | Auto? |
> +----------------------+----------------------------------------+-------+
> | Bogofilter Check     | Pipe through    "bogofilter -p -e -u"  | Yes   |
> | Classify as SPAM     | Execute command "bogofilter -N -s"     | No    |
> | Classify as NOT SPAM | Execute command "bogofilter -S -n"     | No    |
> +----------------------+----------------------------------------+-------+
> 
> Obviously, each time the user applies "Classify as SPAM" on a message that has 
> not previously been registered (either because it's an old message that has 
> not been piped through bogofilter before or because bogofilter was unsure 
> about whether the message was ham or spam), the ham values for all tokens in 
> this message (and the .MSG_COUNT) get decremented. Which is bad, because they 
> were not wrongly incremented in the first place. The same applies 
> for "Classify as NOT SPAM" and spam-counts respectively.
> 
> If I'm not mistaken, this is a bug in KMail and should be corrected. In case 
> there's no one else with a better grip on the English language who wants to 
> do it, I'll be filing a bug report soon.

Ingomar,

Your English is decent enough for a bug report :-)

Anyway, the alternative action set, in the same order as your table
above and with no -u on the check, is:

check:    bogofilter -p -e
SPAM:     bogofilter -s
NOT SPAM: bogofilter -n

It would also speed things up a bit, as the "check" no longer involves
costly synchronous writes, and it would keep the database size smaller.
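
To make the counting problem Ingomar describes concrete, here is a rough
toy model in Python. It is only a sketch and not bogofilter's actual code:
the ToyWordList class, its method names and the sample tokens are all made
up for illustration, and it deliberately lets counts go negative so the
error is visible (what the real program does in that case is not modeled).

```python
from collections import Counter


class ToyWordList:
    """Hypothetical stand-in for a word list; NOT bogofilter's real storage."""

    def __init__(self):
        self.spam = Counter()   # per-token spam counts
        self.ham = Counter()    # per-token ham counts
        self.spam_msgs = 0      # number of registered spam messages
        self.ham_msgs = 0       # number of registered ham messages

    def register_spam(self, tokens):
        # Models the effect of "-s": count the message as spam.
        self.spam.update(set(tokens))
        self.spam_msgs += 1

    def register_ham(self, tokens):
        # Models the effect of "-n": count the message as ham.
        self.ham.update(set(tokens))
        self.ham_msgs += 1

    def unregister_ham(self, tokens):
        # Models the effect of "-N": undo an earlier ham registration.
        # The toy model does not clamp at zero (an assumption made here
        # so that a bogus decrement shows up as a negative number).
        for token in set(tokens):
            self.ham[token] -= 1
        self.ham_msgs -= 1


def trained_list():
    """Word list after two genuine ham messages have been registered."""
    wl = ToyWordList()
    wl.register_ham(["meeting", "report"])
    wl.register_ham(["report", "lunch"])
    return wl


# A spam message that was never registered before (e.g. an old message).
new_spam = ["report", "viagra"]

# KMail's generated action "bogofilter -N -s": unregister as ham, then
# register as spam, although the message was never counted as ham.
kmail = trained_list()
kmail.unregister_ham(new_spam)
kmail.register_spam(new_spam)

# The alternative action "bogofilter -s": register as spam only.
fixed = trained_list()
fixed.register_spam(new_spam)

print("KMail action: ham counts", dict(kmail.ham), "ham msgs", kmail.ham_msgs)
print("Plain -s:     ham counts", dict(fixed.ham), "ham msgs", fixed.ham_msgs)
```

In this model the "-N -s" variant lowers the ham count of "report" and the
ham message count even though the message was never registered as ham,
while the plain "-s" variant leaves the ham side untouched.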

HTH

Matthias


