compile error with 0.16.4

Joerg Over Dexia over at dexia.de
Wed Mar 10 12:13:22 CET 2004


At 15:00 09.03.2004 -0500, David Relson wrote the following to me:
->Using 0.15.13 for reference, the fix appears to be what's shown
->below.

Well, that fix only seems to change the names of the undeclared
variables:

bogoutil.c: In function `display_words':
bogoutil.c:338: `spam_msgs' undeclared (first use in this function)
bogoutil.c:339: `good_msgs' undeclared (first use in this function)

But thanks anyway for the hint. I checked the sources of 0.15.1,
and the fix seems to be:

@309:
-    unsigned long spam_count;
-    unsigned long good_count;
+    unsigned long spam_count, spam_msg_count = 0 ;
+    unsigned long good_count, good_msg_count = 0 ;

->May I ask why you need the deprecated code?  It's been removed
->from 0.17.0 and subsequent releases.

Yes, I know; that's why I wanted at least the most recent 0.16. I
believe I have answered that many times before, but anyway:

For some experiments, I want to easily exchange the spam
databases. Therefore I prefer separate wordlists. I don't mind
the computing and space overhead as much as I mind the overhead
of combining these lists s * h.

And then there is the spamicity value. Essentially, the chi-square
calculation dumbs the Robinson value down to a tri-state value of
0/0.5/1 (I'm oversimplifying here, I know). Well, I prefer to be
able to tell a 0.63456 spam from a 0.61245 spam (or even from a
0.63457 spam). I find that informative, especially when I get
several identical spams and, after registering the first one, see
a different score on the following ones. Besides, I already get a
tri-state value in the tag line (yes/no/unsure) and don't need a
numerical one in addition. (I know, to be fair, there are
sometimes 0.9998 spams and 0.0002 hams, but not very often, and
essentially information is lost. Actually, I never knew what that
chi-square squeezing was good for in the end, and probably never
will. I know what it is *supposed* to be good for: enhancing the
"confidence" of a decision and, somehow, simplifying tuning.)
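
To illustrate the "squeezing" described above, here is a minimal
sketch in Python. It is not bogofilter's code: it assumes Gary
Robinson's published formulas (geometric-mean combining, and Fisher's
chi-square combining with the usual even-degrees-of-freedom inverse
chi-square), and the function names robinson_score, chi2q and
fisher_score are made up for this example. The per-token
probabilities are invented inputs, not real wordlist data.

```python
import math


def robinson_score(probs):
    """Geometric-mean combination of per-token spam probabilities.

    Assumes every p is strictly between 0 and 1.
    """
    n = len(probs)
    # P = 1 - geometric mean of (1 - p); Q = 1 - geometric mean of p.
    p_spam = 1.0 - math.exp(sum(math.log(1.0 - p) for p in probs) / n)
    q_ham = 1.0 - math.exp(sum(math.log(p) for p in probs) / n)
    return (1.0 + (p_spam - q_ham) / (p_spam + q_ham)) / 2.0


def chi2q(x, dof):
    """Chi-square survival function, valid for even dof only."""
    m = x / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def fisher_score(probs):
    """Fisher (chi-square) combination of the same probabilities."""
    n = len(probs)
    h = chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    s = chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    return (1.0 + h - s) / 2.0


if __name__ == "__main__":
    # Two hypothetical messages whose tokens differ only slightly.
    for probs in ([0.90] * 30, [0.92] * 30):
        print("robinson=%.5f  fisher=%.5f"
              % (robinson_score(probs), fisher_score(probs)))
```

Under these assumptions the geometric-mean score keeps the two
messages apart (roughly 0.90 versus 0.92), while the chi-square
score pushes both so close to 1 that they are practically
indistinguishable.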

This is not about algorithms (since Robinson and Robinson/Fisher
should produce the same results spam/ham/unsure-wise); it's merely
about representation in the Bogofilter tag line.

Thanks again for the hint and all your efforts, regards, JO



