[PATCH] combined wordlist a.k.a. single list

David Relson relson at osagesoftware.com
Sat May 31 21:08:13 CEST 2003


Greetings Bogofilter Developers!

Today there's a treat for you :-)

At present, bogofilter's database consists of two files: goodlist.db and 
spamlist.db.  Each record contains a token, a count of messages in which 
that token had been encountered, and (optionally) a timestamp.

This patch converts bogofilter so that it stores all the data in a single 
file, wordlist.db, with records containing token, spam count, nonspam count 
and (optionally) the timestamp.
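
As a rough illustration of the new record, here is a minimal sketch in C. The type name wordlist_value_t and the helper stored_size() are made up for this example and are not part of the patch; the field order (spam count, nonspam count, date) follows the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of one single-list record value (illustration only). */
typedef struct {
    uint32_t spamcount;  /* spam messages containing the token */
    uint32_t goodcount;  /* nonspam messages containing the token */
    uint32_t date;       /* optional timestamp, YYYYMMDD; 0 when absent */
} wordlist_value_t;

/* Bytes stored for a value: both counts always, the date only if in use. */
size_t stored_size(const wordlist_value_t *v, int datestamp_tokens)
{
    if (!datestamp_tokens || v->date == 0)
        return 2 * sizeof(uint32_t);
    return 3 * sizeof(uint32_t);
}
```

With 32-bit counts, a record value takes 8 bytes without the timestamp and 12 bytes with it.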

It's logically simpler, and should give better performance, to do just one 
database lookup per token instead of two.  Testing has shown that this is 
generally the case.  However, Berkeley DB operates with a memory cache 
(defaulting to 256 Kbytes), and it has been found that a bad choice of 
cache size can lead to no performance improvement, or in some cases even to 
severe performance degradation, as discussed in the next few 
paragraphs.  We are therefore offering the patch for evaluation, and 
seeking feedback, rather than incorporating the change officially into 
bogofilter at this time.  The patch includes modifications for the 
bogoupgrade script so that you can convert an existing two-list database to 
the single-list form.

For a given database, the single wordlist.db file is typically much larger 
than either the goodlist.db or the spamlist.db from the corresponding 
two-list version.  That's because there's relatively little overlap -- 
typically only 40,000 tokens or so out of close to a million -- between the 
tokens found in spam and those found in nonspam.  Therefore, a bigger cache 
is needed when a single list is used.

The ordinary, two-list version of bogofilter has up to now used the default 
256K cache size.  Increasing this to several megabytes can considerably 
improve the speed of evaluation for large messages, but there is a gotcha: 
if the cache size is more than about 40% and less than about 90% of the 
size of the larger of goodlist.db and spamlist.db, thrashing between the 
Berkeley-DB cache and the kernel disk cache can slow throughput by a factor 
of 10 to 50 or more!  The same problem exists with the single-list version.
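
The danger zone described above can be written down as a small check. This is only a sketch: the function name cache_in_thrash_zone() is hypothetical, and the 40% and 90% limits are the approximate figures quoted above, not exact thresholds.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the cache size falls in the approximate range (more than
 * about 40% and less than about 90% of the largest database file)
 * where thrashing has been observed.  Sizes are in bytes. */
bool cache_in_thrash_zone(uint64_t cache_bytes, uint64_t largest_db_bytes)
{
    /* Integer arithmetic avoids floating point: 40% < ratio < 90%. */
    return cache_bytes * 100 > largest_db_bytes * 40 &&
           cache_bytes * 100 < largest_db_bytes * 90;
}
```

For a 100 MB list, a 50 MB cache would land in the zone, while 25 MB or 95 MB would not.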

Users of the single-list patch are advised to set the db cache size to some 
even number of megabytes, around 25% of the size of the wordlist.db 
file.  The cache size can be set by command line switch '-kddd' or by 
config file option "db_cachesize=ddd", where 'ddd' is the desired cache 
size (in megabytes).
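
For picking a value, the 25% rule of thumb can be sketched as follows. The helper suggest_cache_mb() is hypothetical, and rounding to the nearest whole megabyte with a floor of 1 is my assumption about what a sensible setting looks like.

```c
#include <stdint.h>

#define MEGABYTE (1024u * 1024u)

/* Suggest a cache size in whole megabytes: about 25% of the size of
 * wordlist.db, rounded to the nearest megabyte, never less than 1. */
unsigned suggest_cache_mb(uint64_t wordlist_bytes)
{
    uint64_t quarter = wordlist_bytes / 4;
    uint64_t mb = (quarter + MEGABYTE / 2) / MEGABYTE;  /* round to nearest */
    return mb == 0 ? 1u : (unsigned)mb;
}
```

A 64 MB wordlist.db would give 16, which would be passed as '-k16' or "db_cachesize=16".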

To convert from two lists to one list, run "bogoupgrade -d 
$BOGOFILTER_DIR".  The script will dump the two lists 
($BOGOFILTER_DIR/spamlist.db and $BOGOFILTER_DIR/goodlist.db) and load the 
tokens and counts into one list ($BOGOFILTER_DIR/wordlist.db).
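
Conceptually, a token that appears in both old lists ends up as one record carrying both counts. The sketch below shows that merge for a single token; counts_t and merge_counts() are hypothetical names, and keeping the newer of the two dates is an assumption made for this example.

```c
#include <stdint.h>

/* Hypothetical per-token counts (illustration only). */
typedef struct {
    uint32_t spamcount;
    uint32_t goodcount;
    uint32_t date;       /* YYYYMMDD, 0 when absent */
} counts_t;

/* Combine the entry from the spam dump (good count 0) with the entry
 * from the good dump (spam count 0) for the same token. */
counts_t merge_counts(counts_t a, counts_t b)
{
    counts_t out;
    out.spamcount = a.spamcount + b.spamcount;
    out.goodcount = a.goodcount + b.goodcount;
    out.date = a.date > b.date ? a.date : b.date;  /* assumed: keep newer */
    return out;
}
```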

Greg and I developed the single list code about a month ago.  Since then, 
parsing changes and the improvements they offer have been the focus of 
development efforts - resulting in the 0.13.x releases.  Bogofilter-0.13.x 
will soon become the "stable release", so it's time to make the single 
list changes available to bogofilter developers.  Once 0.13 is released as 
"stable", these patches will be the base of bogofilter-0.14.  They're being 
released at this time to allow y'all to test them and provide feedback.

Cheers!

David
-------------- next part --------------
Index: configure.ac
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/configure.ac,v
retrieving revision 1.92
diff -u -r1.92 configure.ac
--- configure.ac	31 May 2003 13:23:39 -0000	1.92
+++ configure.ac	31 May 2003 18:47:10 -0000
@@ -1,5 +1,5 @@
 # $Id: configure.ac,v 1.92 2003/05/31 13:23:39 relson Exp $
-AC_INIT(bogofilter, 0.13.4.1)
+AC_INIT(bogofilter, 0.13.4.1.tst)
 AC_PREREQ(2.53)
 AC_CONFIG_SRCDIR([src])
 AC_CANONICAL_TARGET
Index: src/bogoconfig.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/bogoconfig.c,v
retrieving revision 1.70
diff -u -r1.70 bogoconfig.c
--- src/bogoconfig.c	31 May 2003 18:46:48 -0000	1.70
+++ src/bogoconfig.c	31 May 2003 18:47:10 -0000
@@ -424,7 +424,8 @@
 #define	F "f"
 #endif
 
-#define	OPTIONS	":23bBc:Cd:DefFghI:lL:m:MnNo:O:pP:qQRrsStTuvVx:y:" G R F
+#define	OPTIONS	":23bBc:Cd:DefFghH:I:k:lL:m:MnNo:O:pqQRrsStTuvVx:y:" G R F
+
 
 /** These functions process command line arguments.
  **
@@ -518,6 +519,10 @@
 		fprintf(stderr, "Can't read file '%s'\n", optarg);
 		exit(2);
 	    }
+	    break;
+
+	case 'k':
+	    db_cachesize=atoi(optarg);
 	    break;
 
 	case 'L':
Index: src/bogoupgrade
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/bogoupgrade,v
retrieving revision 1.1
diff -u -r1.1 bogoupgrade
--- src/bogoupgrade	3 Feb 2003 16:55:15 -0000	1.1
+++ src/bogoupgrade	31 May 2003 18:47:10 -0000
@@ -7,10 +7,31 @@
 
 Author:
 Gyepi Sam <gyepi at praxis-sw.com>
+David Relson <relson at osagesoftware.com>
 
 =cut
 
-my $VERSION = '0.1';
+# bogofilter-0.3 through bogofilter-0.6.3
+#
+#	HEADER "# bogofilter email-count (format version A): %lu"
+#
+
+# bogofilter-0.7.x
+#
+#	HEADER "# bogofilter email-count (format version B): %lu"
+#
+
+# bogofilter-0.8 to bogofilter-0.13.x
+#
+# BerkeleyDB with double wordlists 
+#	spamlist.db and goodlist.db
+
+# bogofilter-0.14 and later
+#
+# BerkeleyDB with single wordlist
+#	wordlist.db
+
+my $VERSION = '0.2';
 
 my ($in, $out, $help);
 
@@ -20,7 +41,10 @@
 
   my $arg = $ARGV[$i];
 
-  if ($arg eq '-i'){
+  if ($arg eq '-d'){
+    $dir = $ARGV[++$i];
+  }
+  elsif ($arg eq '-i'){
     $in = $ARGV[++$i];
   }
   elsif ($arg eq '-o'){
@@ -42,18 +66,31 @@
   }
 }
 
+if ( $dir ) {
+    convert_double_to_single();
+}
+else {
+    die "Missing input filename\n" unless $in;
+    die "Missing output filename\n" unless $out;
 
-die "Missing input filename\n" unless $in;
-die "Missing output filename\n" unless $out;
-
-my $msg_count_token = '.MSG_COUNT';
+    my $msg_count_token = '.MSG_COUNT';
 
-open(F, $in) or die "Cannot open input file [$in]. $!.\n";
-my $sig = <F>;
-chomp($sig);
+    open(F, $in) or die "Cannot open input file [$in]. $!.\n";
+    my $sig = <F>;
+    chomp($sig);
+    if ($sig =~ m/^\# bogofilter wordlist \(format version A\):\s(\d+)$/){ 
+	convert_format_A();
+    }
+    elsif ($sig =~ m/^\# bogofilter email-count \(format version B\):\s(\d+)/){
+	convert_format_B();
+    }
+    else {
+	warn "Cannot recognize signature [$sig].\n";
+	exit(2);
+    }
+}
 
-if ($sig =~ m/^\# bogofilter wordlist \(format version A\):\s(\d+)$/){ 
-  
+sub convert_format_A() {  
   my $msg_count = $1;
   my $cmd = qq[$bogoutil $yday -l $out];
   open(OUT, "|$cmd") or die "Cannot run command [$cmd]. $!\n";
@@ -64,7 +101,8 @@
   close(OUT);
   close(F);
 }
-elsif ($sig =~ m/^\# bogofilter email-count \(format version B\):\s(\d+)/){
+
+sub convert_format_B() {
   my $msg_count = $1;
   my $in_db = $in;
   $in_db =~ s/count$/db/;
@@ -99,9 +137,26 @@
   close(F);
   close(OUT);
 }
-else {
-  warn "Cannot recognize signature [$sig].\n";
-  exit(2);
+
+sub convert_double_to_single() {  
+    $word = "$dir/wordlist.db";
+    open(F, $word) and die "$word already exists.\n";
+    close(F);
+
+    $spam = "$dir/spamlist.db";
+    open(F, $spam) or die "Cannot open file [$spam].\n";
+    close(F);
+
+    $good = "$dir/goodlist.db";
+    open(F, $good) or die "Cannot open file [$good].\n";
+    close(F);
+
+    $cmd = qq[
+    ( $bogoutil -d $spam | awk '{ printf \"%s %d 0 %d\\n\", \$1, \$2, \$3}' ; \
+      $bogoutil -d $good | awk '{ printf \"%s 0 %d %d\\n\", \$1, \$2, \$3}'   \
+      ) | sort | $bogoutil -l $word ];
+
+    system( $cmd );
 }
 
 exit(0);
Index: src/bogoutil.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/bogoutil.c,v
retrieving revision 1.28
diff -u -r1.28 bogoutil.c
--- src/bogoutil.c	19 Apr 2003 14:27:57 -0000	1.28
+++ src/bogoutil.c	31 May 2003 18:47:11 -0000
@@ -54,25 +54,30 @@
 static int db_dump_hook(word_t *key, word_t *data,
 			 /*@unused@*/ void *userdata)
 {
-    dbv_t val = {0, 0};
+    dbv_t val;
+    val.goodcount = val.spamcount = val.date = 0;
     (void)userdata;
 
     dump_count += 1;
 
-    if (data->leng != sizeof(uint32_t) && data->leng != 2 * sizeof(uint32_t)) {
+    if (data->leng != sizeof(dbv_t) && 
+	data->leng != sizeof(uint32_t) && 
+	data->leng != 2 * sizeof(uint32_t)) {
 	print_error(__FILE__, __LINE__, "Unknown data size - %ld.\n", (long)data->leng);
 	return 0;
     }
 
     memcpy(&val, data->text, data->leng);
 
-    if (!keep_count(val.count) || !keep_date(val.date) || !keep_size(key->leng))
+    if ((!keep_count(val.goodcount) && !keep_count(val.spamcount)) ||
+	!keep_date(val.date) ||
+	!keep_size(key->leng))
 	return 0;
     if (replace_nonascii_characters)
 	do_replace_nonascii_characters(key->text, key->leng);
     word_puts(key, 0, stdout);
     putchar(' ');
-    printf("%lu", (unsigned long)val.count);
+    printf("%lu %lu", (unsigned long)val.spamcount, (unsigned long)val.goodcount);
     if (val.date) {
 	printf(" %lu", (unsigned long)val.date);
     }
@@ -83,15 +88,16 @@
 struct robhook_data {
     double *sum;
     uint32_t *count;
-    void *dbh_good, *dbh_spam;
+    void *dbh;
     double scalefactor;
 };
 
-static int count_hook(word_t *key, word_t *data,
-		      void *userdata)
+static int robx_hook(word_t *key, word_t *data, 
+		     void *userdata)
 {
     struct robhook_data *rd = userdata;
 
+    dbv_t val;
     uint32_t goodness;
     uint32_t spamness;
     double   prob;
@@ -103,7 +109,7 @@
 	return 0;
 
     /* ignore short read */
-    if (data->leng < sizeof(uint32_t))
+    if (data->leng < 2 * sizeof(uint32_t))
 	return 0;
 
     if (x == NULL || key->leng + 1 > x_size) {
@@ -114,15 +120,20 @@
 
     word_cpy(x, key);
 
-    memcpy(&goodness, data->text, sizeof(uint32_t));
-    spamness = db_getvalue(rd->dbh_spam, x);
+    db_getvalues(rd->dbh, x, &val);
+    spamness = val.spamcount;
+    goodness = val.goodcount;
 
+    /* tokens in good list were already counted */
+    /* now add in tokens only in spam list */
+/*
+    if (goodness == 0) {
+*/
     prob = spamness / (goodness * rd->scalefactor + spamness);
     if (goodness + spamness >= 10) {
 	(*rd->sum) += prob;
 	(*rd->count) += 1;
     }
-
     /* print if -vv and token in both word lists, or -vvv */
     if ((verbose > 1 && goodness && spamness) || verbose > 2) {
 	printf("cnt: %4lu,  sum: %11.6f,  ratio: %9.6f,"
@@ -132,56 +143,9 @@
 	word_puts(x, 0, stdout);
 	fputc( '\n', stdout);
     }
-    return 0;
-}
-
-static int robx_hook(word_t *key, word_t *data, 
-		     void *userdata)
-{
-    struct robhook_data *rd = userdata;
-
-    uint32_t goodness;
-    uint32_t spamness;
-    double   prob;
-    static word_t *x;
-    static size_t x_size = MAXTOKENLEN + 1;
-
-    /* ignore system meta-data */
-    if (*key->text == '.')
-	return 0;
-
-    /* ignore short read */
-    if (data->leng < sizeof(uint32_t))
-	return 0;
-
-    if (x == NULL || key->leng + 1 > x_size) {
-	if (x) word_free(x);
-	x_size = max(x_size, key->leng + 1);
-	x = word_new(NULL, x_size);
-    }
-
-    word_cpy(x, key);
-
-    memcpy(&spamness, data->text, sizeof(uint32_t));
-    goodness = db_getvalue(rd->dbh_good, x);
-
-    /* tokens in good list were already counted */
-    /* now add in tokens only in spam list */
-    if (goodness == 0) {
-	prob = 1.0;
-	if (spamness >= 10) {
-	    (*rd->sum) += prob;
-	    (*rd->count) += 1;
-	}
-	if (verbose > 2) {
-	    printf("cnt: %4lu,  sum: %11.6f,  ratio: %9.6f,"
-		   "  sp: %3lu,  gd: %3lu,  p: %9.6f,  t: ", 
-		   (unsigned long)*rd->count, *rd->sum, *rd->sum / *rd->count,
-		   (unsigned long)spamness, (unsigned long)goodness, prob);
-	    word_puts(x, 0, stdout);
-	    fputc( '\n', stdout);
-	}
+/*
     }
+*/
     return 0;
 }
 
@@ -235,7 +199,7 @@
     size_t len;
     int load_count = 0;
     unsigned long line = 0;
-    unsigned long count, date;
+    unsigned long count[2], date;
     YYYYMMDD today_save = today;
 
     if ((dbh = db_open(db_file, db_file, DB_WRITE)) == NULL)
@@ -244,6 +208,7 @@
     memset(buf, '\0', BUFSIZE);
 
     for (;;) {
+	dbv_t val;
 	word_t *token;
 	if (fgets((char *)buf, BUFSIZE, stdin) == NULL) {
 	    if (ferror(stdin)) {
@@ -264,9 +229,14 @@
 	p = spanword(buf);
 	len = strlen((const char *)buf);
 
-	count = atoi((const char *)p);
-	if ((int) count < 0)
-	    count = 0;
+	spamcount = atoi((const char *)p);
+	if ((int) spamcount < 0)
+	    spamcount = 0;
+	p = spanword(p);
+
+	goodcount = atoi((const char *)p);
+	if ((int) goodcount < 0)
+	    goodcount = 0;
 	p = spanword(p);
 
 	date = atoi((const char *)p);
@@ -286,8 +256,9 @@
 
 	if (replace_nonascii_characters)
 	    do_replace_nonascii_characters(buf, len);
-
-	if (!keep_count(count) || !keep_date(date) || !keep_size(strlen((const char *)buf)))
+	if ((!keep_count(goodcount) && !keep_count(spamcount)) ||
+	    !keep_date(date) || 
+	    !keep_size(strlen((const char *)buf)))
 	    continue;
 
 	load_count += 1;
@@ -295,8 +266,10 @@
 	/* Slower, but allows multiple lists to be concatenated */
 	set_date(date);
 	token = word_new(buf, len);
-	count += db_getvalue(dbh, token);
-	db_setvalue(dbh, token, count);
+	db_getvalues(dbh, token, &val);
+	val.spamcount += spamcount;
+	val.goodcount += goodcount;
+	db_setvalues(dbh, token, &val);
 	word_free(token);
     }
     db_close(dbh, false);
@@ -350,21 +323,23 @@
 	byte buf[BUFSIZE];
 	buff_t *buff = buff_new(buf, 0, BUFSIZE);
 	while (get_token(buff, stdin) == 0) {
+	    dbv_t val;
 	    word_t *token = &buff->t;
-	    uint32_t count = db_getvalue(dbh, token);
+	    db_getvalues(dbh, token, &val);
 	    word_puts(token, 0, stdout);
-	    printf(" %lu\n", (unsigned long) count);
+	    printf(" %lu %lu\n", (unsigned long) val.spamcount, (unsigned long) val.goodcount);
 	}
 	buff_free(buff);
     }
     else
     {
 	while (argc-- > 0) {
+	    dbv_t val;
 	    const byte *word = (const byte *) *argv++;
 	    word_t *token = word_new(word, strlen((const char *)word));
-	    uint32_t count = db_getvalue(dbh, token);
+	    db_getvalues(dbh, token, &val);
 	    word_puts(token, 0, stdout);
-	    printf(" %lu\n", (unsigned long) count);
+	    printf(" %lu %lu\n", (unsigned long) val.spamcount, (unsigned long) val.goodcount);
 	    word_free(token);
 	}
     }
@@ -376,8 +351,7 @@
 
 static int words_from_path(const char *dir, int argc, char **argv, bool show_probability)
 {
-    void *dbh_good;
-    void *dbh_spam;
+    void *dbh;
     char filepath[PATH_LEN];
     byte buf[BUFSIZE];
     buff_t *buff = buff_new(buf, 0, BUFSIZE);
@@ -389,28 +363,25 @@
     const char *data_format = !show_probability ? "%-20s %6lu %6lu\n" : "%-20s %6lu  %6lu  %f  %f\n";
 
     /* XXX FIXME: deadlock possible */
-    if (build_path(filepath, sizeof(filepath), dir, GOODFILE) < 0)
+    if (build_path(filepath, sizeof(filepath), dir, WORDLIST) < 0)
 	return 2;
 
-    if ((dbh_good = db_open(filepath, GOODFILE, DB_READ)) == NULL)
-	return 2;
-
-    if (build_path(filepath, sizeof(filepath), dir, SPAMFILE) < 0)
-	return 2;
-
-    if ((dbh_spam = db_open(filepath, SPAMFILE, DB_READ)) == NULL)
+    if ((dbh = db_open(filepath, WORDLIST, DB_READ)) == NULL)
 	return 2;
 
     if (show_probability)
     {
-	spam_msg_count = db_get_msgcount(dbh_spam);
-	good_msg_count = db_get_msgcount(dbh_good);
+	dbv_t val;
+	db_get_msgcounts(dbh, &val);
+	spam_msg_count = val.spamcount;
+	good_msg_count = val.goodcount;
     }
 
     printf(head_format, "", "spam", "good", "Gra prob", "Rob prob");
 
     while (argc >= 0)
     {
+	dbv_t val;
 	word_t *token;
 	double gra_prob = 0.0, rob_prob = 0.0;
 	
@@ -426,8 +397,9 @@
 	    token = word_new(word, strlen((const char *)word));
 	}
 
-	spam_count = db_getvalue(dbh_spam, token);
-	good_count = db_getvalue(dbh_good, token);
+	db_getvalues(dbh, token, &val);
+	spam_count = val.spamcount;
+	good_count = val.goodcount;
 
 	if (show_probability)
 	{
@@ -444,8 +416,7 @@
 	    word_free(token);
     }
 
-    db_close(dbh_good, false);
-    db_close(dbh_spam, false);
+    db_close(dbh, false);
 
     return 0;
 }
@@ -474,25 +445,29 @@
     return 0;
 }
 
-static double compute_robx(void *dbh_spam, void *dbh_good)
+static double compute_robx(void *dbh)
 {
     uint32_t tok_cnt = 0;
     double sum = 0.0;
     double robx;
 
+    dbv_t val;
     uint32_t msg_good, msg_spam;
     struct robhook_data rh;
 
-    msg_good = db_get_msgcount( dbh_good );
-    msg_spam = db_get_msgcount( dbh_spam );
+    db_get_msgcounts( dbh, &val );
+    msg_spam = val.spamcount;
+    msg_good = val.goodcount;
+
     rh.scalefactor = (double)msg_spam/msg_good;
-    rh.dbh_good = dbh_good;
-    rh.dbh_spam = dbh_spam;
+    rh.dbh = dbh;
     rh.sum = &sum;
     rh.count = &tok_cnt;
 
-    db_foreach(dbh_good, count_hook, &rh);
-    db_foreach(dbh_spam, robx_hook, &rh);
+/*
+    db_foreach(dbh, count_hook, &rh);
+*/
+    db_foreach(dbh, robx_hook,  &rh);
 
     robx = sum/tok_cnt;
     if (verbose)
@@ -504,44 +479,39 @@
 
 static int compute_robinson_x(char *path)
 {
-    wordlist_t wl[2];
+    wordlist_t wl;
 
     double robx;
     word_t *word_robx = word_new((const byte *)ROBX_W, strlen(ROBX_W));
 
-    void *dbh_spam;
+    dbv_t val;
+    void *dbh;
 
-    char db_spam_file[PATH_LEN];
-    char db_good_file[PATH_LEN];
+    char db_word_file[PATH_LEN];
 
-    if (build_path(db_spam_file, sizeof(db_spam_file), path, SPAMFILE) < 0 ||
-	build_path(db_good_file, sizeof(db_good_file), path, GOODFILE) < 0 )
+    if (build_path(db_word_file, sizeof(db_word_file), path, WORDLIST) < 0)
     {
 	fprintf(stderr, "%s: string too long creating .db file name.\n", PROGNAME);
 	exit(2);
     }
 
-    memset(wl, 0, sizeof(wl));
-
-    wl[0].next = &wl[1];
-    wl[0].filepath = db_good_file;
-    wl[0].filename = xstrdup("good");
+    memset(&wl, 0, sizeof(wl));
 
-    wl[1].next = NULL;
-    wl[1].filepath = db_spam_file;
-    wl[1].filename = xstrdup("spam");
+    wl.filepath = db_word_file;
+    wl.filename = xstrdup("word");
 
-    word_lists = wl;
+    word_lists = &wl;
     open_wordlists(DB_READ);
 
-    robx = compute_robx(wl[1].dbh, wl[0].dbh);
+    robx = compute_robx(wl.dbh);
     close_wordlists(false);
-    free(wl[0].filename);
-    free(wl[1].filename);
+    free(wl.filename);
 
-    dbh_spam = db_open(db_spam_file, "spam", DB_WRITE);
-    db_setvalue(dbh_spam, word_robx, (uint32_t) (robx * 1000000));
-    db_close(dbh_spam, false);
+    dbh = db_open(db_word_file, "word", DB_WRITE);
+    val.goodcount = 0;
+    val.spamcount = (uint32_t) (robx * 1000000);
+    db_setvalues(dbh, word_robx, &val);
+    db_close(dbh, false);
 
     word_free(word_robx);
 
Index: src/common.h
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/common.h,v
retrieving revision 1.9
diff -u -r1.9 common.h
--- src/common.h	16 May 2003 14:00:24 -0000	1.9
+++ src/common.h	31 May 2003 18:47:11 -0000
@@ -17,8 +17,11 @@
 /* length of token will not exceed this... */
 #define MAXTOKENLEN	30
 
+#define WORDLIST	"wordlist.db"
+/*
 #define GOODFILE	"goodlist.db"
 #define SPAMFILE	"spamlist.db"
+*/
 #define IGNOREFILE	"ignorelist.db"
 
 #define max(x, y)	(((x) > (y)) ? (x) : (y))
Index: src/datastore.h
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/datastore.h,v
retrieving revision 1.8
diff -u -r1.8 datastore.h
--- src/datastore.h	28 Mar 2003 15:15:58 -0000	1.8
+++ src/datastore.h	31 May 2003 18:47:11 -0000
@@ -21,7 +21,7 @@
 #include "wordlists.h"
 
 typedef struct {
-    uint32_t count;
+    uint32_t count[2];		/* spam and ham counts */
     uint32_t date;
 } dbv_t;
 
@@ -46,32 +46,32 @@
 /** Increments count for given word.  Note: negative results are set to
  * zero.
  */
-void db_increment(void *, const word_t *, uint32_t);
+void db_increment(void *, const word_t *, dbv_t *);
 
 /** Decrement count for a given word, if it exists in the datastore.
  * Note: negative results are set to zero. 
  */
-void db_decrement(void *, const word_t *, uint32_t);
+void db_decrement(void *, const word_t *, dbv_t *);
 
 /** Retrieve the value associated with a given word in a list. 
  * \return zero if the word does not exist in the database. 
  */
-uint32_t db_getvalue(void *, const word_t *);
+bool db_getvalues(void *, const word_t *, dbv_t *);
 
 /** Delete the key */
 void db_delete(void *, const word_t *);
 
 /** Set the value associated with a given word in a list */
-void db_setvalue(void *, const word_t *, uint32_t);
+void db_setvalues(void *, const word_t *, dbv_t *);
 
 /** Update the value associated with a given word in a list */
-void db_updvalue(void *vhandle, const word_t *word, const dbv_t *updval);
+void db_updvalues(void *vhandle, const word_t *word, const dbv_t *updval);
 
 /** Get the database message count */
-uint32_t db_get_msgcount(void*);
+void db_get_msgcounts(void*, dbv_t *);
 
 /** set the database message count */
-void db_set_msgcount(void*, uint32_t);
+void db_set_msgcounts(void*, dbv_t *);
 
 typedef int (*db_foreach_t)(word_t *w_key, word_t *w_value, void *userdata);
 /** Iterate over all elements in data base and call \p hook for each item.
Index: src/datastore_db.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/datastore_db.c,v
retrieving revision 1.21
diff -u -r1.21 datastore_db.c
--- src/datastore_db.c	10 Apr 2003 00:28:48 -0000	1.21
+++ src/datastore_db.c	31 May 2003 18:47:12 -0000
@@ -235,29 +235,29 @@
     0 if the word is not found.
     Notes: Will call exit if an error occurs.
 */
-uint32_t db_getvalue(void *vhandle, const word_t *word){
-  dbv_t val;
+bool db_getvalues(void *vhandle, const word_t *word, dbv_t *val){
   int ret;
-  uint32_t value = 0;
   dbh_t *handle = vhandle;
 
-  ret = db_get_dbvalue(vhandle, word, &val);
+  ret = db_get_dbvalue(vhandle, word, val);
 
   if (ret == 0) {
-    value = val.count;
-
     if (DEBUG_DATABASE(3)) {
-      fprintf(dbgout, "[%lu] db_getvalue (%s): [",
+      fprintf(dbgout, "[%lu] db_getvalues (%s): [",
 	      (unsigned long) handle->pid, handle->name);
       word_puts(word, 0, dbgout);
-      fprintf(dbgout, "] has value %lu\n",
-	      (unsigned long)value);
+      fprintf(dbgout, "] has values %lu,%lu\n",
+	      (unsigned long)val->spamcount,
+	      (unsigned long)val->goodcount);
     }
-    if ((int32_t)value < (int32_t)0)
-	value = 0;
-    return value;
+    if ((int32_t)val->spamcount < (int32_t)0)
+	val->spamcount = 0;
+    if ((int32_t)val->goodcount < (int32_t)0)
+	val->goodcount = 0;
+    return true;
   } else {
-    return 0;
+    memset(val, 0, sizeof(*val));
+    return false;
   }
 }
 
@@ -282,11 +282,11 @@
   int ret;
   DBT db_key;
   DBT db_data;
-  uint32_t cv[2] = { 0l, 0l };
+  uint32_t cv[3] = { 0l, 0l, 0l };
 
   dbh_t *handle = vhandle;
 
-  db_enforce_locking(handle, "db_getvalue");
+  db_enforce_locking(handle, "db_get_dbvalue");
 
   DBT_init(db_key);
   DBT_init(db_data);
@@ -307,22 +307,24 @@
       memcpy(cv, db_data.data, db_data.size);
       */
     if (!handle->is_swapped){		/* convert from struct to array */
-      val->count = cv[0];
-      val->date  = cv[1];
+      val->spamcount = cv[0];
+      val->goodcount = cv[1];
+      val->date  = cv[2];
     } else {
-      val->count = swap_32bit(cv[0]);
-      val->date  = swap_32bit(cv[1]);
+      val->spamcount = swap_32bit(cv[0]);
+      val->goodcount = swap_32bit(cv[1]);
+      val->date  = swap_32bit(cv[2]);
     }
     break;
   case DB_NOTFOUND:
     if (DEBUG_DATABASE(3)) {
-      fprintf(dbgout, "[%lu] db_getvalue (%s): [", (unsigned long) handle->pid, handle->name);
+      fprintf(dbgout, "[%lu] db_get_dbvalue (%s): [", (unsigned long) handle->pid, handle->name);
       word_puts(word, 0, dbgout);
       fputs("] not found\n", dbgout);
     }
     break;
   default:
-    print_error(__FILE__, __LINE__, "(db) db_getvalue( '%s' ), err: %d, %s", word->text, ret, db_strerror(ret));
+    print_error(__FILE__, __LINE__, "(db) db_get_dbvalue( '%s' ), err: %d, %s", word->text, ret, db_strerror(ret));
     exit(2);
   }
   return ret;
@@ -333,11 +335,9 @@
 Store VALUE in database, using WORD as database key
 Notes: Calls exit if an error occurs.
 */
-void db_setvalue(void *vhandle, const word_t *word, uint32_t count){
-  dbv_t val;
-  val.count = count;
-  val.date  = today;		/* date in form YYYYMMDD */
-  db_set_dbvalue(vhandle, word, &val);
+void db_setvalues(void *vhandle, const word_t *word, dbv_t *val){
+  val->date = today;		/* date in form YYYYMMDD */
+  db_set_dbvalue(vhandle, word, val);
 }
 
 
@@ -346,16 +346,18 @@
 Adds COUNT to existing count.
 Sets date to newer of TODAY and date in database.
 */
-void db_updvalue(void *vhandle, const word_t *word, const dbv_t *updval){
+void db_updvalues(void *vhandle, const word_t *word, const dbv_t *updval){
   dbv_t val;
   int ret = db_get_dbvalue(vhandle, word, &val);
   if (ret != 0) {
-      val.count = updval->count;
-      val.date  = updval->date;		/* date in form YYYYMMDD */
+      val.spamcount = updval->spamcount;
+      val.goodcount = updval->goodcount;
+      val.date      = updval->date;			/* date in form YYYYMMDD */
   }
   else {
-      val.count += updval->count;
-      val.date  = max(val.date, updval->date);	/* date in form YYYYMMDD */
+      val.spamcount += updval->spamcount;
+      val.goodcount += updval->goodcount;
+      val.date       = max(val.date, updval->date);	/* date in form YYYYMMDD */
   }
   db_set_dbvalue(vhandle, word, &val);
 }
@@ -365,7 +367,7 @@
   int ret;
   DBT db_key;
   DBT db_data;
-  uint32_t cv[2];
+  uint32_t cv[3];
   dbh_t *handle = vhandle;
 
   db_enforce_locking(handle, "db_set_dbvalue");
@@ -377,16 +379,18 @@
   db_key.size = word->leng;
 
   if (!handle->is_swapped){		/* convert from struct to array */
-      cv[0] = val->count;
-      cv[1] = val->date;
+      cv[0] = val->spamcount;
+      cv[1] = val->goodcount;
+      cv[2] = val->date;
   } else {
-      cv[0] = swap_32bit(val->count);
-      cv[1] = swap_32bit(val->date);
+      cv[0] = swap_32bit(val->spamcount);
+      cv[1] = swap_32bit(val->goodcount);
+      cv[2] = swap_32bit(val->date);
   }
 
   db_data.data = &cv;			/* and save array in wordlist */
   if (!datestamp_tokens || val->date == 0)
-      db_data.size = db_data.ulen = sizeof(cv[0]);
+      db_data.size = db_data.ulen = 2 * sizeof(cv[0]);
   else
       db_data.size = db_data.ulen = sizeof(cv);
 
@@ -397,8 +401,9 @@
       fprintf(dbgout, "db_set_dbvalue (%s): [",
 	      handle->name);
       word_puts(word, 0, dbgout);
-      fprintf(dbgout, "] has value %lu\n",
-	      (unsigned long)val->count);
+      fprintf(dbgout, "] has values %lu,%lu\n",
+	      (unsigned long)val->spamcount,
+	      (unsigned long)val->goodcount);
     }
   }
   else {
@@ -411,51 +416,65 @@
 /*
   Increment count associated with WORD, by VALUE.
  */
-void db_increment(void *vhandle, const word_t *word, uint32_t value){
-    uint32_t dv = db_getvalue(vhandle, word);
-    value = UINT32_MAX - dv < value ? UINT32_MAX : dv + value;
-    db_setvalue(vhandle, word, value);
+void db_increment(void *vhandle, const word_t *word, dbv_t *val){
+    dbv_t cur;
+
+    db_getvalues(vhandle, word, &cur);
+
+    cur.spamcount = UINT32_MAX - cur.spamcount < val->spamcount ? UINT32_MAX : cur.spamcount + val->spamcount;
+    cur.goodcount = UINT32_MAX - cur.goodcount < val->goodcount ? UINT32_MAX : cur.goodcount + val->goodcount;
+
+    db_setvalues(vhandle, word, &cur);
+
+    return;
 }
 
 /*
   Decrement count associated with WORD by VALUE,
   if WORD exists in the database.
 */
-void db_decrement(void *vhandle, const word_t *word, uint32_t value){
-    uint32_t dv = db_getvalue(vhandle, word);
-    value = dv < value ? 0 : dv - value;
-    db_setvalue(vhandle, word, value);
+void db_decrement(void *vhandle, const word_t *word, dbv_t *val){
+    dbv_t cur;
+
+    db_getvalues(vhandle, word, &cur);
+
+    cur.spamcount = cur.spamcount < val->spamcount ? 0 : cur.spamcount - val->spamcount;
+    cur.goodcount = cur.goodcount < val->goodcount ? 0 : cur.goodcount - val->goodcount;
+    db_setvalues(vhandle, word, &cur);
+
+    return;
 }
 
 /*
   Get the number of messages associated with database.
 */
-uint32_t db_get_msgcount(void *vhandle){
-    uint32_t msg_count;
-
+void db_get_msgcounts(void *vhandle, dbv_t *val){
     if (msg_count_tok == NULL)
 	msg_count_tok = word_new(MSG_COUNT_TOK, strlen((const char *)MSG_COUNT_TOK));
-    msg_count = db_getvalue(vhandle, msg_count_tok);
+
+    db_getvalues(vhandle, msg_count_tok, val);
 
     if (DEBUG_DATABASE(2)) {
 	dbh_t *handle = vhandle;
-	fprintf(dbgout, "db_get_msgcount( %s ) -> %lu\n", handle->name,
-		(unsigned long)msg_count);
+	fprintf(dbgout, "db_get_msgcounts( %s ) ->  %lu,%lu\n", handle->name,
+	      (unsigned long)val->spamcount,
+	      (unsigned long)val->goodcount);
     }
 
-    return msg_count;
+    return;
 }
 
 /*
  Set the number of messages associated with database.
 */
-void db_set_msgcount(void *vhandle, uint32_t count){
-    db_setvalue(vhandle, msg_count_tok, count);
+void db_set_msgcounts(void *vhandle, dbv_t *val){
+    db_setvalues(vhandle, msg_count_tok, val);
 
     if (DEBUG_DATABASE(2)) {
 	dbh_t *handle = vhandle;
-	fprintf(dbgout, "db_set_msgcount( %s ) -> %lu\n", handle->name,
-		(unsigned long)count);
+	fprintf(dbgout, "db_set_msgcounts( %s ) ->  %lu,%lu\n", handle->name,
+	      (unsigned long)val->spamcount,
+	      (unsigned long)val->goodcount);
     }
 }
 
Index: src/graham.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/graham.c,v
retrieving revision 1.12
diff -u -r1.12 graham.c
--- src/graham.c	18 Apr 2003 17:50:48 -0000	1.12
+++ src/graham.c	31 May 2003 18:47:12 -0000
@@ -207,7 +207,6 @@
 {
     wordlist_t* list;
     int override=0;
-    long count;
     double prob;
     int totalcount=0;
 
@@ -217,21 +216,26 @@
 
     for (list=word_lists; list != NULL ; list=list->next)
     {
+	int   i;
+	dbv_t val;
 	if (override > list->override)
 	    break;
-	count=db_getvalue(list->dbh, token);
+	db_getvalues(list->dbh, token, &val);
+	if (val.count[0] == 0 && val.count[1] == 0)
+	    continue;
 
-	if (count) {
-	    if (list->ignore)
-		return EVEN_ODDS;
-	    totalcount+=count*list->weight;
-	    override=list->override;
-	    prob = (double)count;
-	    prob /= list->msgcount;
-	    prob *= list->weight;
+	if (list->ignore)
+	    return EVEN_ODDS;
+	override=list->override;
+
+	for (i=0; i<2; i++) {
+	    totalcount+=val.count[i]*list->weight[i];
+	    prob = (double)val.count[i];
+	    prob /= list->msgcount[i];
+	    prob *= list->weight[i];
 	    prob = min(1.0, prob);
 	    
-	    wordprob_add(&wordstats, prob, list->bad);
+	    wordprob_add(&wordstats, prob, list->bad[i]);
 	}
     }
 
Index: src/maint.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/maint.c,v
retrieving revision 1.18
diff -u -r1.18 maint.c
--- src/maint.c	30 Mar 2003 14:12:20 -0000	1.18
+++ src/maint.c	31 May 2003 18:47:12 -0000
@@ -162,7 +162,8 @@
 
     memcpy(&val, data->text, data->leng);
 
-    if (!keep_count(val.count) || !keep_date(val.date) || !keep_size(key->leng)) {
+    if ((!keep_count(val.spamcount) && !keep_count(val.goodcount)) || 
+	!keep_date(val.date) || !keep_size(key->leng)) {
 	db_delete(userdata, key);
 	if (DEBUG_DATABASE(0)) {
 	    fputs("deleting ", dbgout);
@@ -181,7 +182,7 @@
 		db_delete(userdata, key);
 		w.text = tmp;
 		w.leng = key->leng;
-		db_updvalue(userdata, &w, &val);
+		db_updvalues(userdata, &w, &val);
 	    }
 	    xfree(tmp);
 	}
Index: src/msgcounts.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/msgcounts.c,v
retrieving revision 1.2
diff -u -r1.2 msgcounts.c
--- src/msgcounts.c	22 Apr 2003 13:49:59 -0000	1.2
+++ src/msgcounts.c	31 May 2003 18:47:12 -0000
@@ -37,10 +37,8 @@
 
     for(list=word_lists; list != NULL; list=list->next)
     {
-	if (list->bad)
-	    msgs_bad += list->msgcount;
-	else
-	    msgs_good += list->msgcount;
+	msgs_bad  += list->msgcount[SPAM];
+	msgs_good += list->msgcount[GOOD];
     }
 }
 
Index: src/register.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/register.c,v
retrieving revision 1.13
diff -u -r1.13 register.c
--- src/register.c	16 May 2003 03:34:00 -0000	1.13
+++ src/register.c	31 May 2003 18:47:12 -0000
@@ -33,8 +33,7 @@
   int wordcount = h->count;	/* use number of unique tokens */
 
   wordlist_t *list;
-  wordlist_t *incr_list = NULL;
-  wordlist_t *decr_list = NULL;
+  int incr = -1, decr = -1;
 
   /* If update directory explicity supplied, setup the wordlists. */
   if (update_dir) {
@@ -42,10 +41,10 @@
 	  exit(2);
   }
 
-  if (_run_type & REG_SPAM) r = "s";
-  if (_run_type & REG_GOOD) r = "n";
-  if (_run_type & UNREG_SPAM) u = "S";
-  if (_run_type & UNREG_GOOD) u = "N";
+  if (_run_type & REG_SPAM)	{ r = "s"; incr = SPAM; }
+  if (_run_type & REG_GOOD)	{ r = "n"; incr = GOOD; }
+  if (_run_type & UNREG_SPAM)	{ u = "S"; decr = SPAM; }
+  if (_run_type & UNREG_GOOD)	{ u = "N"; decr = GOOD; }
 
   if (wordcount == 0)
       msgcount = 0;
@@ -56,51 +55,57 @@
     (void)fprintf(dbgout, "# %d word%s, %d message%s\n", 
 		  wordcount, PLURAL(wordcount), msgcount, PLURAL(msgcount));
 
+/*
   set_list_active_status(false);
+*/
 
-  if (_run_type & REG_GOOD) incr_list = good_list;
-  if (_run_type & REG_SPAM) incr_list = spam_list;
-  if (_run_type & UNREG_GOOD) decr_list = good_list;
-  if (_run_type & UNREG_SPAM) decr_list = spam_list;
-
-  if (DEBUG_REGISTER(2))
-      fprintf(dbgout, "%s%s -- incr: %08lX, decr: %08lX\n", r, u,
-	      (unsigned long)incr_list, (unsigned long)decr_list);
-
-  if (incr_list)
-    incr_list->active = true;
-  if (decr_list)
-    decr_list->active = true;
+  for (node = wordhash_first(h); node != NULL; node = wordhash_next(h)){
+      wordprop = node->buf;
+      if (incr >= 0) {
+	  dbv_t val;
+	  val.goodcount = val.spamcount = val.date = 0;
+	  val.count[incr] = wordprop->freq;
+	  db_increment(word_list->dbh, node->key, &val);
+      }
+      if (decr >= 0) {
+	  dbv_t val;
+	  val.goodcount = val.spamcount = val.date = 0;
+	  val.count[decr] = wordprop->freq;
+	  db_decrement(word_list->dbh, node->key, &val);
+      }
+  }
 
   for (list = word_lists; list != NULL; list = list->next){
-    if (list->active) {
-      list->msgcount = db_get_msgcount(list->dbh);
-    }
-  }
+      dbv_t val;
 
-  if (incr_list) incr_list->msgcount += msgcount;
+/*
+      if (!list->active)
+	  continue;
+*/
+
+      db_get_msgcounts(list->dbh, &val);
+      list->msgcount[SPAM] = val.spamcount;
+      list->msgcount[GOOD] = val.goodcount;
+
+      if (incr >= 0)
+	  list->msgcount[incr] += msgcount;
+      
+      if (decr >= 0) {
+	  if (list->msgcount[decr] > msgcount)
+	      list->msgcount[decr] -= msgcount;
+	  else
+	      list->msgcount[decr] = 0;
+      }
 
-  if (decr_list) {
-    if (decr_list->msgcount > msgcount)
-      decr_list->msgcount -= msgcount;
-    else
-      decr_list->msgcount = 0;
-  }
+      val.spamcount = list->msgcount[SPAM];
+      val.goodcount = list->msgcount[GOOD];
 
-  for (node = wordhash_first(h); node != NULL; node = wordhash_next(h)){
-    wordprop = node->buf;
-    if (incr_list) db_increment(incr_list->dbh, node->key, wordprop->freq);
-    if (decr_list) db_decrement(decr_list->dbh, node->key, wordprop->freq);
-  }
+      db_set_msgcounts(list->dbh, &val);
 
-  for (list = word_lists; list != NULL; list = list->next){
-    if (list->active) {
-      db_set_msgcount(list->dbh, list->msgcount);
       db_flush(list->dbh);
       if (verbose>1)
-	(void)fprintf(stderr, "bogofilter: %ld messages on the %s list\n",
-		      list->msgcount, list->filename);
-    }
+	  (void)fprintf(stderr, "bogofilter: list %s - %ld spam, %ld good\n",
+			list->filename, list->msgcount[SPAM], list->msgcount[GOOD]);
   }
 }
 
Index: src/robinson.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/robinson.c,v
retrieving revision 1.25
diff -u -r1.25 robinson.c
--- src/robinson.c	10 May 2003 12:07:04 -0000	1.25
+++ src/robinson.c	31 May 2003 18:47:12 -0000
@@ -115,28 +115,37 @@
 static double compute_probability(const word_t *token, wordprop_t *wordstats)
 {
     int override=0;
-    long count;
     wordlist_t* list;
 
     if (wordstats->bad == 0 && wordstats->good == 0)
     for (list=word_lists; list != NULL ; list=list->next)
     {
+	size_t i;
+	dbv_t val;
 	if (override > list->override)
 	    break;
-	count=db_getvalue(list->dbh, token);
+	db_getvalues(list->dbh, token, &val);
 
-	/* Protect against negatives */
-	if (count < 0) {
-	    count = 0;
-	    db_setvalue(list->dbh, token, count);
-	}
+	if (val.count[0] == 0 && val.count[1] == 0)
+	    continue;
+	if (list->ignore)
+	    return EVEN_ODDS;
+	override=list->override;
+
+	for (i=0; i<COUNTOF(val.count); i++) {
+	    /* Protect against negatives */
+	    if ((int) val.count[i] < 0) {
+		val.count[i] = 0;
+		db_setvalues(list->dbh, token, &val);
+	    }
 
-	if (count) {
+	    if (val.count[i] == 0)
+		continue;
 	    if (list->ignore)
 		return EVEN_ODDS;
 	    override=list->override;
 
-	    wordprob_add(wordstats, count, list->bad);
+	    wordprob_add(wordstats, val.count[i], list->bad[i]);
 	    if (DEBUG_ROBINSON(1)) {
 		fprintf(dbgout, "%2d %2d \n", (int) wordstats->good, (int) wordstats->bad);
 		word_puts(token, 0, dbgout);
@@ -297,8 +306,12 @@
 
     if (fabs(robx) < EPS)
     {
+	dbv_t val;
+	long l_robx;
+
 	/* Note: .ROBX is scaled by 1000000 in the wordlist */
-	long l_robx = db_getvalue(spam_list->dbh, word_robx);
+	db_getvalues(word_list->dbh, word_robx, &val);
+	l_robx = val.count[SPAM];
 
 	/* If found, unscale; else use predefined value */
 	robx = l_robx ? (double)l_robx / 1000000 : ROBX;
Index: src/wordlists.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/wordlists.c,v
retrieving revision 1.12
diff -u -r1.12 wordlists.c
--- src/wordlists.c	22 Apr 2003 17:47:49 -0000	1.12
+++ src/wordlists.c	31 May 2003 18:47:12 -0000
@@ -27,8 +27,7 @@
 #define	MIN_SLEEP	0.5e+3		/* .5 milliseconds */
 #define	MAX_SLEEP	2.0e+6		/* 2 seconds */
 
-wordlist_t *good_list;
-wordlist_t *spam_list;
+wordlist_t *word_list;
 /*@null@*/ wordlist_t* word_lists=NULL;
 
 /* Function Prototypes */
@@ -56,7 +55,9 @@
 
 /* returns -1 for error, 0 for success */
 static int init_wordlist(/*@out@*/ wordlist_t **list, const char* name, const char* path,
-			 double weight, bool bad, int override, bool ignore)
+			 double sweight, bool sbad, 
+			 double gweight, bool gbad, 
+			 int override, bool ignore)
 {
     wordlist_t *new = (wordlist_t *)xmalloc(sizeof(*new));
     wordlist_t *list_ptr;
@@ -73,8 +74,10 @@
 #endif
     new->override=override;
     new->active=false;
-    new->weight=weight;
-    new->bad=bad;
+    new->weight[SPAM]=sweight;
+    new->weight[GOOD]=gweight;
+    new->bad[SPAM]=sbad;
+    new->bad[GOOD]=gbad;
     new->ignore=ignore;
 
     if (! word_lists) {
@@ -155,12 +158,8 @@
 	rc = -1;
     }
 
-    if ((build_path(filepath, sizeof(filepath), dir, GOODFILE) < 0) ||
-	init_wordlist(&good_list, "good", filepath, good_weight, false, 0, false) != 0)
-	rc = -1;
-
-    if ((build_path(filepath, sizeof(filepath), dir, SPAMFILE) < 0) ||
-	init_wordlist(&spam_list, "spam", filepath, bad_weight, true, 0, false) != 0)
+    if ((build_path(filepath, sizeof(filepath), dir, WORDLIST) < 0) ||
+	init_wordlist(&word_list, "word", filepath, bad_weight, true, good_weight, false, 0, false) != 0)
 	rc = -1;
 
     xfree(dir);
@@ -193,7 +192,10 @@
 			exit(2);
 		} /* switch */
 	    } else { /* db_open */
-		list->msgcount = db_get_msgcount(list->dbh);
+		dbv_t val;
+		db_get_msgcounts(list->dbh, &val);
+		list->msgcount[GOOD] = val.goodcount;
+		list->msgcount[SPAM] = val.spamcount;
 	    } /* db_open */
 	} /* for */
     } while(retry);
@@ -231,8 +233,7 @@
 
     for ( list = word_lists; list != NULL; list = list->next )
     {
-	if ( ! list->bad )
-	    list->weight = weight;
+	list->weight[GOOD] = weight;
     }
 
     return;
@@ -307,7 +308,7 @@
     char* type;
     char* name;
     char* path;
-    double weight = 0.0;
+    double sweight = 0.0, gweight = 0.0;
     bool bad = false;
     bool override = false;
     bool ignore = false;
@@ -336,7 +337,10 @@
     path=tildeexpand(tmp);	/* path to wordlist */
     tmp = spanword(tmp);
 
-    weight=atof(tmp);
+    sweight=atof(tmp);
+    tmp = spanword(tmp);
+
+    gweight=atof(tmp);
     tmp = spanword(tmp);
 
     override=atoi(tmp);
@@ -359,7 +363,7 @@
     }
     tmp = spanword(tmp);
 
-    rc = init_wordlist(&list, name, path, weight, bad, override, ignore);
+    rc = init_wordlist(&list, name, path, sweight, bad, gweight, false, override, ignore);
     ok = rc == 0;
 
     xfree(path);
Index: src/wordlists.h
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/wordlists.h,v
retrieving revision 1.2
diff -u -r1.2 wordlists.h
--- src/wordlists.h	16 Mar 2003 16:24:03 -0000	1.2
+++ src/wordlists.h	31 May 2003 18:47:12 -0000
@@ -7,6 +7,11 @@
 
 #include "system.h"
 
+typedef enum sh_e { SPAM, GOOD } sh_t;
+
+#define	spamcount count[SPAM]
+#define	goodcount count[GOOD]
+
 typedef struct wordlist_s wordlist_t;
 struct wordlist_s
 {
@@ -15,16 +20,17 @@
     /*@owned@*/ char *filename;	/* resource name (for debug/verbose messages) */
     /*@owned@*/ char *filepath;	/* resource path (for debug/verbose messages) */
     /*@owned@*/ void *dbh;	/* database handle */
-    long msgcount;		/* count of messages in wordlist. */
-    double weight;
+    long msgcount[2];		/* count of messages in wordlist. */
+    double weight[2];
     bool active;
-    bool bad;
+    bool bad[2];
     int  override;
     bool ignore;
 };
 
-/*@null@*/ extern wordlist_t *word_lists;
-extern wordlist_t *good_list, *spam_list;
+/*@null@*/ 
+extern wordlist_t *word_list;
+extern wordlist_t *word_lists;
 
 int setup_wordlists(const char* dir, priority_t precedence);
 bool configure_wordlist(const char *val);
Index: src/tests/bogofilter/t.bogodir
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogofilter/t.bogodir,v
retrieving revision 1.10
diff -u -r1.10 t.bogodir
--- src/tests/bogofilter/t.bogodir	10 May 2003 01:12:49 -0000	1.10
+++ src/tests/bogofilter/t.bogodir	31 May 2003 18:47:12 -0000
@@ -34,9 +34,9 @@
     (
 	set +e
 	echo >> $LOG "$@"
-	echo >> $LOG "### expect: $expect/goodlist.db"
+	echo >> $LOG "### expect: $expect/wordlist.db"
 	result=`eval "$@" 2>&1 | tee -a $LOG | grep open | head -1`
-	ok=`echo "$result" | grep "$expect/goodlist.db"`
+	ok=`echo "$result" | grep "$expect/wordlist.db"`
 	if [ -n "$ok" ] ; then
 	    echo >> $LOG "### ok: $ok"
 	    echo >> $LOG "PASS"
Index: src/tests/bogofilter/t.bulkmode
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogofilter/t.bulkmode,v
retrieving revision 1.19
diff -u -r1.19 t.bulkmode
--- src/tests/bogofilter/t.bulkmode	19 May 2003 22:05:24 -0000	1.19
+++ src/tests/bogofilter/t.bulkmode	31 May 2003 18:47:12 -0000
@@ -52,8 +52,7 @@
 mkdir -p $BOGOFILTER_DIR
 $BOGOFILTER -$mth $CFG -s < ${SYSTEST}/inputs/spam.mbx
 $BOGOFILTER -$mth $CFG -n < ${SYSTEST}/inputs/good.mbx
-#$BOGOUTIL -d $BOGOFILTER_DIR/spamlist.db > $BOGOFILTER_DIR/spamlist.txt
-#$BOGOUTIL -d $BOGOFILTER_DIR/goodlist.db > $BOGOFILTER_DIR/goodlist.txt
+#$BOGOUTIL -d $BOGOFILTER_DIR/wordlist.db > $BOGOFILTER_DIR/wordlist.txt
 [ $verbose -gt 0 ] && $BOGOUTIL -w $BOGOFILTER_DIR .MSG_COUNT
 
 BOGOFILTER_DIR="${TMPDIR}"
Index: src/tests/bogofilter/t.grftest
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogofilter/t.grftest,v
retrieving revision 1.2
diff -u -r1.2 t.grftest
--- src/tests/bogofilter/t.grftest	19 May 2003 22:05:24 -0000	1.2
+++ src/tests/bogofilter/t.grftest	31 May 2003 18:47:12 -0000
@@ -90,8 +90,7 @@
     mkdir -p $BOGOFILTER_DIR
     $BOGOFILTER -$alg $CFG -s < ${SYSTEST}/inputs/spam.mbx
     $BOGOFILTER -$alg $CFG -n < ${SYSTEST}/inputs/good.mbx
-    $BOGOUTIL -d $BOGOFILTER_DIR/spamlist.db > $BOGOFILTER_DIR/spamlist.txt
-    $BOGOUTIL -d $BOGOFILTER_DIR/goodlist.db > $BOGOFILTER_DIR/goodlist.txt
+    $BOGOUTIL -d $BOGOFILTER_DIR/wordlist.db > $BOGOFILTER_DIR/wordlist.txt
     [ $verbose -gt 0 ] && $BOGOUTIL -w $BOGOFILTER_DIR .MSG_COUNT
     #
     # run tests for msg.[1-8].txt
Index: src/tests/bogofilter/t.regtest
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogofilter/t.regtest,v
retrieving revision 1.4
diff -u -r1.4 t.regtest
--- src/tests/bogofilter/t.regtest	20 May 2003 15:18:03 -0000	1.4
+++ src/tests/bogofilter/t.regtest	31 May 2003 18:47:12 -0000
@@ -41,8 +41,9 @@
     if  [ $verbose -ne 0 ]; then
 	echo "test #$T"
     fi
-    g=`$BOGOUTIL -d $BOGOFILTER_DIR/goodlist.db | grep -v " 0 " | tee $TMPDIR/good.$T.out | wc -l`
-    s=`$BOGOUTIL -d $BOGOFILTER_DIR/spamlist.db | grep -v " 0 " | tee $TMPDIR/spam.$T.out | wc -l`
+    $BOGOUTIL -d $BOGOFILTER_DIR/wordlist.db > $TMPDIR/word.$T.out
+    g=`grep -v " 0$" < $TMPDIR/word.$T.out | tee $TMPDIR/good.$T.out | wc -l`
+    s=`grep -v " 0 " < $TMPDIR/word.$T.out | tee $TMPDIR/spam.$T.out | wc -l`
     WANT=`printf "%d.%d" $S $G`
     HAVE=`printf "%d.%d" $s $g`
 
Index: src/tests/bogofilter/t.robx
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogofilter/t.robx,v
retrieving revision 1.8
diff -u -r1.8 t.robx
--- src/tests/bogofilter/t.robx	20 May 2003 15:20:19 -0000	1.8
+++ src/tests/bogofilter/t.robx	31 May 2003 18:47:12 -0000
@@ -44,9 +44,7 @@
 if [ ! -z "$RUN_FROM_MAKE" ] ; then
     $BOGOUTIL -R $TMPDIR
 else
-    for w in spamlist goodlist ; do
-	$BOGOUTIL -d $TMPDIR/$w.db > ${TMPDIR}/$w.txt
-    done
+    $BOGOUTIL -d $TMPDIR/wordlist.db > ${TMPDIR}/wordlist.txt
     $BOGOUTIL -vvv -R $TMPDIR > ${TMPDIR}/output.vvv
 fi
 
Index: src/tests/bogofilter/t.robx.sh
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogofilter/t.robx.sh,v
retrieving revision 1.1
diff -u -r1.1 t.robx.sh
--- src/tests/bogofilter/t.robx.sh	3 Feb 2003 17:03:17 -0000	1.1
+++ src/tests/bogofilter/t.robx.sh	31 May 2003 18:47:12 -0000
@@ -44,9 +44,7 @@
 if [ ! -z "$RUN_FROM_MAKE" ] ; then
     $BOGOUTIL -R $TMPDIR
 else
-    for w in spamlist goodlist ; do
-	$BOGOUTIL -d $TMPDIR/$w.db > ${TMPDIR}/$w.txt
-    done
+    $BOGOUTIL -d $TMPDIR/wordlist.db > ${TMPDIR}/wordlist.txt
     $BOGOUTIL -vvv -R $TMPDIR > ${TMPDIR}/output.vvv
 fi
 
Index: src/tests/bogofilter/t.systest
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogofilter/t.systest,v
retrieving revision 1.5
diff -u -r1.5 t.systest
--- src/tests/bogofilter/t.systest	19 May 2003 22:05:25 -0000	1.5
+++ src/tests/bogofilter/t.systest	31 May 2003 18:47:12 -0000
@@ -71,8 +71,7 @@
     mkdir $BOGOFILTER_DIR
     $BOGOFILTER -$alg -s < ${SYSTEST}/inputs/spam.mbx
     $BOGOFILTER -$alg -n < ${SYSTEST}/inputs/good.mbx
-    $BOGOUTIL -d $BOGOFILTER_DIR/spamlist.db > $BOGOFILTER_DIR/spamlist.txt
-    $BOGOUTIL -d $BOGOFILTER_DIR/goodlist.db > $BOGOFILTER_DIR/goodlist.txt
+    $BOGOUTIL -d $BOGOFILTER_DIR/wordlist.db > $BOGOFILTER_DIR/wordlist.txt
     if  [ $verbose -ne 0 ]; then
 	ls -l $BOGOFILTER_DIR/*list.txt
     fi
Index: src/tests/bogofilter/inputs/split.d/msg.gs.0119.text
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogofilter/inputs/split.d/msg.gs.0119.text,v
retrieving revision 1.2
diff -u -r1.2 msg.gs.0119.text
--- src/tests/bogofilter/inputs/split.d/msg.gs.0119.text	1 Mar 2003 00:30:17 -0000	1.2
+++ src/tests/bogofilter/inputs/split.d/msg.gs.0119.text	31 May 2003 18:47:12 -0000
@@ -7,12 +7,12 @@
 
 <html>
 <body>
-Re<!--exdcl-->p<!--exdcl-->l<!--exdcl-->y w<!--
-exdcl-->i<!--exdcl-->th o<!--exdcl-->f<!--
-exdcl-->f a<!--exdcl-->n<!--exdcl-->d I w<!--
-exdcl-->o<!--exdcl-->n't wr<!--exdcl-->i<!--
-exdcl-->te y<!--exdcl-->o<!--exdcl-->u ag<!--
-exdcl-->a<!--exdcl-->in.
+Re<!--exdcl01-->p<!--exdcl02-->l<!--exdcl03-->y w<!--
+exdcl04-->i<!--exdcl05-->th o<!--exdcl06-->f<!--
+exdcl07-->f a<!--exdcl08-->n<!--exdcl09-->d I w<!--
+exdcl10-->o<!--exdcl11-->n't wr<!--exdcl12-->i<!--
+exdcl13-->te y<!--exdcl14-->o<!--exdcl15-->u ag<!--
+exdcl16-->a<!--exdcl17-->in.
 </body>
 </html>
 
Index: src/tests/bogoutil/Makefile.am
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/Makefile.am,v
retrieving revision 1.2
diff -u -r1.2 Makefile.am
--- src/tests/bogoutil/Makefile.am	8 Mar 2003 00:11:12 -0000	1.2
+++ src/tests/bogoutil/Makefile.am	31 May 2003 18:47:12 -0000
@@ -1,6 +1,8 @@
 # $Id: Makefile.am,v 1.2 2003/03/08 00:11:12 relson Exp $
 
-TESTSCRIPTS = driver.sh t.dump.load t.nonascii.replace
+DRIVER = # driver.sh
+
+TESTSCRIPTS = $(DRIVER) t.dump.load t.nonascii.replace
 TESTS=$(TESTSCRIPTS)
 
 TESTS_ENVIRONMENT = RUN_FROM_MAKE=1 srcdir=$(srcdir) $(SHELL) $(VERBOSE)
Index: src/tests/bogoutil/input-1.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/input-1.txt,v
retrieving revision 1.1
diff -u -r1.1 input-1.txt
--- src/tests/bogoutil/input-1.txt	3 Feb 2003 17:01:20 -0000	1.1
+++ src/tests/bogoutil/input-1.txt	31 May 2003 18:47:12 -0000
@@ -1,7 +1,7 @@
 # bogofilter wordlist (format version A): 1
-Habanero 1
-Bell	 22
-Marbles  333
-Tabasco  4444
-Serrano  55555
-Pimento  666666
+Habanero 1 0
+Bell	 22 0
+Marbles  333 0
+Tabasco  4444 0
+Serrano  55555 0
+Pimento  666666 0
Index: src/tests/bogoutil/input-2-data.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/input-2-data.txt,v
retrieving revision 1.1
diff -u -r1.1 input-2-data.txt
--- src/tests/bogoutil/input-2-data.txt	3 Feb 2003 17:01:20 -0000	1.1
+++ src/tests/bogoutil/input-2-data.txt	31 May 2003 18:47:12 -0000
@@ -1,6 +1,6 @@
-Bell 22
-Habanero 1
-Marbles 333
-Pimento 666666
-Serrano 55555
-Tabasco 4444
+Bell 22 0
+Habanero 1 0
+Marbles 333 0
+Pimento 666666 0
+Serrano 55555 0
+Tabasco 4444 0
Index: src/tests/bogoutil/input-3-data.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/input-3-data.txt,v
retrieving revision 1.1
diff -u -r1.1 input-3-data.txt
--- src/tests/bogoutil/input-3-data.txt	3 Feb 2003 17:01:21 -0000	1.1
+++ src/tests/bogoutil/input-3-data.txt	31 May 2003 18:47:12 -0000
@@ -1,7 +1,7 @@
-Bell 22
-Habanero 1
-Marbles 333
-Pimento 666666
-Serrano 55555
-Tabasco 4444
-.count  4
+Bell 22 0
+Habanero 1 0
+Marbles 333 0
+Pimento 666666 0
+Serrano 55555 0
+Tabasco 4444 0
+.count  4 0
Index: src/tests/bogoutil/input-4-data.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/input-4-data.txt,v
retrieving revision 1.1
diff -u -r1.1 input-4-data.txt
--- src/tests/bogoutil/input-4-data.txt	3 Feb 2003 17:01:22 -0000	1.1
+++ src/tests/bogoutil/input-4-data.txt	31 May 2003 18:47:12 -0000
@@ -1,7 +1,7 @@
-Bell 22
-Habanero 1
-Marbles 333
-Pimento 666666
-Serrano 55555
-Tabasco 4444
-.MSG_COUNT 2
+Bell 22 0
+Habanero 1 0
+Marbles 333 0
+Pimento 666666 0
+Serrano 55555 0
+Tabasco 4444 0
+.MSG_COUNT 2 0
Index: src/tests/bogoutil/output-1.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/output-1.txt,v
retrieving revision 1.1
diff -u -r1.1 output-1.txt
--- src/tests/bogoutil/output-1.txt	3 Feb 2003 17:01:22 -0000	1.1
+++ src/tests/bogoutil/output-1.txt	31 May 2003 18:47:12 -0000
@@ -1,7 +1,7 @@
-.MSG_COUNT 1
-Bell 22
-Habanero 1
-Marbles 333
-Pimento 666666
-Serrano 55555
-Tabasco 4444
+.MSG_COUNT 1 0
+Bell 22 0
+Habanero 1 0
+Marbles 333 0
+Pimento 666666 0
+Serrano 55555 0
+Tabasco 4444 0
Index: src/tests/bogoutil/output-2.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/output-2.txt,v
retrieving revision 1.1
diff -u -r1.1 output-2.txt
--- src/tests/bogoutil/output-2.txt	3 Feb 2003 17:01:22 -0000	1.1
+++ src/tests/bogoutil/output-2.txt	31 May 2003 18:47:12 -0000
@@ -1,7 +1,7 @@
-.MSG_COUNT 1
-Bell 22
-Habanero 1
-Marbles 333
-Pimento 666666
-Serrano 55555
-Tabasco 4444
+.MSG_COUNT 1 0
+Bell 22 0
+Habanero 1 0
+Marbles 333 0
+Pimento 666666 0
+Serrano 55555 0
+Tabasco 4444 0
Index: src/tests/bogoutil/output-3.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/output-3.txt,v
retrieving revision 1.1
diff -u -r1.1 output-3.txt
--- src/tests/bogoutil/output-3.txt	3 Feb 2003 17:01:22 -0000	1.1
+++ src/tests/bogoutil/output-3.txt	31 May 2003 18:47:12 -0000
@@ -1,7 +1,7 @@
-.MSG_COUNT 4
-Bell 22
-Habanero 1
-Marbles 333
-Pimento 666666
-Serrano 55555
-Tabasco 4444
+.MSG_COUNT 4 0
+Bell 22 0
+Habanero 1 0
+Marbles 333 0
+Pimento 666666 0
+Serrano 55555 0
+Tabasco 4444 0
Index: src/tests/bogoutil/output-4.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/output-4.txt,v
retrieving revision 1.1
diff -u -r1.1 output-4.txt
--- src/tests/bogoutil/output-4.txt	3 Feb 2003 17:01:23 -0000	1.1
+++ src/tests/bogoutil/output-4.txt	31 May 2003 18:47:12 -0000
@@ -1,7 +1,7 @@
-.MSG_COUNT 2
-Bell 22
-Habanero 1
-Marbles 333
-Pimento 666666
-Serrano 55555
-Tabasco 4444
+.MSG_COUNT 2 0
+Bell 22 0
+Habanero 1 0
+Marbles 333 0
+Pimento 666666 0
+Serrano 55555 0
+Tabasco 4444 0
Index: src/tests/bogoutil/output-dl-1.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/output-dl-1.txt,v
retrieving revision 1.1
diff -u -r1.1 output-dl-1.txt
--- src/tests/bogoutil/output-dl-1.txt	3 Feb 2003 17:01:24 -0000	1.1
+++ src/tests/bogoutil/output-dl-1.txt	31 May 2003 18:47:12 -0000
@@ -1,23 +1,23 @@
-------------=_1036200996-24054-149 3 20020815
---------=_1036160812-16487-96 3 20020815
---------=_1036180482-12472-42 3 20020815
-years 5 20020801
-yinfobanner.jpg 2 20020801
-york 1 20020801
-you 92 20020801
-you'd 2 20020801
-you'll 8 20020801
-you're 3 20020801
-your 76 20020801
-your-info.net 5 20020801
-yours 3 20020801
-yourself 12 20020801
-yx5 1 20020801
-ziff 1 20020801
-znex 2 20020801
-zyban 3 20020801
-zypb9g 9 20020801
-zzzzzz 2 20020801
-?jim 4 20020815
-?lydia 4 20020815
-?????sardonen 1 20021010
+------------=_1036200996-24054-149 3 0 20020815
+--------=_1036160812-16487-96 3 0 20020815
+--------=_1036180482-12472-42 3 0 20020815
+years 5 0 20020801
+yinfobanner.jpg 2 0 20020801
+york 1 0 20020801
+you 92 0 20020801
+you'd 2 0 20020801
+you'll 8 0 20020801
+you're 3 0 20020801
+your 76 0 20020801
+your-info.net 5 0 20020801
+yours 3 0 20020801
+yourself 12 0 20020801
+yx5 1 0 20020801
+ziff 1 0 20020801
+znex 2 0 20020801
+zyban 3 0 20020801
+zypb9g 9 0 20020801
+zzzzzz 2 0 20020801
+?jim 4 0 20020815
+?lydia 4 0 20020815
+?????sardonen 1 0 20021010
Index: src/tests/bogoutil/output-dl-2.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/output-dl-2.txt,v
retrieving revision 1.1
diff -u -r1.1 output-dl-2.txt
--- src/tests/bogoutil/output-dl-2.txt	3 Feb 2003 17:01:24 -0000	1.1
+++ src/tests/bogoutil/output-dl-2.txt	31 May 2003 18:47:12 -0000
@@ -1,28 +1,28 @@
-------------=_1036200996-24054-149 3 20020815
---------=_1036160812-16487-96 3 20020815
---------=_1036180482-12472-42 3 20020815
-years 5 20020801
-yinfobanner.jpg 2 20020801
-york 12 20021215
-you 92 20020801
-you'd 2 20020801
-you'll 8 20020801
-you're 3 20020801
-your 152 20021004
-your-34 34 20021004
-your-info.net 10 20021215
-yours 3 20020801
-yours-340 340 20020820
-yourself 12 20020801
-yx5 1 20020801
-ziff 1 20020801
-znex 2 20020801
-zyban 3 20020801
-zypb9g 9 20020801
-zzzzzz 2 20020801
-?jim 4 20020815
-?lydia 4 20020815
-?????sardonen 1 20021010
-????speed~????????? 2 20021020
-???? 2 20021210
-???? 2 20021210
+------------=_1036200996-24054-149 3 0 20020815
+--------=_1036160812-16487-96 3 0 20020815
+--------=_1036180482-12472-42 3 0 20020815
+years 5 0 20020801
+yinfobanner.jpg 2 0 20020801
+york 12 0 20021215
+you 92 0 20020801
+you'd 2 0 20020801
+you'll 8 0 20020801
+you're 3 0 20020801
+your 152 0 20021004
+your-34 34 0 20021004
+your-info.net 10 0 20021215
+yours 3 0 20020801
+yours-340 340 0 20020820
+yourself 12 0 20020801
+yx5 1 0 20020801
+ziff 1 0 20020801
+znex 2 0 20020801
+zyban 3 0 20020801
+zypb9g 9 0 20020801
+zzzzzz 2 0 20020801
+?jim 4 0 20020815
+?lydia 4 0 20020815
+?????sardonen 1 0 20021010
+????speed~????????? 2 0 20021020
+???? 2 0 20021210
+???? 2 0 20021210
Index: src/tests/bogoutil/output-dl-3.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/output-dl-3.txt,v
retrieving revision 1.1
diff -u -r1.1 output-dl-3.txt
--- src/tests/bogoutil/output-dl-3.txt	3 Feb 2003 17:01:26 -0000	1.1
+++ src/tests/bogoutil/output-dl-3.txt	31 May 2003 18:47:12 -0000
@@ -1,9 +1,9 @@
-york 12 20021215
-your 152 20021004
-your-34 34 20021004
-your-info.net 10 20021215
-yours-340 340 20020820
-?????sardonen 1 20021010
-????speed~????????? 2 20021020
-???? 2 20021210
-???? 2 20021210
+york 12 0 20021215
+your 152 0 20021004
+your-34 34 0 20021004
+your-info.net 10 0 20021215
+yours-340 340 0 20020820
+?????sardonen 1 0 20021010
+????speed~????????? 2 0 20021020
+???? 2 0 20021210
+???? 2 0 20021210
Index: src/tests/bogoutil/output-dl-4.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/output-dl-4.txt,v
retrieving revision 1.1
diff -u -r1.1 output-dl-4.txt
--- src/tests/bogoutil/output-dl-4.txt	3 Feb 2003 17:01:27 -0000	1.1
+++ src/tests/bogoutil/output-dl-4.txt	31 May 2003 18:47:12 -0000
@@ -1,9 +1,9 @@
-york 12 20021215
-your 152 20021004
-your-34 34 20021004
-your-info.net 10 20021215
-yours-340 340 20020820
-?????sardonen 1 20021010
-????speed~????????? 2 20021020
-???? 2 20021210
-???? 2 20021210
+york 12 0 20021215
+your 152 0 20021004
+your-34 34 0 20021004
+your-info.net 10 0 20021215
+yours-340 340 0 20020820
+?????sardonen 1 0 20021010
+????speed~????????? 2 0 20021020
+???? 2 0 20021210
+???? 2 0 20021210
Index: src/tests/bogoutil/output-dl-5.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/output-dl-5.txt,v
retrieving revision 1.1
diff -u -r1.1 output-dl-5.txt
--- src/tests/bogoutil/output-dl-5.txt	3 Feb 2003 17:01:27 -0000	1.1
+++ src/tests/bogoutil/output-dl-5.txt	31 May 2003 18:47:12 -0000
@@ -1,5 +1,5 @@
-your-34 34 20021004
-your-info.net 10 20021215
-yours-340 340 20020820
-?????sardonen 1 20021010
-????speed~????????? 2 20021020
+your-34 34 0 20021004
+your-info.net 10 0 20021215
+yours-340 340 0 20020820
+?????sardonen 1 0 20021010
+????speed~????????? 2 0 20021020
Index: src/tests/bogoutil/output-dl-6.txt
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/output-dl-6.txt,v
retrieving revision 1.1
diff -u -r1.1 output-dl-6.txt
--- src/tests/bogoutil/output-dl-6.txt	3 Feb 2003 17:01:27 -0000	1.1
+++ src/tests/bogoutil/output-dl-6.txt	31 May 2003 18:47:12 -0000
@@ -1,2 +1,2 @@
-????speed~????????? 2 20021020
-your-info.net 10 20021215
+????speed~????????? 2 0 20021020
+your-info.net 10 0 20021215
Index: src/tests/bogoutil/t.dump.load.inp
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/t.dump.load.inp,v
retrieving revision 1.1
diff -u -r1.1 t.dump.load.inp
--- src/tests/bogoutil/t.dump.load.inp	3 Feb 2003 17:01:27 -0000	1.1
+++ src/tests/bogoutil/t.dump.load.inp	31 May 2003 18:47:12 -0000
@@ -1,23 +1,23 @@
-?jim 4
-?lydia 4
-?????sardonen 1 20021010
-years 5 20020801
-yinfobanner.jpg 2 20020801
-york 1 20020801
-you 92 20020801
-you'd 2 20020801
-you'll 8 20020801
-you're 3 20020801
-your 76 20020801
-your-info.net 5 20020801
-yours 3 20020801
-yourself 12 20020801
-yx5 1 20020801
-ziff 1 20020801
-znex 2 20020801
-zyban 3 20020801
-zypb9g 9 20020801
-zzzzzz 2 20020801
---------=_1036160812-16487-96 3
---------=_1036180482-12472-42 3
-------------=_1036200996-24054-149 3
+?jim 4 0
+?lydia 4 0
+?????sardonen 1 0 20021010
+years 5 0 20020801
+yinfobanner.jpg 2 0 20020801
+york 1 0 20020801
+you 92 0 20020801
+you'd 2 0 20020801
+you'll 8 0 20020801
+you're 3 0 20020801
+your 76 0 20020801
+your-info.net 5 0 20020801
+yours 3 0 20020801
+yourself 12 0 20020801
+yx5 1 0 20020801
+ziff 1 0 20020801
+znex 2 0 20020801
+zyban 3 0 20020801
+zypb9g 9 0 20020801
+zzzzzz 2 0 20020801
+--------=_1036160812-16487-96 3 0
+--------=_1036180482-12472-42 3 0
+------------=_1036200996-24054-149 3 0
Index: src/tests/bogoutil/t.dump.load.upd
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/t.dump.load.upd,v
retrieving revision 1.1
diff -u -r1.1 t.dump.load.upd
--- src/tests/bogoutil/t.dump.load.upd	3 Feb 2003 17:01:27 -0000	1.1
+++ src/tests/bogoutil/t.dump.load.upd	31 May 2003 18:47:12 -0000
@@ -1,8 +1,8 @@
-york 11
-your 76 20021004
-your-info.net 5
-your-34 34 20021004
-yours-340 340 20020820
-????speed~????????? 2 20021020
-???? 2 20021210
-???? 2 20021210
+york 11 0
+your 76 0 20021004
+your-info.net 5 0
+your-34 34 0 20021004
+yours-340 340 0 20020820
+????speed~????????? 2 0 20021020
+???? 2 0 20021210
+???? 2 0 20021210
Index: src/tests/bogoutil/t.nonascii.replace
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/tests/bogoutil/t.nonascii.replace,v
retrieving revision 1.2
diff -u -r1.2 t.nonascii.replace
--- src/tests/bogoutil/t.nonascii.replace	8 Mar 2003 04:32:54 -0000	1.2
+++ src/tests/bogoutil/t.nonascii.replace	31 May 2003 18:47:12 -0000
@@ -12,16 +12,16 @@
 #
 # test below
 # remember to use ${srcdir}
-echo  	41 A4 BA B5 B5 20 31     20  32 30 30 33 30 33 30 33 0A \
-	41 C1 BA B8 B5 20 32     20  32 30 30 32 31 32 30 32 0A \
-	41 BA C1 B8 B5 20 33     20  32 30 30 33 30 33 30 31 0A \
-  	42 A4 BA B8 B5 B5 20 31  20  32 30 30 33 30 33 30 33 0A \
-	42 C1 BA B8 B5 B5 20 32  20  32 30 30 32 31 32 30 32 0A \
-	42 BA C1 B8 B5 B5 20 33  20  32 30 30 33 30 33 30 31 0A \
-	42 C1 BA B5 B8 B5 20 34  20  32 30 30 33 30 33 30 34 0A \
+echo  	41 A4 BA B5 B5 20 31     20  30 20  32 30 30 33 30 33 30 33 0A \
+	41 C1 BA B8 B5 20 32     20  30 20  32 30 30 32 31 32 30 32 0A \
+	41 BA C1 B8 B5 20 33     20  30 20  32 30 30 33 30 33 30 31 0A \
+  	42 A4 BA B8 B5 B5 20 31  20  30 20  32 30 30 33 30 33 30 33 0A \
+	42 C1 BA B8 B5 B5 20 32  20  30 20  32 30 30 32 31 32 30 32 0A \
+	42 BA C1 B8 B5 B5 20 33  20  30 20  32 30 30 33 30 33 30 31 0A \
+	42 C1 BA B5 B8 B5 20 34  20  30 20  32 30 30 33 30 33 30 34 0A \
 | ../dehex >${TMPDIR}/input
 
-WORDLIST="${TMPDIR}/spamlist.db"
+WORDLIST="${TMPDIR}/wordlist.db"
 
 rm -f ${WORDLIST}
 
@@ -33,8 +33,8 @@
 LEN1=`wc -l ${TMPDIR}/output.1 | awk '{print $1}'`
 LEN2=`wc -l ${TMPDIR}/output.2 | awk '{print $1}'`
 
-TOKDAT1=`head -1 ${TMPDIR}/output.2 | awk '{print $2 "." $3 }'`
-TOKDAT2=`tail -1 ${TMPDIR}/output.2 | awk '{print $2 "." $3 }'`
+TOKDAT1=`head -1 ${TMPDIR}/output.2 | awk '{print $2 "." $4 }'`
+TOKDAT2=`tail -1 ${TMPDIR}/output.2 | awk '{print $2 "." $4 }'`
 
 RESULT=`printf "%d.%d.%s.%s" $LEN1 $LEN2 $TOKDAT1 $TOKDAT2`
 
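For readers who want to see the combined record outside the diff, here is a minimal sketch of the idea, not code from the patch. The `sh_t` enum and the `spamcount`/`goodcount` macros are copied from the patched wordlists.h. Everything prefixed `demo_` is hypothetical: `demo_val_t` stands in for `dbv_t`, whose real definition is not part of this excerpt, and the zero-message guard in `demo_prob` is an added assumption.

```c
#include <assert.h>
#include <stdio.h>

/* As in the patched wordlists.h: index 0 is spam, index 1 is good. */
typedef enum sh_e { SPAM, GOOD } sh_t;

/* As in the patched wordlists.h: named access to the two counters. */
#define spamcount count[SPAM]
#define goodcount count[GOOD]

/* Hypothetical stand-in for dbv_t; the real layout is not shown here. */
typedef struct {
    unsigned long count[2];   /* per-class token counts */
    unsigned long date;       /* optional timestamp, 0 if unused */
} demo_val_t;

/* Hypothetical helper: add freq to one class of a single record.
 * This mirrors the value register.c builds before db_increment(). */
static void demo_increment(demo_val_t *rec, sh_t which, unsigned long freq)
{
    assert(which == SPAM || which == GOOD);
    rec->count[which] += freq;
}

/* Per-class ratio following the patched loop:
 * count / msgcount * weight, capped at 1.0.
 * The msgcount <= 0 guard is an assumption added for this sketch. */
static double demo_prob(const demo_val_t *rec, const long msgcount[2],
                        const double weight[2], sh_t which)
{
    double prob;

    assert(which == SPAM || which == GOOD);
    if (msgcount[which] <= 0)
        return 0.0;

    prob = (double)rec->count[which];
    prob /= msgcount[which];
    prob *= weight[which];
    return prob < 1.0 ? prob : 1.0;
}
```

Starting from a zeroed record, `demo_increment(&rec, SPAM, 3)` leaves `rec.spamcount` at 3 and `rec.goodcount` at 0, so one lookup now yields both numbers that previously took a lookup in each of the two files.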


