WARNING: Problem with '-u' option and separate wordlists

David Relson relson at osagesoftware.com
Fri Sep 26 14:11:37 CEST 2003


Greetings,

As those of you reading the list yesterday heard, a problem was reported
with bogofilter-0.15.4 when the '-u' (auto-update) flag is used.  The
problem happens only when old-style separate wordlists are used, not
when the new combined wordlist is used.  The defect in bogofilter is
that file src/register.c doesn't provide the information that
src/datastore.c needs so it can update the proper database.  With the
new (combined) wordlist, there is only one wordlist to update and the
problem doesn't occur.
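
To make the failure mode concrete, here is a toy model in Python. It is
not bogofilter's code: the names Datastore, update and target are
invented for illustration, and the real logic is C code in
src/register.c and src/datastore.c. The sketch assumes only what is
described above: with separate wordlists the datastore must be told
which list to update, and without that information nothing is written.

```python
from collections import Counter


class Datastore:
    """Toy stand-in for a wordlist store (hypothetical, not bogofilter's API)."""

    def __init__(self, combined):
        self.combined = combined
        # Combined mode keeps one list; separate mode keeps one list per class.
        # (The toy tracks a single count per token and ignores the spam/good
        # split inside a combined list.)
        if combined:
            self.lists = {"wordlist": Counter()}
        else:
            self.lists = {"spamlist": Counter(), "goodlist": Counter()}

    def update(self, tokens, target=None):
        """Add tokens to a list; return True if a list was actually updated."""
        if self.combined:
            name = "wordlist"      # only one candidate, nothing to choose
        elif target in self.lists:
            name = target          # caller said which list to update
        else:
            return False           # no target supplied: nothing is written
        self.lists[name].update(tokens)
        return True


tokens = ["subj:Earn", "subj:Your", "subj:College", "subj:Degree"]
combined = Datastore(combined=True)
separate = Datastore(combined=False)

combined.update(tokens)                      # succeeds: one list, no ambiguity
separate.update(tokens)                      # models the defect: returns False
separate.update(tokens, target="spamlist")   # models supplying the missing info
```

In this model the combined store never needs the target argument, which
is why the combined wordlist is unaffected.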

The impact of this defect is that auto-update with separate wordlists
hasn't been happening.  Tokens haven't been added to the wordlists as
one would expect, so the wordlists haven't grown: they hold fewer tokens
than they should, and the tokens that are present have lower counts.
 
Attached is a test script named test.W.WW.sh which demonstrates the
problem using the following steps:

1 - run bogofilter in both separate wordlist mode (using "-WW" flags)
and in combined wordlist mode (using "-W")
2 - use standard testing files spam.mbx and good.mbx to build the
wordlist(s)
3 - print the counts for tokens "subj:Earn", "subj:Your",
"subj:College", "subj:Degree"
4 - score a one-line spam message, i.e. "Subject: Earn Your College
Degree", with the '-u' flag set
5 - print the token counts a second time

Since the tokens are all from spam messages, bogofilter should update
the spam counts.

The script's output shows that counts are updated in combined mode, i.e.
wordlist.db, but not in separate mode, i.e. spamlist.db.
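
The comparison between the counts printed in steps 3 and 5 can be
sketched as follows. This is a minimal Python illustration, not the
attached test.W.WW.sh; the function name unchanged_tokens and the sample
counts in the usage lines are invented placeholders, not output from a
real run.

```python
def unchanged_tokens(before, after):
    """Return the tokens whose count did not increase between two snapshots.

    `before` and `after` map token -> spam count, as printed in steps 3 and 5.
    A token missing from a snapshot is treated as having a count of 0.
    """
    return sorted(
        token
        for token in set(before) | set(after)
        if after.get(token, 0) <= before.get(token, 0)
    )


# Placeholder counts for illustration only (not measured values).
before = {"subj:Earn": 3, "subj:Your": 7}
updated = {"subj:Earn": 4, "subj:Your": 8}   # what a working '-u' would show

print(unchanged_tokens(before, updated))     # []
print(unchanged_tokens(before, before))      # ['subj:Earn', 'subj:Your']
```

An empty result corresponds to the combined-mode behavior; a result
listing every token corresponds to what separate mode shows.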

The problem affects all versions of bogofilter since 0.14.0, when the
combined wordlist was released.  However, it only affects the '-u'
option.

Also attached is patch file patch.register.c.0926, which has the fix and
can be applied to file register.c (in the bogofilter/src directory).

The fix will be in bogofilter-0.15.5 which I expect to release this
weekend.

David
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.W.WW.sh
Type: application/x-sh
Size: 717 bytes
Desc: not available
URL: <http://www.bogofilter.org/pipermail/bogofilter/attachments/20030926/2648767a/attachment.sh>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.register.c.0926
Type: application/octet-stream
Size: 1250 bytes
Desc: not available
URL: <http://www.bogofilter.org/pipermail/bogofilter/attachments/20030926/2648767a/attachment.obj>

