bogofilter-0.93.2 - New Current Release

David Relson relson at osagesoftware.com
Sat Dec 4 01:25:52 CET 2004


Greetings,

The 0.93.2 release of bogofilter brings with it a variety of bug fixes,
usability enhancements, and documentation updates.  As in the 0.93.1
release, these changes mostly pertain to bogofilter's use of Berkeley
DB's Transaction capability.

Be sure to read file RELEASE.NOTES, available on SourceForge:
http://sourceforge.net/project/showfiles.php?group_id=62265


Here's the summary of the two major changes that comprised the heart of
the previous release - bogofilter-0.93.0:

1) Bogofilter now uses BerkeleyDB's transactional capability to ensure
database integrity.  Berkeley DB uses additional files in the wordlist
directory to keep state and logging information. See file doc/README.db
for important info.

2) Bogofilter now defaults to tri-state configuration using cutoff
values of 0.45 and 0.99 (for ham_cutoff and spam_cutoff, respectively).
The ham_cutoff value is new and spam_cutoff is unchanged.  With these
cutoffs, messages with scores between 0.45 and 0.99 are classified as
Unsure.
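
As a rough illustration of the three-way split, the sketch below maps a
score to a label using the default cutoffs quoted above.  The function
name classify and the handling of scores exactly equal to a cutoff are
assumptions made for illustration; this is not bogofilter's own code.

```python
HAM_CUTOFF = 0.45   # default ham_cutoff quoted in this announcement
SPAM_CUTOFF = 0.99  # default spam_cutoff quoted in this announcement


def classify(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Map a spamicity score in [0, 1] to a tri-state label.

    Assumption: a score exactly equal to a cutoff is counted as
    Spam or Ham rather than Unsure. Check bogofilter itself for the
    authoritative boundary handling.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= spam_cutoff:
        return "Spam"
    if score <= ham_cutoff:
        return "Ham"
    return "Unsure"


if __name__ == "__main__":
    for s in (0.02, 0.70, 0.995):
        print(s, classify(s))
```

Setting both cutoffs to the same value in this sketch leaves no room
for Unsure, which mimics the old two-state behaviour.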

In tri-state mode, messages are scored as "Spam", "Ham", or "Unsure"
rather than just "Yes" or "No".  This affects the "X-Bogosity:" line, and
you may need to change scripts in procmail, maildrop, etc., and filters
in your MUA.
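
Scripts that inspect the header could be updated along these lines.
This is a hedged sketch: the helper name bogosity_label is hypothetical,
and it assumes the label is the first word after "X-Bogosity:".  It
accepts both the old and the new labels, so one filter can handle mail
tagged before and after the upgrade.

```python
import email
import re

# Old two-state labels mapped onto the new tri-state vocabulary.
LEGACY_LABELS = {"yes": "Spam", "no": "Ham"}
TRISTATE_LABELS = {"spam": "Spam", "ham": "Ham", "unsure": "Unsure"}


def bogosity_label(raw_message):
    """Return "Spam", "Ham", "Unsure", or None if no usable header exists."""
    msg = email.message_from_string(raw_message)
    value = msg.get("X-Bogosity")
    if value is None:
        return None
    # Take the first word of the header value and ignore the rest.
    match = re.match(r"\s*([A-Za-z]+)", value)
    if match is None:
        return None
    word = match.group(1).lower()
    return TRISTATE_LABELS.get(word) or LEGACY_LABELS.get(word)
```

A caller would typically treat None and "Unsure" the same way, leaving
the message in the inbox for manual review.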

Files are available at http://sourceforge.net/projects/bogofilter for
download.

Here are the md5sums for the release:

5e52a0aa82ae70b34e9ff2db1ca553c1  bogofilter-0.93.2-1.i586.rpm
b5d9d51da002d61816608c4c7853da25  bogofilter-0.93.2-1.src.rpm
19e6e3c19ca99a444e324a672805b5fd  bogofilter-0.93.2.tar.bz2
789b90159d9863d9f02fa5ea749f88da  bogofilter-0.93.2.tar.gz
8719972e0b544bc55c578fee5619312f  bogofilter-static-0.93.2-1.i586.rpm
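
A downloaded file can be checked against the list above with a few
lines of Python.  This is only a sketch: the helper name md5_of is
hypothetical, and hashlib is part of the Python standard library.

```python
import hashlib


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, md5_of("bogofilter-0.93.2.tar.gz") should equal
"789b90159d9863d9f02fa5ea749f88da" if the download is intact.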

			       =================
				BOGOFILTER NEWS
			       =================

NOTE: More information on important changes for bogofilter updaters
is in the RELEASE.NOTES files.  Read them!!

RELEASE.NOTES has two important sections entitled:

        INCOMPATIBLE CHANGES IN BOGOFILTER 0.93
and     MAJOR CHANGES IN BOGOFILTER 0.93

Briefly:

	** Bogofilter is now using Berkeley DB's Transaction
	   capability to ensure database integrity.

	** Bogofilter is now generating tri-state results labeled
	   Spam, Ham, and Unsure, compared to the old two-state Yes/No
	   results.

	!!!!!!!! READ THE RELEASE.NOTES !!!!!!!!

0.93.2	2004-12-03

	* New script bf_resize DIR that checks the sizes of all databases in an
	  environment and writes a lock size to DB_CONFIG.

	2004-12-02

	* Accuracy fix: message counts of ignore lists (that can be present)
	  will be ignored and no longer skew the spamicity.

	2004-12-01

	* Allow environment to be group writable, reported by Fletcher Mattox.

	* Accuracy fix: no longer pretend that we had seen an empty message
	  registered when there was no registration. Use ROBX for spamicity.
	  This changes the output format of bogofilter -vvv mode when no spam
	  or no ham messages have been registered previously.

	2004-11-29

	* Support for Berkeley DB 3.0 was explicitly removed again, so that no
	  stable bogofilter version since 0.17.5 will have had support for this
	  version. This eliminates the need for on-disk database format
	  upgrades and keeps things simple.
	  As the unadvertised breaking of BDB 3.0 didn't raise a single
	  complaint and 3.1 has been around since July 2000, this should be
	  safe.

	* Support long options in bogoutil.

	* Add --remove-environment DIR long option to bogoutil, to remove the
	  environment. Only one such option can be used and there is no
	  corresponding short option.

	* Remove useless numeric Berkeley DB error codes from error messages.

	2004-11-26

	* bogofilter processes will refuse to open multiple wordlists in
	  different database environments (directories) when the transactional
	  Berkeley DB datastore is compiled in (the default). The
	  non-transactional (--disable-transactions), QDBM and TDB datastores
	  are unaffected.

	2004-11-21

	* bogotune now uses getopt() to process the argument list,
	  hence requires a '-n' flag before each non-spam file and a
	  '-s' flag before each spam file.
	* bogotune now accepts '-x flags' to set debug flags.

	2004-11-20

	* Make scoring one huge transaction, rather than one individual
	  transaction per token. This fixes consistency and should improve
	  score speed.

	  WARNING: this seems to have broken bogotune, which, BTW, doesn't
	  return errors to the test suite (t.bulkmode, with message-count
	  files); it reports a bogus "PASS" in spite of database PANICs.

	2004-11-19

	* Restored the old traditional Berkeley DB datastore that cannot be
	  recovered. Its use is discouraged. To use it, type
	  ./configure --disable-transactions

	* Restored the error message when recovery is attempted on QDBM
	  databases, which was lost in the DEPOT (hash) -> VILLA (B+tree)
	  switch.

	2004-11-15

	* Added utility script bf_tar.

	2004-11-14

	* Added utility scripts bf_copy and bf_compact.
	* Added BerkeleyDB warning for binary rpm users.

	2004-11-12

	* New entries in bogofilter-faq.html on error messages
	      "Lock table is out of available locks" and
	      "Lock table is out of available object entries"

	* Add %u formatting option to print login or user ID information,
	  SourceForge Feature Request #1056729.

0.93.1	2004-11-11

	* The README.db file now has information on the DB_CONFIG file that
	  can be created and used to configure the Berkeley DB module.

	* Bogofilter's config file now supports setting max lock and
	  object counts for Berkeley DB using options
	      db_lk_max_locks=N
	      db_lk_max_objects=N

	* Bogofilter and bogoutil now allow these options on the
	  command line, as:
	      --db_lk_max_locks=N
	      --db_lk_max_objects=N

	* When running database recovery automatically, don't let go of the
	  lockfile, so we can do our actual work subsequently.

	2004-11-10

	* Support for BerkeleyDB 4.3 was added. We'll avoid DB_NOSYNC on
	  DB->close() when DB_LOG_INMEMORY is configured for now.

	* Update manual pages/example outputs and filter recipe examples from
	  "X-Bogosity: yes" to "X-Bogosity: Spam". Fixes Debian bug #280557.
	
	* Bugfix for BerkeleyDB 4.2 support: check the data base flags, not the
	  environment flags, for DB_TXN_NOT_DURABLE, when determining whether
	  DB_NOSYNC is safe on DB->close(). May fix some kinds of database
	  corruption encountered with DB_TXN_NOT_DURABLE.

	* Return DB_VERSION_STRING contents in -V (version) output when
	  compiled against Berkeley DB. Minor change to the output format.

	2004-11-09

	* Unify and clean up the horrible RELEASE.NOTES-*, CHANGES* and NEWS-*
	  mess with lots of duplicated info.
	  There shall only be one RELEASE.NOTES file and one NEWS file.
	  RELEASE.NOTES shall contain important information for updates.
	  NEWS shall contain noteworthy code changes in technical detail.

	  This also removes the confusion that RELEASE.NOTES didn't contain
	  information relevant for 0.93.X.

	2004-11-08

	* Berkeley DB mode: do not create data base in read mode (properly map
	  open_mode to DB_RDONLY flag, store open_mode).

	* Berkeley DB mode: exit with error code if lock file cannot be
	  created. Attempt recovery even if creation of lock file succeeded.

	2004-11-07

	* Fixed negative buffer index in mime.c

0.93.0	2004-11-06 "Broken compatibility" release

	* Fix bogotune's '-D' option.

	2004-11-02

	* Use only reentrant functions in the signal handler that runs
	  periodically to check for crashed processes.
	  Reported by Pavel Kankovsky.

	2004-11-01

	* Add a debugged and enhanced version of Stefan Bellon's QDBM
	  Hash->B+tree converter.

	* Broke QDBM compatibility with 2004-10-30 change, check unsigned
	  characters to match Berkeley DB behavior of bogoutil -d.

	2004-10-31

	* Rearranged flag setting for Berkeley DB data store, so as only to set
	  DB_CHKSUM[_SHA1] when creating the data base.
	  Fixes "checksum error: catastrophic recovery required" and
	  consequential "wordlist.db: page 1: reference count overflow" errors
	  Reported by Torsten Veller.

	* Revised RELEASE.NOTES-0.93 to move QDBM change into "Incompatible
	  Changes" section and to mention BerkeleyDB dump/load for 4.1 and 4.2
	  to add checksums.

	* Inserted new section 2.2 into doc/README.db to mention that it is
	  recommended to dump/load the data base when using BerkeleyDB 4.1 and
	  4.2.

	2004-10-30

	* Converted QDBM from hash files (DEPOT API) to B+ trees
	  (Villa API) for better speed (Stefan Bellon).

	2004-10-29

	* Attempting recovery with TDB or QDBM data bases results in an error,
	  so the user does not think it succeeded.

	* Document that recovery only works for Berkeley DB, but not TDB or
	  QDBM.

	2004-10-28

	* Merged Transactional branch (for BerkeleyDB) back into the trunk.
	  Further changes below.

	2004-10-25

	* Added GETTING.STARTED document.

	* Changed default mode from two-state to three-state
	  - with ham_cutoff=0.45 and spam_cutoff=0.99
            The ham_cutoff value is new and spam_cutoff is unchanged.
	  - changed the "Yes/No" tags used in the "X-Bogosity:" line
	    to "Spam/Ham/Unsure"

	NOTE: the next entries appear to be out of order because the pertinent
	changes were developed on a side branch of bogofilter and were merged
	for bogofilter 0.93.0.

	2004-09-21

	* bogofilter can now be used with Berkeley DB 3.0 or 3.1 although this
	  is not recommended. You should prefer 4.2 or 4.1 instead.
	  UPDATE: support for 3.0 was later removed on 2004-11-29

	* Documentation on the write cache issue (recoverability of data bases)
	  has been revised.

	2004-09-13

	* Updated doc/README.db with a section on the log file size and
	  pointers to db_checkpoint and db_archive.

	2004-09-03 (txn 2.1)

	* The on-line crash detector would consider its own process a zombie,
	  so all processes that lasted 30 s or longer would abort themselves
	  after that period.

	  This was particularly prominent with BerkeleyDB 4.1 with
	  x86/gcc-assembly mutexes as this combination appears rather slow when
	  facing lock contention, causing t.lock3 failure. BDB 4.1 compiled to
	  use POSIX mutexes (where working) appears to be a lot faster in this
	  situation.

	2004-09-01 (txn 2.0)

	* Hook up crash detection code. Bogofilter is now able to detect
	  when recovery is necessary and should detect stalled data bases
	  within 30 seconds.
	  NOTE: this means if one process crashes all other processes
	  accessing the same data base will abort with an error code.

	  Stalled data bases happen when one process or the system crashes and
	  doesn't have a chance to clear its locks.

	  This code uses ideas from Matthias Andree and Pavel Kankovsky.

	2004-08-23 (txn 1.1)

	* Add -f and -F options to bogoutil (mnemonic: fix) to run data base
	  recovery.

	* Reimplement our own locking so that recovery and data base access
	  don't collide and no two processes try running recovery at the same
	  time.



