Bogofilter is a mail filter that classifies mail as spam or ham (non-spam) by a statistical analysis of the message's header and content (body). The program is able to learn from the user's classifications and corrections.

Bogofilter can be integrated with graphical mailers such as KDE's KMail, GNOME's Evolution, or Claws Mail (formerly known as Sylpheed-Claws), or it can be run from a mail delivery agent script (maildrop, procmail) to classify each incoming message as spam or ham, using wordlists stored in a Berkeley DB database. Bogofilter processes plain text and HTML. It supports multi-part MIME messages, decoding base64, quoted-printable, and uuencoded text, and it ignores attachments such as images.
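To illustrate the kind of MIME handling described above, here is a minimal sketch in Python. It is not bogofilter's own parser (which is written in C). It only shows the general idea of walking a multi-part message, decoding the text parts according to their transfer encoding, and skipping non-text attachments. The function name text_parts is hypothetical.

```python
from email import message_from_string


def text_parts(raw_message: str) -> list[str]:
    """Return the decoded text/* parts of a message, skipping attachments.

    Illustrative only: a real filter would also tokenize headers and
    strip or interpret HTML markup.
    """
    msg = message_from_string(raw_message)
    parts = []
    for part in msg.walk():
        # Multipart containers and non-text parts (images, etc.) are ignored.
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes the Content-Transfer-Encoding
        # (base64, quoted-printable, uuencode) and returns bytes.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            parts.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset label: fall back to a lossy UTF-8 decode.
            parts.append(payload.decode("utf-8", errors="replace"))
    return parts
```

Given a multipart/mixed message with one base64-encoded text/plain part and one image/png part, the function would return only the decoded text.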

The statistical technique is commonly called Bayesian filtering; its use against spam was described by Paul Graham in his article "A Plan for Spam" (August 2002). Gary Robinson, in his weblog Rants (September 2002), suggested refinements that improve discrimination between spam and ham. Bogofilter's primary algorithm uses the f(w) parameter and the Fisher inverse chi-square technique that Robinson describes. Paul Graham's later article "Better Bayesian Filtering" (January 2003) suggests some useful parsing improvements.
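The f(w) parameter and the Fisher inverse chi-square combination mentioned above can be sketched as follows. This is a simplified Python illustration of Robinson's published formulas, not bogofilter's actual C implementation. The function names, the prior strength s = 1.0, and the prior probability x = 0.5 are assumptions chosen for the example, not bogofilter's configured defaults.

```python
import math


def robinson_f(spam_count, ham_count, n_spam_msgs, n_ham_msgs, s=1.0, x=0.5):
    """Robinson's f(w): a word's spam probability, shrunk toward the prior x.

    s is the weight given to the prior and x is the assumed probability
    for a word that has never been seen (both illustrative values).
    """
    b = spam_count / n_spam_msgs if n_spam_msgs else 0.0  # spam frequency
    g = ham_count / n_ham_msgs if n_ham_msgs else 0.0     # ham frequency
    n = spam_count + ham_count
    if n == 0 or b + g == 0:
        return x  # no evidence: fall back to the prior
    p = b / (b + g)
    return (s * x + n * p) / (s + n)


def chi2q(stat, dof):
    """Upper-tail probability of a chi-square variable; dof must be even."""
    m = stat / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def fisher_score(f_values):
    """Combine per-word f(w) values into one score in [0, 1].

    Scores near 1 suggest spam, near 0 suggest ham, and near 0.5 mean
    the evidence is weak or contradictory.
    """
    n = len(f_values)
    if n == 0:
        return 0.5
    # Close to 1 when the f values are close to 1 (spammy words).
    p_spammy = chi2q(-2.0 * sum(math.log(f) for f in f_values), 2 * n)
    # Close to 1 when the f values are close to 0 (hammy words).
    p_hammy = chi2q(-2.0 * sum(math.log(1.0 - f) for f in f_values), 2 * n)
    return (1.0 + p_spammy - p_hammy) / 2.0
```

In this sketch a message would be scored by looking up counts for each of its tokens, computing robinson_f for each, and passing the resulting list to fisher_score. Because s > 0 and 0 < x < 1, every f(w) stays strictly between 0 and 1, so the logarithms are always defined.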

Bogofilter is written in C. Supported platforms: Linux, FreeBSD, Solaris, OS X, HP-UX, AIX, RISC OS, SunOS, OS/2 …

Eric S. Raymond wrote the initial version of bogofilter. Since 2002 it has been brought to maturity by David Relson, Matthias Andree, Greg Louis, and a group of open-source volunteers.

