better Bayesian bogofilter

Peter Bishop pgb at adelard.com
Thu Aug 14 11:51:47 CEST 2003


On 13 Aug 2003 at 7:47, Greg Louis wrote:

> > What I'm more interested in knowing is exactly _how_ you plan to keep track 
> > of the ham/spam ratio.  One thought that crosses my mind is having a 
> > ".SCORE" token rather like .MSG_COUNT.  If I understand your article, 
> > .SCORE needs to be updated for each ham and each spam scored.
> 

Is there a problem if the database is shared? 

I have set up a shared database that is referenced by multiple users.
Each user has a different spam/ham ratio, but updating .SCORE in the 
shared database would compute an average spam/ham ratio across all users.

This might not matter too much, but strictly speaking, a score should be 
kept for each user.

A separately defined score file would be more flexible, e.g.:

    dbasefile: /usr/var/bogofilter/wordlist.cdb
    scorefile: $HOME/.bogofilter.db
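
To make the idea concrete, here is a rough Python sketch of per-user counters kept apart from the shared wordlist. It is not bogofilter code: the JSON file format and the function names (load_counts, record_message, spam_fraction) are assumptions made up for illustration, and the shared wordlist is not touched at all.

```python
import json
import os
import tempfile


def load_counts(score_path):
    """Return this user's spam/ham counts, or zeros if no score file exists yet."""
    try:
        with open(score_path, "r", encoding="utf-8") as fh:
            data = json.load(fh)
    except FileNotFoundError:
        return {"spam": 0, "ham": 0}
    return {"spam": int(data.get("spam", 0)), "ham": int(data.get("ham", 0))}


def record_message(score_path, is_spam):
    """Increment the user's spam or ham counter and save the file."""
    counts = load_counts(score_path)
    counts["spam" if is_spam else "ham"] += 1
    # Write to a temporary file in the same directory, then rename it over
    # the old file, so an interrupted write does not leave a truncated file.
    directory = os.path.dirname(os.path.abspath(score_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(counts, fh)
    os.replace(tmp_path, score_path)
    return counts


def spam_fraction(counts):
    """Fraction of scored messages that were spam, or None if none were scored."""
    total = counts["spam"] + counts["ham"]
    return counts["spam"] / total if total else None
```

Each user would point score_path at a file under their own home directory, so one user's ratio never leaks into another's, while everyone still reads the same wordlist. This sketch does no locking, which is fine if only one process per user updates the file.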
-- 
Peter Bishop 
Adelard and Centre for Software Reliability, City University
Drysdale Building, 10 Northampton Square, London, EC1V 0HB
Tel: +44-20-7490-9467, Fax: +44-20-7490-9451
pgb at adelard.com, http://www.adelard.com/
pgb at csr.city.ac.uk, http://www.city.ac.uk/




