nan in bogofilter stats
David Relson
relson at osagesoftware.com
Wed Nov 19 02:03:36 CET 2008
On Wed, 19 Nov 2008 10:47:25 +1030
Stephen Davies wrote:
> Thanks for the feedback David.
>
> I get:
>
> bogoutil -w wordlist.db .MSG_COUNT
>                spam    good
> .MSG_COUNT   312870       1
>
> What does this actually mean?
>
> Cheers,
> Stephen
To compute a token's spamicity, bogofilter needs to know how many spam
and ham messages have been registered in the wordlist. The special
token .MSG_COUNT provides this information. The numbers 312870 and 1
indicate that 312870 spam messages and 1 ham message have been
registered. The value 312870 is reasonable, while the value 1 seems
unreasonably low.
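To illustrate why the message counts matter, here is a minimal Python
sketch of a Robinson-style spamicity estimate. It is a simplified
illustration, not bogofilter's actual code: the function name, the
token counts (100 spam, 50 ham), and the parameter values x=0.5 and
s=1.0 are all assumptions chosen for the example. Only the 312870 and
1 come from the output above.

```python
def token_spamicity(spam_count, ham_count, spam_msgs, ham_msgs, x=0.5, s=1.0):
    """Robinson-style estimate; x and s are illustrative, not bogofilter defaults."""
    if spam_msgs <= 0 or ham_msgs <= 0:
        raise ValueError("message counts must be positive")
    # Normalize each token count by the number of registered messages.
    b = spam_count / spam_msgs
    g = ham_count / ham_msgs
    n = spam_count + ham_count
    # With no evidence at all, fall back to the prior x.
    p = b / (b + g) if (b + g) > 0 else x
    # Blend the observed ratio with the prior, weighted by the evidence n.
    return (s * x + n * p) / (s + n)


# Hypothetical token seen in 100 spam and 50 ham messages.
skewed = token_spamicity(100, 50, spam_msgs=312870, ham_msgs=1)
realistic = token_spamicity(100, 50, spam_msgs=312870, ham_msgs=100000)
print(skewed, realistic)
```

With a ham message count of 1, every ham occurrence is weighted
enormously, so the same token scores far lower than it would with a
plausible ham count.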
FWIW, "bogoutil -d wordlist.db > wordlist.txt" will dump your wordlist
as a text file. Each line has a token, its spam and ham counts, and a
timestamp. .MSG_COUNT's "good" value _should_ be at least as large as
any individual token's ham count.
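A rough Python sketch of that sanity check on the dumped text follows.
It assumes each dump line is "token spam ham date" in that order,
separated by whitespace; the function name and the sample lines are
made up for illustration (only the .MSG_COUNT values come from the
output above), so adjust the parsing if your dump looks different.

```python
def find_suspect_tokens(dump_lines):
    """Return (ham message count, [(token, ham count)] exceeding it)."""
    ham_msgs = None
    records = []
    for line in dump_lines:
        # Split from the right so the token itself may contain spaces.
        parts = line.rstrip("\n").rsplit(None, 3)
        if len(parts) != 4:
            continue  # skip lines that do not match the assumed layout
        token, spam, good, _date = parts
        try:
            good = int(good)
            int(spam)
        except ValueError:
            continue
        if token == ".MSG_COUNT":
            ham_msgs = good
        else:
            records.append((token, good))
    if ham_msgs is None:
        raise ValueError("no .MSG_COUNT line found")
    # A token's ham count should never exceed the number of ham messages.
    suspects = [(t, g) for t, g in records if g > ham_msgs]
    return ham_msgs, suspects


# Hypothetical dump lines; with a real dump, pass open("wordlist.txt").
sample = [
    ".MSG_COUNT 312870 1 20081119",
    "viagra 5000 0 20081118",
    "meeting 3 250 20081117",
]
ham_msgs, suspects = find_suspect_tokens(sample)
print(ham_msgs, suspects)
```

Any token reported here has a ham count larger than the registered ham
message count, which suggests the .MSG_COUNT entry is wrong.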
It might be time to start a new wordlist and register all the ham and
spam you have available. I'd also recommend backing up your wordlist
periodically in case of future problems. Lastly, switching from
non-transactional bogofilter to transactional bogofilter will give you
a more robust database environment.
HTH,
David