[bogofilter] using block_on_subnets

David Relson relson at osagesoftware.com
Wed Apr 28 16:15:50 CEST 2004


On Wed, 28 Apr 2004 08:56:14 -0500
Bill McClain wrote:

> I rebuilt my wordlist recently and turned on "block_on_subnets=yes". I
> don't know why I hadn't done it before -- the technique is pretty
> valuable. Looking just at the top-level domains (url:nnn), about 60%
> occur only in spam. But: some of the tokens have a low message count,
> lessening their presumed value.

Hi Bill,

Tokens with low message counts are actually quite valuable -- especially
when the ham:spam ratio is very low or very high.
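
A rough sketch of why that holds, assuming the Robinson-style smoothed
token score f(w) = (s*x + n*p) / (s + n). The function name and the
robx/robs values are illustrative assumptions rather than anyone's
actual settings, and the raw ratio below ignores scaling by the total
number of ham and spam messages in the wordlist.

```python
def token_score(spam_count, ham_count, robx=0.52, robs=0.0178):
    """Smoothed spamicity of one token (illustrative sketch only).

    robx: score assumed for a token that has never been seen.
    robs: weight given to that assumption (small = trust the data quickly).
    """
    n = spam_count + ham_count
    if n == 0:
        return robx                      # unseen token: fall back to robx
    p = spam_count / n                   # raw spam fraction for this token
    return (robs * robx + n * p) / (robs + n)


# With a small robs, even two sightings push a one-sided token close to
# 0.0 or 1.0, while an evenly split token stays near 0.5.
seen_only_in_spam = token_score(2, 0)
seen_only_in_ham = token_score(0, 2)
evenly_split = token_score(1, 1)
```

Under these assumptions a token seen only twice, both times in spam,
already scores very close to 1.0, so a low count does not stop it from
being a strong indicator.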
 
> (Actually, this is hard to judge because I've also started using
> thresh_update and very high- and low-scoring messages are no longer
> registered. Could be the low-count tokens are referenced all the time
> without being incremented).

Out of curiosity, what value are you using for thresh_update?  I'm using
0.01 and have noticed that my wordlists are growing much more slowly
than before.  Also, .MSG_COUNT is increasing by only 20 or 30 per day
(rather than by 500+).
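
A minimal sketch of the thresh_update effect, assuming it works as
described above: with autoupdate, a message whose score lies within
thresh_update of 0.0 or 1.0 is not registered. The function name, the
sample scores, and the handling of the exact boundary are hypothetical.

```python
def should_register(score, thresh_update=0.01):
    """True if an autoupdate run would add this message to the wordlist.

    Messages scoring within thresh_update of 0.0 or 1.0 are treated as
    already well classified and are skipped (boundary handling is a guess).
    """
    return thresh_update <= score <= 1.0 - thresh_update


# Made-up scores for one day's mail: most are confidently ham or spam.
scores = [0.0001, 0.9999, 0.9995, 0.42, 0.0003, 0.97, 0.99999]
registered = [s for s in scores if should_register(s)]
```

Only the two uncertain messages (0.42 and 0.97) would be registered
here, which matches the pattern of .MSG_COUNT rising far more slowly
than the number of messages actually processed.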

The subnet histogram is interesting.  With my min_dev of 0.435, very few
domains are hammish while many are spammish :-)
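
A small sketch of what a min_dev of 0.435 implies, assuming min_dev
means "ignore tokens whose score is within min_dev of 0.5". The
function name and the sample scores are hypothetical.

```python
def is_counted(score, min_dev=0.435):
    """True if a token's score is far enough from 0.5 to be used."""
    return abs(score - 0.5) >= min_dev


# With min_dev = 0.435, only scores at or below 0.065 (hammish) or at or
# above 0.935 (spammish) survive; everything in between is ignored.
sample_scores = [0.02, 0.30, 0.50, 0.90, 0.95, 0.999]
hammish = [s for s in sample_scores if is_counted(s) and s < 0.5]
spammish = [s for s in sample_scores if is_counted(s) and s > 0.5]
```

Tallying the url: tokens of a wordlist this way gives the hammish and
spammish counts that the histogram summarizes.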

David


