[bogofilter] using block_on_subnets
relson at osagesoftware.com
Wed Apr 28 10:15:50 EDT 2004
On Wed, 28 Apr 2004 08:56:14 -0500
Bill McClain wrote:
> I rebuilt my wordlist recently and turned on "block_on_subnets=yes". I
> don't know why I hadn't done it before -- the technique is pretty
> valuable. Looking just at the top-level domains (url:nnn), about 60%
> occur only in spam. But: some of the tokens have a low message count,
> lessening their presumed value.
Tokens with low message counts are actually quite valuable -- especially
when the ham::spam ratio is very low or very high.
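
To illustrate why a low count need not mean low value, here is a minimal sketch of Robinson-style smoothing as described in bogofilter's documentation: f(w) = (s*x + n*p(w)) / (s + n). The function name, the message totals, and the robs/robx values below are assumptions chosen for illustration, not taken from anyone's actual configuration or wordlist.

```python
def token_score(spam_count, ham_count, spam_msgs, ham_msgs,
                robs=0.0178, robx=0.52):
    """Smoothed spamicity f(w) = (s*x + n*p(w)) / (s + n).

    robs (s) and robx (x) are illustrative values, assumed here.
    """
    n = spam_count + ham_count
    if n == 0:
        # Token never seen: fall back to the prior robx.
        return robx
    # Normalize counts by corpus size so unequal corpora do not skew p(w).
    spam_freq = spam_count / spam_msgs
    ham_freq = ham_count / ham_msgs
    p = spam_freq / (spam_freq + ham_freq)
    return (robs * robx + n * p) / (robs + n)


# A token seen in only 3 messages, all of them spam (hypothetical totals).
rare_spammy = token_score(3, 0, 1000, 1000)
# A token seen in 10 messages, split evenly between spam and ham.
balanced = token_score(5, 5, 1000, 1000)
```

With a small robs, the three-message token still lands very close to 1.0, while the evenly split token stays near 0.5 and contributes little.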
> (Actually, this is hard to judge because I've also started using
> thresh_update and very high- and low-scoring messages are no longer
> registered. Could be the low-count tokens are referenced all the time
> without being incremented).
Out of curiosity, what value are you using for thresh_update? I'm using
0.01 and have noticed that my wordlists are growing much more slowly than
before. Also .MSG_COUNT is only increasing 20 or 30 per day (rather
The subnet histogram is interesting. With my min_dev of 0.435 very few
domains are hammish while many are spammish. Interesting :-)
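
As a rough sketch of what a min_dev of 0.435 implies, assuming min_dev simply excludes tokens whose score lies within that distance of 0.5: only scores at or below about 0.065, or at or above about 0.935, would be counted. The token names and scores below are made up for illustration.

```python
MIN_DEV = 0.435  # value quoted in the message above


def is_counted(score, min_dev=MIN_DEV):
    """True if the score deviates from the neutral 0.5 by at least min_dev."""
    return abs(score - 0.5) >= min_dev


# Hypothetical subnet tokens with hypothetical scores.
scores = {
    "url:192": 0.997,  # strongly spammish
    "url:10": 0.60,    # mildly spammish, too close to 0.5 to count
    "url:172": 0.02,   # strongly hammish
}

counted = {tok: s for tok, s in scores.items() if is_counted(s)}
```

Under that assumption, a domain token has to be almost exclusively ham or almost exclusively spam to make it into the histogram at all.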