wordlist oddness?
David Relson
relson at osagesoftware.com
Thu Nov 7 06:06:11 CET 2002
Clint,
I ran my test and didn't see anything wrong. Here is what I did, along with
the output. If you can reproduce your problem, pack the test message and the
run outputs in a .tgz and send the file to me.
David
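A minimal sketch of packing such a report, assuming the run outputs have been saved to files first. The helper name pack_report and every file name other than msg.1103.txt and words are hypothetical, chosen only for illustration.

```shell
# pack_report ARCHIVE FILE...
# Bundle the test message and the saved run outputs into one
# gzip-compressed tarball (.tgz) that can be attached to a reply.
pack_report() {
    archive=$1
    shift
    tar -czf "$archive" "$@"
}

# Example call (before.txt and after.txt are hypothetical names for
# saved `bogoutil -w` output taken before and after the update):
#   pack_report wordlist-test.tgz msg.1103.txt words before.txt after.txt
```

Listing the archive with `tar -tzf wordlist-test.tgz` before sending is a quick way to confirm that nothing was left out.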
### first break the message into words, sort them, remove duplicates, and save
[relson@osage bogofilter-0.8.0]$ bogolexer -p < msg.1103.txt | sort -u > words
### get start-of-test counts for the first 10 words from the message
[relson@osage bogofilter-0.8.0]$ head words | bogoutil -w /var/lib/bogofilter
spam good
64.83.123.32 0 0
about 387 8350
adl 0 1
advice 36 642
advised 2 55
after 222 3853
ages 17 47
all 629 4309
already 92 2188
amazing 108 149
### update the spam list from the message
[relson@osage bogofilter-0.8.0]$ bogofilter -s < msg.1103.txt
### then check counts for the first 10 words from the message
### notice that the counts went up. Since bogofilter defaults to the
### Graham method, it's okay for a count to increase by as much as 4
### (here "advice" rose by 3 and "amazing" by 2).
[relson@osage bogofilter-0.8.0]$ head words | bogoutil -w /var/lib/bogofilter
spam good
64.83.123.32 1 0
about 388 8350
adl 1 1
advice 39 642
advised 3 55
after 223 3853
ages 18 47
all 630 4309
already 93 2188
amazing 110 149
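The comparison between the two listings above can also be done mechanically. The following is only a sketch: it assumes each `bogoutil -w` listing was redirected to a file (before.txt and after.txt are hypothetical names) and that the files have the layout shown here, a "spam good" header followed by "word spam good" rows. The helper name count_deltas is likewise made up for illustration.

```shell
# count_deltas BEFORE AFTER
# Print "word spam_delta good_delta" for every word whose counts differ
# between two saved `bogoutil -w` listings.
count_deltas() {
    awk '
        NF != 3 { next }               # skip the two-field "spam good" header
        NR == FNR {                    # first file: remember the old counts
            spam[$1] = $2
            good[$1] = $3
            next
        }
        ($1 in spam) && ($2 != spam[$1] || $3 != good[$1]) {
            print $1, $2 - spam[$1], $3 - good[$1]
        }
    ' "$1" "$2"
}

# Example call, after saving the two listings (file names are hypothetical):
#   head words | bogoutil -w /var/lib/bogofilter > before.txt
#   bogofilter -s < msg.1103.txt
#   head words | bogoutil -w /var/lib/bogofilter > after.txt
#   count_deltas before.txt after.txt
```

For the two listings above this would report, for example, `advice 3 0` and `amazing 2 0`. Going by the limit described in this message, a spam delta larger than 4 after a single default (Graham) update would be the kind of oddness worth reporting.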
### repeat the update using the '-r' (Robinson) option.
### ... this time the counts went up by exactly one per word
### (consistent with the maximum allowed by Robinson)
[relson@osage bogofilter-0.8.0]$ bogofilter -r -s < msg.1103.txt
[relson@osage bogofilter-0.8.0]$ head words | bogoutil -w /var/lib/bogofilter
spam good
64.83.123.32 2 0
about 389 8350
adl 2 1
advice 40 642
advised 4 55
after 224 3853
ages 19 47
all 631 4309
already 94 2188
amazing 111 149