Can not get spamicity larger than 0.520000
David Relson
relson at osagesoftware.com
Thu Jun 15 01:34:09 CEST 2006
On Wed, 14 Jun 2006 10:15:24 -0700
otr comm wrote:
> Hello,
>
> i have fedora core 5 system, kernel 2.6.15-1.2054_FC5smp, berkeleydb
> 4.3, gsl-1.8, and trying to get bogofilter 1.0.2 operating.
>
> I posted this earlier, and was told to seed the wordlist.db with a
> few spam and ham messages. okay, i did, but now i can not get
> spamicity to go greater than 0.520000.
>
> i have fed three known spam messages and three known ham messages
> into wordlist.db like so:
>
> Unsure-to-Spam:
> /usr/local/bin/bogofilter -d .bogofilter -s -B INBOX/somemessage#1
> /usr/local/bin/bogofilter -d .bogofilter -s -B INBOX/somemessage#2
> /usr/local/bin/bogofilter -d .bogofilter -s -B INBOX/somemessage#3
>
>
> Unsure-to-Ham:
> /usr/local/bin/bogofilter -d .bogofilter -n -B INBOX/somemessage#4
> /usr/local/bin/bogofilter -d .bogofilter -n -B INBOX/somemessage#5
> /usr/local/bin/bogofilter -d .bogofilter -n -B INBOX/somemessage#6
>
> and then I check each message with:
>
> /usr/local/bin/bogofilter -e -p -c bogofilter.cf -d .bogofilter -B
> INBOX/somemessage
>
> and the messages that i classified as Ham come back with spamicity of
> around 0.003067, which is what I expected.
> however, the messages that i classified as Spam all come back with
> spamicity of 0.520000, which is not correct.
>
> the Spam messages never vary in spamicity, they are always 0.520000.
>
> what am i doing wrong here?
>
> Thanks,
>
> murrah boswell
Hello Murrah,

Sorry, insufficient information. When I try a comparable setup and
scoring, all is fine here.

Let's try a test to see if your bogofilter installation is acting
normally. For the test, create a new test directory with 4 ham and 4
spam messages named h1, h2, h3, h4, s1, s2, s3, and s4. Then run the
following commands in that directory:

    rm -f ./wordlist.db
    bogofilter -C -d . -v -n -B h1 h2
    bogofilter -C -d . -v -s -B s1 s2
    bogofilter -C -d . -v -B h1 h2 h3 h4 s1 s2 s3 s4

Note the "-C" option, which tells bogofilter _not_ to read a config
file, i.e. to use its default settings. It may be that your bogofilter.cf
file is b0rked.

Here are my results (using 4 randomly selected spam from my spam
mailbox and 4 ham from my inbox):

    + bogofilter -v -C -d . -n -B h1 h2
    # 1124 words, 2 messages
    + bogofilter -v -C -d . -s -B s1 s2
    # 130 words, 2 messages
    + bogofilter -v -C -d . -B h1 h2 h3 h4 s1 s2 s3 s4
    h1 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.0.2
    h2 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.0.2
    h3 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.0.2
    h4 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.0.2
    s1 X-Bogosity: Spam, tests=bogofilter, spamicity=1.000000, version=1.0.2
    s2 X-Bogosity: Spam, tests=bogofilter, spamicity=1.000000, version=1.0.2
    s3 X-Bogosity: Spam, tests=bogofilter, spamicity=1.000000, version=1.0.2
    s4 X-Bogosity: Spam, tests=bogofilter, spamicity=1.000000, version=1.0.2

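
With more than a handful of messages, comparing the scoring output by eye gets tedious. The following is a minimal sketch, not part of bogofilter: `check_scores` is a hypothetical helper that reads the verbose bulk-mode output shown above on stdin and lists every file whose classification disagrees with its name. It assumes the h*/s* naming used in this test and that files are given without a directory prefix.

```shell
# Hypothetical helper (not a bogofilter feature): read "bogofilter -v -B"
# output on stdin and report files whose classification does not match
# their name. Assumes ham files are named h* and spam files s*.
check_scores() {
    awk '
        $2 == "X-Bogosity:" {
            file = $1
            class = $3
            sub(/,$/, "", class)                 # "Ham," -> "Ham"
            expected = (file ~ /^h/) ? "Ham" : "Spam"
            if (class != expected) {
                print file ": expected " expected ", got " class
                bad++
            }
        }
        END { exit (bad > 0) }                   # non-zero if any mismatch
    '
}
```

It could be used as `bogofilter -C -d . -v -B h1 h2 h3 h4 s1 s2 s3 s4 | check_scores`; no output and a zero exit status would mean every message landed where its name says it should.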
If your results are radically different, make a .tgz of your test
directory and email it to me off-list. So that I can reproduce your
results, please create a script with the _exact_ commands you've used.
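
One way to bundle the test directory is sketched below. `pack_testdir` and the default archive name `bogotest.tgz` are hypothetical, chosen only for illustration; the function assumes the script with the exact commands has already been saved inside the test directory alongside the messages and wordlist.db. The `+` prefixes in the results above look like shell tracing, so running that script with `sh -x` would presumably give a comparable transcript.

```shell
# Hypothetical helper: pack a test directory (messages, wordlist.db, and the
# script with the exact commands) into a gzip-compressed tar archive.
pack_testdir() {
    dir=$1                        # test directory to pack
    out=${2:-bogotest.tgz}        # archive name (hypothetical default)
    # -C switches to the parent directory so the archive holds relative paths
    tar -czf "$out" -C "$(dirname "$dir")" "$(basename "$dir")"
}
```

For example, `pack_testdir ./bogotest` would write `bogotest.tgz` to the current directory, and `tar -tzf bogotest.tgz` lists what went into it.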
If you get different results with "-C" and "-c bogofilter.cf" then your
config file is definitely broken. Feel free to include it in the tgz
file.
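
The "-C" versus "-c bogofilter.cf" comparison can be scripted. This is a sketch under assumptions: `compare_configs` is a hypothetical helper, it uses the wordlist in the current directory as in the test above, and it only diffs the two scoring outputs.

```shell
# Hypothetical helper: score the same messages with built-in defaults (-C)
# and with a config file (-c), then diff the two outputs.
# Returns 0 when the outputs match and 1 when they differ.
compare_configs() {
    cfg=$1; shift                 # config file, e.g. bogofilter.cf
    tmp=$(mktemp -d) || return 2
    # bogofilter's exit status encodes the classification, so a non-zero
    # status is not treated as a failure here.
    bogofilter -C -d . -v -B "$@" > "$tmp/defaults.out" || :
    bogofilter -c "$cfg" -d . -v -B "$@" > "$tmp/config.out" || :
    status=0
    diff "$tmp/defaults.out" "$tmp/config.out" || status=$?
    rm -rf "$tmp"
    return "$status"
}
```

Called as `compare_configs bogofilter.cf h1 h2 h3 h4 s1 s2 s3 s4`, any lines printed by diff show messages that score differently once the config file is read.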
Regards,
David