procmail: Non-zero exitcode (1) from "/usr/bin/bogofilter"
dhottinger at harrisonburg.k12.va.us
Fri Sep 14 15:59:06 CEST 2007
Quoting Tom Anderson <tanderso at oac-design.com>:
>
> dhottinger at harrisonburg.k12.va.us wrote:
>> Quoting David Relson <relson at osagesoftware.com>:
>>
>>> On Thu, 13 Sep 2007 07:57:20 -0400
>>> dhottinger at harrisonburg.k12.va.us wrote:
>>>
>>>> Quoting David Relson <relson at osagesoftware.com>:
>>>>
>>>>> On Thu, 13 Sep 2007 06:35:05 -0400
>>>>> dhottinger at harrisonburg.k12.va.us wrote:
>>>>>
>>>>> ..[snip]...
>>>> I ran bogoutil -p ..../wordlist.db .MSG_COUNT
>>>>   spam   good  Fisher
>>>> 111746      0     nan
>>> Bogofilter needs both good and spam email to work properly. With a
>>> "zero" good count, it can't work. Certainly feeding a bunch of ham to
>>> it would help. Ideally there's a reasonable balance of ham to spam.
>>> Though there's no precise proper ratio for "balance", under 1:10 will
>>> likely work. Have you 11,000 ham to train with? What might work a lot
>>> better is to check wordlist.db files in your backup tapes to find a
>>> wordlist with reasonable .MSG_COUNT values.
>>
>> After I sent the email, I fed several users' mailboxes (after checking
>> for spam) into bogofilter as ham. This seems to have helped quite a
>> bit and put things back into focus. I've been trying to feed both, and
>> have a "report as innocent" option in webmail, which very few users are
>> using. This puts emails into a non-spam mailbox which I then import
>> into bogofilter using bogofilter -nv < /var/local/not-spam. I usually
>> get only 1-3 emails a month reported as innocent, though. Emails
>> that sneak through get reported as spam and imported using: bogofilter
>> -Nsv < /var/local/imp-spam. I'm thinking I should change this and use
>> bogofilter -sv instead. Maybe things will stay a little closer to
>> center then. I really appreciate all the information. It helps to
>> get an expert opinion.
>
>               spam   good    Fisher
> .MSG_COUNT  820229  35342  0.500000
>
> My spam-to-ham ratio is about 23:1 and it works perfectly. I don't think the ratio is
> important at all. Just registering a single ham should get you going in
> the right direction so that you don't get divide-by-zero errors. Then
> just train on classification errors and your accuracy should stabilize.
> No need to jump through hoops to keep a particular ratio. In fact,
> the ideal ratio is probably precisely the ratio of ham to spam you
> actually receive on a regular basis.
>
> Tom
>
I think you are right on the money. I fed 20 good hams into bogofilter
and it started working again. Things just got out of whack because it
didn't have any ham tokens.
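
For anyone who finds this thread later, here is a rough Python sketch of
why a zero good count shows up as "nan". It is not bogofilter's actual
code. It assumes a simplified Robinson-style per-token estimate and leaves
out the smoothing terms, and the names ieee_div and token_spamicity are
made up for the illustration. The counts are the .MSG_COUNT values quoted
above.

```python
import math


def ieee_div(num: float, den: float) -> float:
    """Divide the way C doubles do: 0/0 gives nan, x/0 gives inf, no exception."""
    if den == 0.0:
        return math.nan if num == 0.0 else math.copysign(math.inf, num)
    return num / den


def token_spamicity(spam_hits, ham_hits, spam_msgs, ham_msgs):
    """Simplified per-token spam estimate (assumption: no smoothing terms).

    Each hit count is first scaled by the number of messages registered in
    its list, so a ham message count of zero means dividing by zero.
    """
    spam_freq = ieee_div(spam_hits, spam_msgs)
    ham_freq = ieee_div(ham_hits, ham_msgs)
    return ieee_div(spam_freq, spam_freq + ham_freq)


# My wordlist before the fix: 111746 spam, 0 good. 0/0 poisons the result.
print(token_spamicity(111746, 0, 111746, 0))          # nan

# Tom's wordlist: 820229 spam, 35342 good.
print(token_spamicity(820229, 35342, 820229, 35342))  # 0.5
```

Once the good count is anything above zero the division is defined, which
fits with a handful of hams being enough to get scoring going again.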
Thanks again,
ddh
--
Dwayne Hottinger
Network Administrator
Harrisonburg City Public Schools