An error in FAQ!

David Relson relson at osagesoftware.com
Thu Aug 23 02:14:54 CEST 2007


On Thu, 23 Aug 2007 03:29:36 +0800
Jer-ming Lin wrote:

> Dear all,
> 
>     When I read the FAQ "How do I start my bogofilter training?". I  
> found out that there is an error in the method 4 example. In the  
> following article, bogofilter need use -n parameter to retrain the
> ham not -s. Retraining spam has same error, too.
> 
> ex:
> =========================
> 
> In our example (after the initial full training):
> 
>      classify spam.mbox [bogofilter options]
>      bogofilter -s < corpus.good
> ---------------^^^
>      rm -f corpus.*
>      classify ham.mbox [bogofilter options]
>      bogofilter -n < corpus.bad
> ---------------^^^
>      rm -f corpus.*
> 
> =========================
> 
> Best regards,
> Jer-ming

Hello Jer-ming,

It's a good thing that you're looking carefully at bogofilter's
documentation.  There is much to be learned from doing so.

The example you're quoting is correct, as I shall explain.
First, remember that you're reading about "train-on-error" and the FAQ
is about how to correct errors.  In this section of the FAQ are 2
scripts.  The first one runs bogofilter and, depending on the program's
return code, saves the message to corpus.bad (if bogofilter thought it
spam), to corpus.good (if bogofilter thought it ham), and to
corpus.unsure (if bogofilter didn't classify it as spam or ham).
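Since we're dealing with incorrectly classified messages, the messages
that land in corpus.bad when classifying ham.mbox are really _ham_
messages, and training with "-n" is the right thing to do.  Similarly,
the messages that land in corpus.good when classifying spam.mbox are
really spam, so "-s" is the right option for them.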
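
To make the mapping concrete, here is a rough sketch, not the FAQ's
actual classify script.  It assumes bogofilter's documented exit codes
(0 = spam, 1 = ham, 2 = unsure); the function names corpus_for_status
and correction_for are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: map bogofilter's exit status to the corpus file
# that the classify step would append the message to.
corpus_for_status() {
    case "$1" in
        0) echo "corpus.bad" ;;     # bogofilter thought it spam
        1) echo "corpus.good" ;;    # bogofilter thought it ham
        2) echo "corpus.unsure" ;;  # bogofilter could not decide
        *) return 1 ;;              # 3 or anything else: an error
    esac
}

# Hypothetical helper: given what the mailbox is known to contain
# ("ham" or "spam"), print the file holding the misclassified messages
# and the bogofilter option that corrects them.
correction_for() {
    case "$1" in
        ham)  echo "corpus.bad -n" ;;   # real ham that was called spam
        spam) echo "corpus.good -s" ;;  # real spam that was called ham
        *) return 1 ;;
    esac
}

# Typical use (needs bogofilter installed, so left as a comment):
#   bogofilter < message; corpus_for_status $?
```

For a mailbox of known ham, correction_for ham prints "corpus.bad -n",
which matches the "bogofilter -n < corpus.bad" line in the FAQ example.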

I hope that this clarifies the FAQ discussion.

Regards,

David
