re-training with -Ns and -Sn
relson at osagesoftware.com
Mon Oct 6 19:10:34 EDT 2008
On Mon, 6 Oct 2008 12:26:01 -0500
Bill McClain wrote:
> On Mon, 6 Oct 2008 18:27:45 +0100 (BST)
> "Bjorn Graabek" <bjorn at graabek.com> wrote:
> > I have realised that my bogofilter setup may not be quite
> > right. Am I doing something really awful here?
> > I use tri-state classification which, according to the man page,
> > means that any emails classified as unsure are not used to train
> > bogofilter (I use -u on all email). I move incorrectly classified
> > emails into a "mark as bad" or "mark as good" folder and then run
> > bogofilter with "bogofilter -Ns" and "bogofilter -Sn". That is fine
> > for emails that were incorrectly classified, but what about those
> > emails classified as "unsure"? As they weren't registered in the
> > first place, surely they shouldn't be unregistered before they are
> > correctly registered?
> I don't unregister Unsure emails for that reason. But unless you have
> a huge number of them, I don't think unregistering them will do much
> harm. It just bumps the spam/ham counts one way or the other.
> Sattre Press: http://sattre-press.com/ (info at sattre-press.com)
> History of Astronomy During the 19th Century, by Agnes M. Clerke
"-N" and "-S" undo the registration of an incorrectly registered
message: "-N" unregisters it as ham (non-spam) and "-S" unregisters it
as spam.
"-Ns" is used when spam was incorrectly registered as ham.
Bogofilter's action for "-Ns" is to lower each token's ham count and
raise its spam count. "-Sn" is the reverse, for ham that was
incorrectly registered as spam: it lowers each token's spam count and
raises its ham count.
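
As a rough illustration of the count changes described above, here is a
toy model of a single token. The starting counts and the helper names
apply_Ns and apply_Sn are made up for this sketch; it does not touch
bogofilter's wordlist or reflect how bogofilter stores its data.

```shell
# Toy model only: one token's counts, with invented starting values.
ham=3
spam=1

# Hypothetical helper mimicking what "-Ns" does to one token.
apply_Ns() {
    ham=$((ham - 1))    # -N: undo the earlier ham registration
    spam=$((spam + 1))  # -s: register the token as spam
}

# Hypothetical helper mimicking what "-Sn" does to one token.
apply_Sn() {
    spam=$((spam - 1))  # -S: undo the earlier spam registration
    ham=$((ham + 1))    # -n: register the token as ham
}

apply_Ns
echo "after -Ns: ham=$ham spam=$spam"
```

Applying "-Ns" to a message that was never registered as ham would
still take one off the ham count, which is the skew Bill mentions above.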
With "-u", an "Unsure" message was never registered, so there is
nothing to undo. To register an "Unsure" as ham, you should just use
"-n" (to tell bogofilter that the message is ham, _not_ spam). To
register an "Unsure" as spam, use "-s" (to tell bogofilter that it
_is_ spam).