Train with -Ns and -Sn

David Relson relson at osagesoftware.com
Wed Mar 21 23:54:42 CET 2007


On Wed, 21 Mar 2007 18:11:55 +0100
Peter Gutbrod wrote:

> On 21.03.2007 at 12:12, David Relson
> (relson at osagesoftware.com) wrote:
> 
> > On Wed, 21 Mar 2007 10:44:17 +0100
> > Peter Gutbrod wrote:
> > 
> >> I have 2 mailboxes into which I sort the mails that bogofilter
> >> rated as unsure or classified wrongly. I periodically feed the
> >> mails in these mailboxes to bogofilter with -Ns for spam and -Sn
> >> for non-spam emails.
> >> 
> >> I'm wondering whether it might lead to a bad database if I
> >> reregister unsure mails that have never been registered before.
> >> 
> >> So should I instead use just -s and -n, without unregistering the
> >> mail first?
> >> 
> >> The reason I used -Ns and -Sn is that I have used bogofilter with
> >> the -u option so far, so any wrongly classified mail needs
> >> unregistering. To keep things easy, I didn't want to process unsure
> >> mails differently from wrongly classified spam and non-spam emails.
> >> 
> >> Peter
> > 
> > Hi Peter,
> > 
> > Using just "-s" or "-n" would be better.
> > 
> > If you're using the "-p" option, each message has an "X-Bogosity:
> > Ham/Spam/Unsure" line.  Thinking of your folders as "should_be_ham"
> > and "should_be_spam",  you have four types of messages:
> > 
> >   classified as ham,    should be spam:  use switch "-Ns"
> >   classified as spam,   should be ham:   use switch "-Sn"
> >   classified as unsure, should be ham:   use switch "-n"
> >   classified as unsure, should be spam:  use switch "-s"
> > 
> > A script can be written to register each message properly (taking
> > into account which folder it's in and its X-Bogosity line).
> > 
> > HTH,
> > 
> > David
> 
> Hi David,
> 
> Yes, you are right: I could adjust my script so that it uses
> different switches according to the X-Bogosity header. I'll look into
> this.
> 
> Is the procedure you suggest only valid as long as I use bogofilter
> with the -u switch?  Without -u, none of the messages in question has
> been registered yet. Should I use just "-s" or "-n" for the wrongly
> classified spam and ham messages as well, in case I do not use the -u
> switch? On the other hand, if bogofilter classifies the message as
> ham or spam, then a similar message must already have been
> registered. So your workflow is probably independent of the use of
> the -u switch. Can you comment on this?
> 
> And what if a spam message has been registered as ham by accident
> and now shows up as ham? I think it then has to be registered with
> "-Ns", whether bogofilter is used with the -u switch or not.
> 
> Greetings
> 
> Peter

Right you are!  I use the "-u" switch, so false positives (ham scored as
spam) and false negatives (spam scored as ham) are automatically
registered in my wordlist under the wrong category.  The "-N" and "-S"
switches undo that registration, and "-n" and "-s" then register the
message properly.
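
The four cases from the table earlier in the thread could be scripted
roughly as follows.  This is only a sketch, not a tested tool: it
assumes the messages were classified with "-p" and "-u", that the first
word of the X-Bogosity header is Ham, Spam or Unsure, and that
bogofilter is on the PATH.  The names SWITCHES, classification,
switch_for and register are made up for illustration.

```python
import email
import subprocess
from typing import Optional

# (what the message should be, what X-Bogosity says) -> bogofilter switch.
# Combinations not listed (e.g. already classified correctly) need no action.
SWITCHES = {
    ("spam", "ham"): "-Ns",    # false negative: undo ham, register as spam
    ("ham", "spam"): "-Sn",    # false positive: undo spam, register as ham
    ("ham", "unsure"): "-n",   # never registered: just register as ham
    ("spam", "unsure"): "-s",  # never registered: just register as spam
}


def classification(raw_message: bytes) -> Optional[str]:
    """Return 'ham', 'spam' or 'unsure' from the X-Bogosity header, if any."""
    header = email.message_from_bytes(raw_message).get("X-Bogosity")
    if header is None:
        return None
    # Assumed header shape: "Spam, tests=bogofilter, spamicity=..."
    words = header.replace(",", " ").split()
    return words[0].lower() if words else None


def switch_for(should_be: str, raw_message: bytes) -> Optional[str]:
    """Pick the switch for a message found in the 'should_be' folder."""
    return SWITCHES.get((should_be, classification(raw_message)))


def register(should_be: str, raw_message: bytes) -> bool:
    """Feed the message to bogofilter on stdin; False if nothing to do."""
    switch = switch_for(should_be, raw_message)
    if switch is None:
        return False
    subprocess.run(["bogofilter", switch], input=raw_message, check=True)
    return True
```

For each message in the should_be_spam folder one would call
register("spam", raw_bytes), and register("ham", raw_bytes) for the
should_be_ham folder.  Messages without an X-Bogosity line are skipped
rather than guessed at.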

David


