Newbie Q

David Relson relson at osagesoftware.com
Sun Oct 13 16:22:50 CEST 2002


At 09:51 AM 10/13/02, Michele Bariani wrote:
>On Sunday 13 October 2002 12:54, Tom Allison wrote:
> >
> > How do I train it?
> >
> > Right now it's mostly wrong simply because it has no known history
> > to work from.
> >
> > How can I beef up its experience level?
>
>I've used a little Perl script to feed it with a "ham" mbox file and then a
>"spam" one (I've been collecting messages for months to experiment on them
>8-). This way I've got some initial tuning on my own mail stream.
>I can send you the script off-list if you like, it's not to be used as an
>example of good Perl ;-) but works ok.

Michele,

I don't know how you have your messages saved, so my comments may be 
off-target.  Hopefully, though, they'll be helpful.

If I have a bunch of spam messages in a directory, with one message per
file, I'd use the following shell command:

         for file in directory/* ; do bogofilter -s < "$file" ; done

If I have a bunch of spam messages in a single mbox file, I'd use formail:

         formail < spam.mbx -s bogofilter -s

Either way is short and painless.
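
The two one-liners above cover spam only; good mail can be registered the
same way with "bogofilter -n". A minimal sketch that wraps both cases in
shell functions follows. The function names and the example paths are
made up for illustration, and it assumes bogofilter and formail (from the
procmail package) are on the PATH.

```shell
#!/bin/sh
# Sketch: seed bogofilter's wordlists from saved mail.
# train_dir and train_mbox are hypothetical helper names.

# train_dir FLAG DIR: register every regular file in DIR (one message per
# file) with the given bogofilter flag (-s for spam, -n for non-spam).
train_dir() {
    flag=$1
    dir=$2
    for file in "$dir"/*; do
        [ -f "$file" ] || continue      # skip subdirectories and an unmatched glob
        bogofilter "$flag" < "$file"
    done
}

# train_mbox FLAG MBOX: let formail split MBOX into single messages and
# pipe each one to bogofilter with the given flag.
train_mbox() {
    formail -s bogofilter "$1" < "$2"
}

# Example calls (placeholder paths):
#   train_dir  -s "$HOME/mail/spam"
#   train_dir  -n "$HOME/mail/good"
#   train_mbox -s spam.mbx
#   train_mbox -n good.mbx
```

Quoting "$file" keeps the loop working when a file name contains spaces.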




> > Which brings me to what is probably a much harder question: Is there
> > some way that I can have a client email a spam mail back and have
> > the mail used for correcting a bogofilter setting?  Right now I'm
> > not really sure how to accomplish this.
>
>I've been talking about this with a friend of mine, the idea would be to have
>a single keystroke/button that adds a new (personalized) header to the
>message and sends it back to the server. The rules on the server would see
>the header and understand the message is a false positive/negative that needs
>correction (and not a new one).

I don't have _the_ solution for this, merely some ideas.  The goal is to 
get the misclassified message from the user to bogofilter, so the wordlists 
can be updated.

One way is to forward the messages to a special email address, for example to 
bogo-spam if it should have been classified as spam or to bogo-good if it 
should have been classified as good.  A cron job could be run (perhaps each 
hour) that would check the bogo-spam and bogo-good mailboxes and run the 
appropriate "bogofilter -S" or "bogofilter -N" command.
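
As a rough illustration of the cron idea, here is a sketch of what such a
job could look like. The mailbox paths and the apply_box name are
assumptions, not anything bogofilter prescribes. It uses -S and -N, the
man page's flags for moving a message from the good list to the spam list
and the other way round. The sketch does no mailbox locking, and it
ignores the fact that a forwarded message carries the forwarder's headers
rather than the original ones.

```shell
#!/bin/sh
# Hypothetical hourly correction job. Paths are assumptions; adjust them
# to wherever bogo-spam and bogo-good mail is actually delivered.
SPAM_BOX=${SPAM_BOX:-/var/mail/bogo-spam}
GOOD_BOX=${GOOD_BOX:-/var/mail/bogo-good}

# apply_box FLAG MBOX: feed each message in MBOX to bogofilter with FLAG,
# then discard the processed messages.
apply_box() {
    flag=$1
    box=$2
    [ -s "$box" ] || return 0           # missing or empty mailbox: nothing to do
    work="$box.work.$$"
    mv "$box" "$work" || return 1       # set the batch aside; later mail starts a new file
    if formail -s bogofilter "$flag" < "$work"; then
        rm -f "$work"
    else
        echo "bogofilter failed; messages kept in $work" >&2
        return 1
    fi
}

# The cron script proper would end with:
#   apply_box -S "$SPAM_BOX"
#   apply_box -N "$GOOD_BOX"
```

A crontab entry such as "0 * * * * /path/to/script" would run it at the
top of each hour; the script path is a placeholder.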

A technique I've been using is to (manually) put each misclassified message 
into the directory ~/spam-fixups, with a name like spam.1013.0845.txt or 
good.1013.0911.txt, and then run (via an hourly cron job) an update.wordlist 
script.
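
The update.wordlist script itself isn't shown here, so the following is
only a guess at its general shape, not the actual script. It assumes that
files named spam.* should have been classified as spam and files named
good.* as good, and that processed files are moved to a "done"
subdirectory (an invented detail) so they are not applied twice.

```shell
#!/bin/sh
# Hypothetical stand-in for an update.wordlist script.
FIXUPS=${FIXUPS:-$HOME/spam-fixups}

# apply_fixups DIR: run the matching bogofilter correction on each
# spam.*.txt and good.*.txt file in DIR, then move it to DIR/done.
apply_fixups() {
    dir=$1
    mkdir -p "$dir/done" || return 1
    for file in "$dir"/spam.*.txt "$dir"/good.*.txt; do
        [ -f "$file" ] || continue          # unmatched glob stays literal; skip it
        case ${file##*/} in
            spam.*) flag=-S ;;              # was registered as good, should be spam
            good.*) flag=-N ;;              # was registered as spam, should be good
        esac
        # Move the file only if bogofilter succeeded, so failures are retried.
        bogofilter "$flag" < "$file" && mv "$file" "$dir/done/"
    done
}

# The cron job would call:
#   apply_fixups "$FIXUPS"
```

Note that -S and -N undo an earlier registration in the wrong list; for a
message that was never registered, plain -s or -n would be the flags to
use instead.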

If I recall, the bogofilter man page gives macros for mutt so that word 
lists can be updated.

David


For summary digest subscription: bogofilter-digest-subscribe at aotto.com


