contributing datasets, was: Is bogotune helpful?

David Relson relson at osagesoftware.com
Sun Dec 7 17:04:04 CET 2003


On Sun, 07 Dec 2003 16:27:03 +0100
Boris 'pi' Piwinger <3.14 at logic.univie.ac.at> wrote:

> Bill Wohler <wohler at newt.com> wrote:
> 
> >My ham msgbox had 17109 messages. My spam msgbox had 20,000 messages.
> >However, a grep of MSG_COUNT revealed 17119 and 20,000 respectively.
> >Why would the former be a little off?
> 
> It is always a bit risky to use formail -s since some
> processes forget a separator line before the From_ line. Try
> formail -es, which should give the same number of messages
> as grep -c '^From '.
> 
> pi

pi,

Excellent tip.  It made me realize that the msg-count.sh script should
run formail itself, rather than have formail run msg-count.sh, i.e.

Not:	formail -es msg-count.sh < mbox_file > msg_count_file
But:	msg-count.sh < mbox_file > msg_count_file

This makes it easier for the user and ensures that formail is always
invoked with the right options (-es).
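
For anyone who wants to cross-check a count independently, here is a
minimal sketch of the grep check pi mentions.  It is not the real
msg-count.sh; the function name count_from_lines is made up for
illustration, and it only counts lines that begin with "From ".

```sh
#!/bin/sh
# Hypothetical helper (not part of bogofilter): count the From_ separator
# lines of an mbox read on stdin.  Lines quoted as ">From " do not match.
count_from_lines() {
    grep -c '^From '
}
```

Usage: count_from_lines < mbox_file.  The result should agree with the
number of messages that formail -es splits out of the same file.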

Thank you for the suggestion.

David



