Mbox format / reclassifying messages

David Relson relson at osagesoftware.com
Sat Sep 27 14:35:14 CEST 2003


On Sat, 27 Sep 2003 12:58:16 +0100
"Laurence" <ljng at hbbs.org> wrote:

> I run bogofilter with -u so it automatically registers messages in the
> wordlist.  I set myself up a way of being able to reclassify messages
> with only webmail access - I have three folders ("mv H2S", "mv U2H"
> and "mv U2S") that I move incorrectly classified messages to.  A cron
> job checks for messages and reclassifies them.
> 
> The mbox format (at least on my machine) can have a dummy first
> message that stores "important folder data". ;)  I noticed when my
> reclassify script first ran it reported a count of two messages when I
> had only moved one into the folder.  Looking through my wordlist I can
> see entries that look like they come from this dummy first message.
> 
> An example message is:
> 
> --- start ---
> From MAILER-DAEMON Wed Sep 24 12:29:49 2003
> Date: 24 Sep 2003 12:29:49 +0100
> From: Mail System Internal Data MAILER-DAEMON@***removed***
> Subject: DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA
> X-IMAP: 1064402989 0000000000
> Status: RO
> 
> This text is part of the internal format of your mail folder, and is
> not a real message.  It is created automatically by the mail system
> software. If deleted, important folder data will be lost, and it will
> be re-created with the data reset to initial values.
> --- end ---
> 
> Should bogofilter ignore this message when importing from mbox files
> if it's present?
> 
> Laurence

Laurence,

It's not bogofilter's job to recognize every different mbox internal
format.  You might want to include a script in your cron job that trims
off this first message.  It's also quite likely that no harm is done by
scoring the message.  Including its tokens in your wordlist just adds a
number of neutral-scoring tokens.

David
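
The trimming step suggested above could look roughly like the sketch
below.  It is only an illustration, not part of bogofilter: the function
names and file paths are hypothetical, and it assumes the dummy message
can be recognized by the Subject line and the X-IMAP header shown in
Laurence's example.  Other mail servers may write a different
placeholder.

```python
import mailbox

# Subject of the dummy message in the quoted example (assumed to be stable).
INTERNAL_SUBJECT = "DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA"


def is_folder_internal_data(msg):
    """Return True if msg looks like the folder's dummy first message."""
    subject = (msg.get("Subject") or "").strip()
    # Require both markers so a real message that merely quotes the
    # subject line is less likely to be dropped by mistake.
    return subject == INTERNAL_SUBJECT and msg.get("X-IMAP") is not None


def copy_without_internal_data(src_path, dst_path):
    """Copy every real message from src_path to dst_path.

    Returns the number of messages copied.  The source mbox is left
    untouched; dst_path is created if it does not exist.
    """
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    copied = 0
    try:
        for msg in src:
            if is_folder_internal_data(msg):
                continue  # skip the "FOLDER INTERNAL DATA" placeholder
            dst.add(msg)
            copied += 1
        dst.flush()
    finally:
        dst.close()
        src.close()
    return copied
```

A cron job could call copy_without_internal_data() on each reclassify
folder and hand the cleaned copy to bogofilter instead of the original.
The sketch does no locking, so it assumes nothing else writes to the
folder while it runs.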



