Site Configuration Queries

Simon Huggins huggie at earth.li
Mon May 19 23:18:36 CEST 2003


On Mon, May 19, 2003 at 07:12:08PM +0100, Peter Bishop wrote:
> On 19 May 2003 at 18:17, Simon Huggins wrote:
> > I've added a fake mbox header and terminating \n and escaped all lines
> > starting with "From " to force the output into a proper mbox format.
> > Will the tokens in the From line be picked up?  Should I therefore
> > attempt to parse the headers and generate them from that?
> My own mailer generates a cheapo mbox format with the following
> header
> >From ???@??? Sat Mar 22 15:22:52 2003
> This seems to be accepted by bogofilter (is that right, David?), in
> which case no extra tokens are added and you don't need to parse
> anything.

It's trivial to change this appropriately if this is so.
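
For reference, a minimal sketch of building such a placeholder separator
line. The helper name mbox_from_line is hypothetical and is not part of the
attached script, and whether bogofilter really ignores the tokens on this
line is still the open question above.

```perl
use strict;
use warnings;

# Hypothetical helper: build an mbox "From " separator line from a
# placeholder sender and an epoch time.  In scalar context gmtime()
# returns a ctime(3)-style string such as "Sat Mar 22 15:22:52 2003".
sub mbox_from_line {
    my ($sender, $epoch) = @_;
    my $stamp = scalar gmtime($epoch);
    return "From $sender $stamp\n";
}

# Placeholder sender in the style of Peter's mailer.
print mbox_from_line('???@???', time);
```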

> PS, I may just be paranoid, but the body might not have a final end-of-line
> terminator, so it would be safer to print "\n\n" after the body to ensure
> there is a blank line separator.

I think MIME::Parser will leave a newline (after all, there is MIME
content on either side of every message that it strips out), but in case
of some pathological input I've not thought of, I've revised the way the
script deals with the last newline.
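
The body handling, pulled out as a standalone sketch: escape any body line
starting with "From ", then force the body to end with exactly one blank
line. The helper name mbox_body is hypothetical and exists only for
illustration; the attached script does the same two substitutions inline.

```perl
use strict;
use warnings;

# Hypothetical helper: prepare a message body for inclusion in an mbox.
sub mbox_body {
    my ($body) = @_;
    # Escape body lines that would otherwise be read as mbox separators.
    $body =~ s/^From />From /mg;
    # Replace any run of trailing newlines (or none at all) with exactly two,
    # so a blank line always separates this message from the next.
    $body =~ s/\n*\z/\n\n/;
    return $body;
}

# A body with a "From " line and no final newline.
print mbox_body("Subject: test\n\nFrom here on, no final newline");
```

A body that already ends in several newlines comes out with the same two
trailing newlines, so the blank-line separator is there either way.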

Attached again.

-- 
----------(   Le doute est le commencement de la sagesse.    )----------
----------(                                                  )----------
Simon ----(                                                  )---- Nomis
                             Htag.pl 0.0.22
-------------- next part --------------
#!/usr/bin/perl -w

=head1 NAME

grabmessages - splits out message/rfc822 parts from a MIME message

=head1 SYNOPSIS

Usage:
    grabmessages <message

=head1 DESCRIPTION

Trivial script to print out all message/rfc822 parts of a MIME message.

Originally from an idea on the bogofilter mailing list as one way to allow
people to easily submit things to the spamlist without having their own
addresses added when forwarding spam to an account.  In such a case messages
should be piped through this script before being piped to bogofilter -s.

The output is forced into an mbox format for easy parsing by bogofilter.

=head1 AUTHOR

Simon Huggins <huggie at earth.li>

=cut


# Copyright(C) Simon Huggins 2003 <huggie at earth.li>
# 
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
# or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
# for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc., 59
# Temple Place, Suite 330, Boston, MA 02111-1307  USA


# Yes, it is silly having the license boilerplate take up more space than
# the code but it does remove all doubt.

use strict;
use MIME::Parser;

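my $parser = MIME::Parser->new;
# Treat embedded message/rfc822 parts as opaque data instead of parsing
# them, so that each one can be printed verbatim below.
$parser->extract_nested_messages(0);
$parser->output_to_core(1);		 # No temporary files
my $entity = $parser->parse(\*STDIN);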

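foreach my $subent ($entity->parts) {
	if ($subent->effective_type eq "message/rfc822") {
		# Fake mbox separator line so each part reads as one message
		print "From invalid\@example.com Mon May 19 18:00:00 2003\n";
		my $body = $subent->stringify_body;
		# Escape body lines that would otherwise look like separators
		$body =~ s/^From />From /mg;
		# End with exactly one blank line, whatever the body ended with
		$body =~ s/\n*$/\n\n/;
		print $body;
	}
}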


