Templates [was: Prediction ...]
relson at osagesoftware.com
Thu Jul 1 08:30:39 EDT 2004
On Thu, 01 Jul 2004 09:32:27 +0100
Peter Bishop wrote:
> On 29 Jun 2004 at 18:06, Tom Anderson wrote:
> > > I've looked at spamitarium's regexes and confess that, to my
> > > inexperienced eye, they're complex. Give me a simple rule for
> > > distinguishing them and I can try to implement it.
> > I don't think there is a simple rule like you propose. Due to the
> > different formats given by different MTAs, and the ability for
> > spammers to forge one or more fields, it requires a complex
> > expression. Brackets and parentheses are optional in many cases, IP
> > and rDNS and IDENT information may or may not be present, and these
> > elements may all be arranged in many different ways. For instance,
> > here are a few:
> Could I suggest that you let the *end user* specify the format of the
> MTA Received line?
> e.g. if the user wants the line processed
> 1) specify the received line template for *their* specific MTA
> 2) extract the required address
> 3) insert a special header line into the message:
> e.g. the template:
> ^Received: from .* \[([0-9.]+)\] by .*
> would be sufficient for my MTA, and this could be mapped to:
> MTA-IP-Address: $1
Bogofilter uses flex for its parsing, which provides speed and ease of
implementation. The "ease" is somewhat iffy as it depends on what's
being done, but that's not the issue here. User-defined templates are
beyond flex's abilities and would need new code in bogofilter. Likely a
regex library could be found and included. As we're presently approaching
the 1.0 release and this would be a significant change, I suggest
leaving it for a future version.
An alternate approach might be a sed or awk script to filter the message
and add the MTA-IP-Address: line. Putting it right after the MTA's
Received: line would help ensure it's not forged.
> This could be processed and scored by bogofilter in the normal way.
> The MTA received line processing could be done using procmail and
> formail before bogofilter is called.
> Alternatively something like spamitarium could easily be modified to
> do a similar job.
That would also work!
> OR maybe bogofilter could allow a user-specified MTA template,
> e.g. specified as an option in bogofilter.cf
> to identify the required IP address field.
> This would need some template matching code in bogofilter,
> but maybe we could just invoke a standard regex library.
> To allow for different MTAs, you could leave some commented out
> templates in the config file for the user to select,
> but if the MTA is not in the list he can still "roll his own".