Why strip headers?
David Relson
relson at osagesoftware.com
Fri May 6 03:43:16 CEST 2005
On Fri, 6 May 2005 09:59:14 +1000
Ben Finney wrote:
> On 06-May-2005, Ben Finney wrote:
> > On 05-May-2005, Tom Anderson wrote:
> > > I also clean up my headers with this one:
> > > http://orderamidchaos.com/bogofilter/spamitarium
> >
> > I don't see the purpose of that one. Why would you not give
> > bogofilter all the information about the original message that you
> > can, to help it learn?
>
> Specifically, this doesn't sound right (from spamitarium's POD
> documentation):
>
> =====
> Moreover, headers which do not directly influence the email in any
> functional way, nor are visible to the end-user in a standard
> graphical MUA, are highly likely to contain information which
> spammers think will detract from normal statistical filtering. It
> is therefore desireable to remove these elements, specifically
> X-headers, prior to filtering. Spamitarium removes all invisible,
> non-functional header lines.
> =====
>
> Is it foolishly naïve of me to think that bogofilter knows much more
> about my personal mail history than some spammer, and can judge those
> bogus headers as is?
Hi Ben,
All bogofilter knows about your email is which messages you've told it
are spam and which are ham. If different X-Headers appear in the two
message sets, then their presence may well help bogofilter in its
spam vs ham scoring.
Some (many?) mail delivery agents add X-Header lines to a message. If
_yours_ adds one or more X-Header lines, bogofilter will see them in
_every_ ham and _every_ spam. The result is tokens with scores of 0.5,
which are ignored when scoring.
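To make the 0.5 case concrete, here is a rough Python sketch. It uses a
simplified frequency ratio, not bogofilter's actual calculation, and the
function names, the message counts, and the MIN_DEV cutoff are all made
up for illustration.

```python
# Illustrative only: a simplified token score, NOT bogofilter's real formula.
# All names, counts, and the MIN_DEV value below are assumptions.

MIN_DEV = 0.1  # hypothetical cutoff: scores this close to 0.5 are skipped


def token_score(spam_hits, ham_hits, n_spam, n_ham):
    """Return a 0..1 spamicity estimate for one token (0.5 = no evidence)."""
    spam_rate = spam_hits / n_spam  # fraction of spam containing the token
    ham_rate = ham_hits / n_ham     # fraction of ham containing the token
    if spam_rate + ham_rate == 0:
        return 0.5                  # never-seen token carries no information
    return spam_rate / (spam_rate + ham_rate)


def is_scored(score, min_dev=MIN_DEV):
    """True if the token is far enough from 0.5 to take part in scoring."""
    return abs(score - 0.5) >= min_dev


# A header added by the local delivery agent: present in all 200 spam
# and all 300 ham messages of a hypothetical training set.
always_present = token_score(200, 300, 200, 300)

# A header seen in most spam but almost no ham.
mostly_spam = token_score(150, 3, 200, 300)

print(always_present, is_scored(always_present))
print(round(mostly_spam, 3), is_scored(mostly_spam))
```

The header that appears in every message lands exactly on 0.5 and drops
out, while the one that differs between the two sets keeps its weight.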
Stripping X-Header lines, as Tom does, may or may not have an effect.
It all depends on your particular mail setup.
HTH,
David