headers - example
Boris 'pi' Piwinger
3.14 at logic.univie.ac.at
Mon Mar 8 11:06:25 CET 2004
Jozef Hitzinger wrote:
>> I cannot see your point. Your output just reflects what your training
>> shows. Certain constellations seem to be more unlikely in spam. That is
>> normal and intended.
>
> My point was to demonstrate what I was arguing previously (headers except
> Subject should not go into db):
>
> "195.80.171.24" 53 0.006570 0.000000 0.000074 +
> "rcvd:mail.slovanet.sk" 52 0.006446 0.000000 0.000075 +
> "212.55.234.133" 1 0.000124 0.000000 0.003877 +
> "rcvd:mtx1.www.ematrix.sk" 1 0.000124 0.000000 0.003877 +
> "rcvd:proxy.ematrix.sk" 1 0.000124 0.000000 0.003877 +
> "to:hotmail.com" 266 0.029999 0.002026 0.063266 +
> "head:UTC" 661 0.061609 0.013842 0.183460 +
>
> are neither "hammy" nor "spammy" in nature.
Let's have a look. There are two IP addresses. They were
found in the body, so they don't count here. There are three
hostnames in rcvd; one was seen pretty often in ham and not at
all in spam. That will have some reason, so it is correct to
use it as it is. Two have been seen only once; this, too,
might have a good reason. Next is the Hotmail thing: you
seem to get a lot of legitimate mail with this in To; what's
wrong with that observation? Finally, there is this UTC,
which indicates that, according to your observations, it is
not often used by spammers.
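
For illustration, the last numeric column of the quoted output can be
reproduced approximately with Robinson's smoothing formula,
f(w) = (s*x + n*p(w)) / (s + n), where p(w) = pbad / (pgood + pbad).
This is only a sketch: the values s = 0.01 and x = 0.3916 are
assumptions fitted to the quoted numbers, not settings taken from the
message, and the column meanings (count, ham frequency, spam frequency)
are inferred from the figures.

```python
# Sketch of Robinson's smoothed token score f(w).
# ROBS and ROBX are ASSUMED values fitted to the quoted output;
# they are not taken from the original message or its configuration.
ROBS = 0.01    # assumed strength of the prior (s)
ROBX = 0.3916  # assumed score for a token never seen before (x)


def p_w(pgood, pbad, x=ROBX):
    """Raw spam probability of a token from its ham/spam frequencies."""
    total = pgood + pbad
    # With no observations at all, fall back to the prior x.
    return pbad / total if total > 0 else x


def f_w(n, pgood, pbad, s=ROBS, x=ROBX):
    """Smoothed score: pulls p(w) toward x when the count n is small."""
    return (s * x + n * p_w(pgood, pbad, x)) / (s + n)


# Rows copied from the quoted output: token, n, pgood, pbad.
rows = [
    ("195.80.171.24", 53, 0.006570, 0.000000),
    ("rcvd:mail.slovanet.sk", 52, 0.006446, 0.000000),
    ("212.55.234.133", 1, 0.000124, 0.000000),
    ("rcvd:mtx1.www.ematrix.sk", 1, 0.000124, 0.000000),
    ("rcvd:proxy.ematrix.sk", 1, 0.000124, 0.000000),
    ("to:hotmail.com", 266, 0.029999, 0.002026),
    ("head:UTC", 661, 0.061609, 0.013842),
]

for token, n, pgood, pbad in rows:
    print(f"{token:28s} {n:4d} {f_w(n, pgood, pbad):.6f}")
```

Under these assumptions, a token seen 53 times only in ham ends up much
closer to 0 than one seen a single time only in ham, which is the
pattern visible in the quoted scores.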
> Yet they are the only ones on the
> hammy side of this message. How did they get there?
By your training.
> They come from
> headers. Because I trained on full messages, including headers (current
> recommended way), they're in.
I don't recommend this;-)
> So I agree with you, this reflects my training. But I don't agree with
> "certain constellations".
This is just the result of your training.
> The message was pure spam.
It had some indications for ham, though. If you fix this,
the same message will probably be seen differently. Try it!
> If the junk headers
> were just "noise" I wouldn't care, as bogofilter wouldn't care either.
Right, this is not noise, there is significance to those
observations.
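
One rough way to check "noise" versus "significance" is to look at how
far each token's score lies from the neutral value 0.5. The sketch
below uses the scores from the quoted output; the threshold of 0.1 is
an assumed value chosen for illustration (compare bogofilter's min_dev
setting), not a setting taken from the message.

```python
# Sketch: which quoted tokens are far enough from neutral to matter?
# MIN_DEV is an ASSUMED threshold for illustration only.
NEUTRAL = 0.5
MIN_DEV = 0.1

# Token scores copied from the last numeric column of the quoted output.
scores = {
    "195.80.171.24": 0.000074,
    "rcvd:mail.slovanet.sk": 0.000075,
    "212.55.234.133": 0.003877,
    "rcvd:mtx1.www.ematrix.sk": 0.003877,
    "rcvd:proxy.ematrix.sk": 0.003877,
    "to:hotmail.com": 0.063266,
    "head:UTC": 0.183460,
}


def is_significant(fw, min_dev=MIN_DEV):
    """A token counts only if its score deviates enough from 0.5."""
    return abs(fw - NEUTRAL) >= min_dev


significant = {t: fw for t, fw in scores.items() if is_significant(fw)}

for token, fw in significant.items():
    side = "ham" if fw < NEUTRAL else "spam"
    print(f"{token:28s} {fw:.6f} deviation {abs(fw - NEUTRAL):.3f} -> {side}")
```

With this assumed threshold, all seven tokens pass and all point toward
ham, so none of them would be dropped as noise.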
pi