A strange thought
Tom Allison
tallison at tacocat.net
Fri Jan 31 12:28:58 CET 2003
I just got done parsing through about 450 pieces of spam and have
an observation to share.
I get a lot of mail from addresses like:
qwer at specialoffers4you.com
ewrt at specialoffers4you.com
rtyu at specialoffers4you.com
and so on... where the username is randomly generated and
modified, but the domain portion of the email is consistent.
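To make the pattern concrete, here is a minimal sketch of grouping sender addresses by domain and collecting the distinct usernames seen for each. The function names (split_address, local_parts_by_domain) are made up for illustration and are not part of bogofilter. It assumes addresses arrive either as "user@domain" or in the archive-style "user at domain" form shown above.

```python
from collections import defaultdict

def split_address(addr):
    """Split 'user@domain' or 'user at domain' into (user, domain)."""
    normalized = addr.strip().replace(" at ", "@")
    user, _, domain = normalized.rpartition("@")
    return user.lower(), domain.lower()

def local_parts_by_domain(addresses):
    """Map each domain to the set of distinct usernames seen for it."""
    groups = defaultdict(set)
    for addr in addresses:
        user, domain = split_address(addr)
        if user and domain:  # skip anything that is not a full address
            groups[domain].add(user)
    return groups

senders = [
    "qwer at specialoffers4you.com",
    "ewrt at specialoffers4you.com",
    "rtyu at specialoffers4you.com",
]
groups = local_parts_by_domain(senders)
```

A domain that piles up many distinct usernames across a small number of messages would be the kind of candidate described here.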
I would think that this pattern could be used to identify domains
that are very likely to deliver spam.
And I'm wondering if this kind of domain matching is something
that could be done well with a Bayesian statistical approach.
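As a rough illustration of what a Bayesian-style score for a domain might look like, the sketch below computes a smoothed spam probability from per-domain message counts. This is only an assumption about how one could do it, not how bogofilter works internally; the function name and the prior and strength parameters are hypothetical.

```python
def domain_spam_probability(spam_count, ham_count, prior=0.5, strength=1.0):
    """Smoothed estimate of P(spam | domain).

    prior    -- assumed probability for a domain never seen before
    strength -- how many messages' worth of weight the prior carries
    """
    total = spam_count + ham_count
    return (strength * prior + spam_count) / (strength + total)

unseen = domain_spam_probability(0, 0)      # falls back to the prior
spammy = domain_spam_probability(3, 0)      # three spam, no good mail
clean = domain_spam_probability(0, 3)       # three good mails, no spam
```

With few messages the score stays near the prior, and it moves toward 0 or 1 only as evidence for the domain accumulates.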
Besides the fact that I'm ignoring a ton of additional
functionality and a ton of additional information (BODY), does it
seem reasonable that domain matching might be a useful approach to
identifying entire domains with greater certainty, regardless of
any additional efforts the spammers use?
I don't know if this has great application to bogofilter, but I
can see how bogofilter could be applied to this idea.
Does it seem "sane"?
--
If we do not change our direction we are likely to end up where we
are headed.