A strange thought

Tom Allison tallison at tacocat.net
Fri Jan 31 12:28:58 CET 2003


I just finished going through about 450 pieces of spam and have
an observation to share.

I get a lot of mail from addresses like:

qwer at specialoffers4you.com
ewrt at specialoffers4you.com
rtyu at specialoffers4you.com

and so on, where the username is randomly generated and changes
from message to message, but the domain portion of the address
stays the same.
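
To make the pattern concrete, here is a minimal Python sketch that
groups sender addresses by domain and counts how many distinct
usernames appear at each one. The function names and the threshold
of three usernames are assumptions made up for illustration; they
do not come from bogofilter or any other tool.

```python
from collections import defaultdict


def distinct_senders_by_domain(addresses):
    """Map each domain to the set of distinct usernames seen at it."""
    by_domain = defaultdict(set)
    for addr in addresses:
        # Accept "user@domain" as well as the archive-style "user at domain".
        normalized = addr.strip().lower().replace(" at ", "@")
        user, sep, domain = normalized.rpartition("@")
        if sep and user and domain:
            by_domain[domain].add(user)
    return dict(by_domain)


def suspicious_domains(addresses, min_users=3):
    """Return domains with at least min_users distinct usernames.

    min_users is an arbitrary illustrative threshold, not a tuned value.
    """
    grouped = distinct_senders_by_domain(addresses)
    return sorted(d for d, users in grouped.items() if len(users) >= min_users)


senders = [
    "qwer at specialoffers4you.com",
    "ewrt at specialoffers4you.com",
    "rtyu at specialoffers4you.com",
    "tallison at tacocat.net",
]
flagged = suspicious_domains(senders)
```

A raw count like this would also flag large legitimate mail
providers, so it is only a starting point, not a classifier.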

I would think this pattern could be used to identify domains
that are very likely to deliver spam.

And I'm wondering if this domain pattern matching is something
that could be done well with a Bayesian statistical approach.
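
One way such an approach might look, as a rough Python sketch: treat
the sender's domain as a single token, count how often it shows up
in spam and in non-spam, and smooth the estimate toward a neutral
prior so that rarely seen domains are not judged too harshly. The
smoothing follows the general form of Gary Robinson's
f(w) = (s*x + n*p) / (s + n). The class name, the default values of
strength and prior, and the training interface are all assumptions
for illustration and are not taken from bogofilter's source.

```python
from collections import Counter


def extract_domain(addr):
    """Return the domain part of 'user@domain' or 'user at domain'."""
    normalized = addr.strip().lower().replace(" at ", "@")
    return normalized.rpartition("@")[2]


class DomainScorer:
    """Smoothed per-domain spam estimate (illustrative only)."""

    def __init__(self, strength=1.0, prior=0.5):
        self.strength = strength  # weight given to the prior (Robinson's s)
        self.prior = prior        # score for a never-seen domain (Robinson's x)
        self.spam = Counter()
        self.ham = Counter()

    def train(self, sender, is_spam):
        """Record one message from this sender as spam or non-spam."""
        counts = self.spam if is_spam else self.ham
        counts[extract_domain(sender)] += 1

    def score(self, sender):
        """Estimate the probability that mail from this domain is spam."""
        domain = extract_domain(sender)
        spam_count = self.spam[domain]
        total = spam_count + self.ham[domain]
        # (s*x + n*p) / (s + n), where n*p equals the spam count.
        return (self.strength * self.prior + spam_count) / (self.strength + total)


scorer = DomainScorer()
for sender in ("qwer at specialoffers4you.com",
               "ewrt at specialoffers4you.com",
               "rtyu at specialoffers4you.com"):
    scorer.train(sender, is_spam=True)
scorer.train("tallison at tacocat.net", is_spam=False)
```

With these four training messages, any new username at
specialoffers4you.com scores 0.875, and a domain that has never
been seen scores the neutral 0.5. This sketch does not normalize
for the relative sizes of the spam and non-spam collections, which
a real filter would need to do.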

Setting aside the fact that I'm ignoring a ton of additional
functionality and a ton of additional information (the BODY), does
it seem reasonable that domain matching might be a useful way to
identify entire spam domains with greater certainty, regardless of
any additional tricks the spammers use?

I don't know whether this idea would be of much use to bogofilter,
but I can see bogofilter being of great use for this idea.
Does it seem "sane"?

-- 
If we do not change our direction we are likely to end up where we 
are headed.




