info about spam messages

Tom Allison tallison at tacocat.net
Thu Jun 17 23:50:27 CEST 2004


Tom Anderson wrote:
> From: "David Relson" <relson at osagesoftware.com>
> 
>>issue, bogofilter already had the idea of IPADDR.  It doesn't have any
>>concept of EMAIL_ADDR.  In fact, "@" is a delimiter, so
>>"username at domain.com" becomes two separate tokens "username" and
>>"domain.com".
> 
> 
> I've always had an issue with this.  @ should not be a delimiter.  In what
> sense does it ever break up tokens except in an email address?  And breaking
> up email addresses is like saying everyone on the same street is a criminal
> just because one guy is.  The @ should be removed from the set of
> delimiters, and that would solve part of the problem.  Now, I'm not saying
> that logging the FROM address would be useful, but you already have code to
> detect a domain, right?  So, if you were inclined to detect an email, it
> would essentially be [a-zA-Z_\-.+]+\@$DOMAIN.
> 
> Tom
> 
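For reference, the pattern quoted above can be sketched in Python. This is only an illustration: the DOMAIN expression below is a hypothetical stand-in for $DOMAIN and is not bogofilter's own lexer rule, and find_emails is a made-up helper name.

```python
import re

# Hypothetical stand-in for $DOMAIN: dot-separated labels of letters,
# digits, and hyphens. Not taken from bogofilter's lexer.
DOMAIN = r"[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)+"

# Local part exactly as written in the quoted message:
# letters, "_", "-", ".", and "+". The class contains no digits.
EMAIL = re.compile(r"[a-zA-Z_\-.+]+@" + DOMAIN)

def find_emails(text):
    """Return every substring of text that matches the quoted pattern."""
    return EMAIL.findall(text)
```

Because the quoted character class has no digits, an address such as user123@domain.com is not matched by this sketch; a real rule would presumably need a wider class.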

I'm surprised at you, Tom....

I should think that tokenizing domain.com separately from the username 
would be similar to your experience with tokenizing the ASN as a 
generalized group representation of the IP addresses.

For example, everyone from AOL is pretty much a criminal in my records, 
with the exception of one address.  That address is slowly gaining 
enough ground over time that its mail is no longer classified as Spam, 
but as Unsure.
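
A rough sketch of the difference, in Python.  The delimiter sets here are toy assumptions for illustration and are not bogofilter's actual lexer; tokenize is a made-up helper.

```python
import re

# Toy delimiter sets (assumptions, not bogofilter's real rules).
WITH_AT = r"[\s<>,;@]+"     # "@" splits the address in two
WITHOUT_AT = r"[\s<>,;]+"   # "@" stays inside the token

def tokenize(text, delimiters):
    """Split text on the delimiter pattern and drop empty pieces."""
    return [t for t in re.split(delimiters, text) if t]

header = "From: <username@domain.com>"
print(tokenize(header, WITH_AT))     # ['From:', 'username', 'domain.com']
print(tokenize(header, WITHOUT_AT))  # ['From:', 'username@domain.com']
```

With "@" as a delimiter, every sender at a domain shares the "domain.com" token, so it acts as a group signal much like the ASN does for IP addresses, while the "username" token is what lets a single good address pull away from the rest.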



