Training ham seems difficult

Wed Jan 14 11:33:04 CET 2004

On Wed, 14 Jan 2004 10:51:30 +0100
Andreas Pardeike <andreas at pardeike.net> wrote:

<snip>
> BTW, in order to prevent that ham is accidentally detected as spam and
> thus blocked I wrote a nice system that asks the sender to confirm the
> email and then unblocks it (I store all spam in a mysql db and trash
> entries after a week or when they are confirmed). To prevent useful
> feedback to spammers I send out confirmation questions from a
> configurable extra account (i.e. admin at pardeike.net for spam
> received at andreas at pardeike.net). If I get a bounce or error
> message I can classify the email as spam directly. If I get an real
> answer, I reregister it as ham. After a week, I register it as spam.
> This works flawless because it also prevents mail loops (the admin
> address never answers to email that it receives - it only checks for
> valid confirmations).
> 
> I am thinking of making the whole system public and creating an
> installable solution from it. Is anybody interested?
> 
> Regards,
> Andreas Pardeike
> 

How does your system deal with important automated messages? For
example: a message from PayPal (which is automated) gets labeled as
spam, and your system sends a confirmation request to a
'not-a-real-mailbox'. The 'not-a-real-mailbox' does not understand the
request your system made (in fact, it sends it to /dev/null). Now that
false positive will never reach the recipient, who may even have
requested the email.

If this is how your system works, I think it needs some more work.

A better solution would be to deliver the false positive AND send the
request. A response (from the sender) could then be used to update the
wordlist.db to help stop false positives. No response would also update
the wordlist.db; however, no harm is done, as the mail still reaches its
destination.
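
To make the idea concrete, here is a rough Python sketch of the
"deliver AND ask" flow. It is only an illustration under assumptions:
the names (Pending, handle_flagged, training_verdict) and the deliver /
send_challenge callbacks are hypothetical, the one-week timeout and the
"bounce means spam" rule are taken from the quoted description, and the
actual wordlist.db update is left abstract.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    HAM = "ham"
    SPAM = "spam"


@dataclass
class Pending:
    """A message the filter flagged as spam; awaiting sender confirmation."""
    msg_id: str
    age_days: int = 0
    confirmed: bool = False   # the sender answered the confirmation request
    bounced: bool = False     # the confirmation request bounced or errored


def handle_flagged(msg_id, deliver, send_challenge):
    """Deliver the flagged mail right away AND ask the sender to confirm.

    deliver and send_challenge are hypothetical callbacks supplied by the
    surrounding mail setup; they are not part of any real filter's API.
    """
    deliver(msg_id)         # the recipient always gets the mail
    send_challenge(msg_id)  # the answer is used for training only
    return Pending(msg_id)


def training_verdict(pending, timeout_days=7):
    """Return how to register the message, or None while still waiting."""
    if pending.confirmed:
        return Verdict.HAM
    if pending.bounced or pending.age_days >= timeout_days:
        return Verdict.SPAM
    return None
```

With this arrangement an automated sender that never answers is still
registered as spam once the timeout passes, but the message itself was
handed to the recipient in the first step, so nothing is lost.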

-Tig