UI for correcting mistakes

Todd Underwood todd-bogofilter at osogrande.com
Thu Mar 13 21:48:06 CET 2003


elijah, all,

most of these ideas were stuff i was already thinking about for a 
bogofilter implementation but one was novel and caught my eye...

On Thu, 13 Mar 2003, elijah wrote:

> 4) embedded links
> 
> the body of a suspected spam could be modified to provide a URL for
> correcting a 'false positive'.
> 
> it might work like this: bogofilter could add a header which was a hash of
> the message--something which has a good chance of uniquely identifying the
> message quickly. the correction URL is keyed on this hash and the user's
> address. the form of the URL is configurable in the bogofilter config
> file. when the user clicks the URL, a custom web script is activated which
> scans the user's mail for a message with a header with the hash value.
> if it finds the message, the message is imported into bogofilter as a
> correction. if not, it reports to the user that the message could not be
> found. This could also work similarly with a mailto href.

this is seriously cool & really simple.  it is interesting to think about 
how this could get integrated into the bogofilter project, given that 
bogofilter is neither a web server nor an MTA.  have to include a cgi with 
the project that is configured to do all of this.  i guess install would 
have to put the cgi in the appropriate place.  still v. cool idea and not 
hard to implement.
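
a rough sketch of the two halves elijah describes (tagging a message with a hash and a correction URL, then having the web script look the message up by that hash) might look like the following. this is only an illustration: the header name `X-Correction-Hash`, the URL template, and all function names are made up for the example and are not part of bogofilter, and the choice of SHA-1 is just an assumption about what "a hash of the message" could be.

```python
import hashlib
from email import message_from_bytes
from typing import Iterable, Optional, Tuple
from urllib.parse import urlencode

# Hypothetical names: neither the header nor the URL form exists in bogofilter.
HASH_HEADER = "X-Correction-Hash"
URL_TEMPLATE = "https://mail.example.com/cgi-bin/correct?{query}"


def message_hash(raw: bytes) -> str:
    """Hash the untagged message bytes (SHA-1 chosen only for illustration)."""
    return hashlib.sha1(raw).hexdigest()


def tag_message(raw: bytes, user: str) -> Tuple[bytes, str]:
    """Prepend the hash header and build the correction URL for this user."""
    digest = message_hash(raw)
    url = URL_TEMPLATE.format(query=urlencode({"user": user, "hash": digest}))
    tagged = f"{HASH_HEADER}: {digest}\r\n".encode("ascii") + raw
    return tagged, url


def find_by_hash(messages: Iterable[bytes], digest: str) -> Optional[bytes]:
    """What the web script would do: scan stored mail for the matching header."""
    for raw in messages:
        if message_from_bytes(raw).get(HASH_HEADER) == digest:
            return raw
    return None  # caller reports "message could not be found"
```

how `messages` gets produced depends entirely on the mail storage (mbox, maildir, something else), which is the custom part elijah mentions.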

> I think the code for the web script would be very easy. I assume that the
> bogofilter side code would be too. I don't know about the trickiness of
> modifying the bodies of mime-multipart messages.

it seems that that is the hard part.  

> 
> advantages: VERY easy for all users and all clients.
> disadvantages: requires custom programming depending on your particular
> mail storage. does not include a way to correct 'false negatives'.

i think it's even better than you think.  the common case for ISPs and 
large installations out there is pop toasters.  most people implementing 
bogofilter for their client base are either going to do a sitewide 
filtering database (poor scaling, less accuracy, not as cool), or per-user 
databases.  in either case, you have to tag spam with headers and allow 
clients to filter it into a separate inbox.  as long as you tag *all* 
messages somehow (perhaps similar to the mailing list instruction 
headers) this method gives the perfect mechanism for catching false 
positives and false negatives.
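
to make the "tag everything" point concrete: if every message carries a verdict header, the same web script can work out which direction the correction goes. the sketch below assumes the verdict is in an `X-Bogosity` header beginning with Yes/No (or Spam/Ham), and that `-Sn` and `-Ns` are the right re-registration flags; both should be checked against the installed bogofilter version. the function name is hypothetical, and it only builds the command line rather than running it.

```python
from email import message_from_bytes
from typing import List, Optional


def correction_args(raw: bytes) -> Optional[List[str]]:
    """Pick the bogofilter invocation that reverses the recorded verdict.

    Assumes an X-Bogosity header whose value starts with Yes/Spam or No/Ham.
    """
    msg = message_from_bytes(raw)
    verdict = (msg.get("X-Bogosity") or "").strip().lower()
    if verdict.startswith(("yes", "spam")):
        # Tagged as spam, user says it is not: false positive.
        return ["bogofilter", "-Sn"]
    if verdict.startswith(("no", "ham")):
        # Tagged as non-spam, user says it is spam: false negative.
        return ["bogofilter", "-Ns"]
    return None  # untagged or unsure: no direction can be inferred
```

the script would then feed the stored message to that command on standard input.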

> 
> 5) lobotomized aliases
> 
> since most clients don't support bouncing messages, you could set up an
> alias which accepted forwards. The first thing it did was strip away all
> the headers and try to remove the 'forward' stanza. Then it would pass it
> to bogofilter.
> 
> advantages: works with all clients.
> 
> disadvantages: you lose a lot of the most important information when you
> strip away the headers. same disadvantages as aliases.

you could try to strip the right headers, but that's even trickier.
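
for illustration, here is roughly what the alias would have to do before handing anything to bogofilter. it is a sketch under loud assumptions: forward stanzas differ from client to client, the marker regex below only covers two common English forms, and the function name is invented. if the client forwarded the message as an attachment, the original headers survive and can be kept.

```python
import re
from email import message_from_bytes

# Assumed marker forms; real clients use many more variants.
FORWARD_MARKER = re.compile(
    r"^-+\s*(Original Message|Forwarded message).*$", re.I | re.M
)


def forwarded_text(raw: bytes) -> str:
    """Return the forwarded message with the forwarder's wrapper removed."""
    msg = message_from_bytes(raw)
    # Best case: forwarded as an attachment, original headers intact.
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            return part.get_payload(0).as_string()
    # Otherwise: inline forward, so cut everything up to the marker line.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            body = payload.decode(charset, errors="replace")
            match = FORWARD_MARKER.search(body)
            return body[match.end():].lstrip("\r\n") if match else body
    return ""
```

in the inline case only whatever headers the client chose to quote are left, which is the information loss elijah points out.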

good ideas.  i think the web interface is the most promising thing i've 
heard recently wrt implementing this.

t.

-- 

todd underwood, sr. vp & cto
oso grande technologies, inc.
todd at osogrande.com

"The people never give up their liberties but under some delusion."
  	    --Edmund Burke, Speech at County Meeting of Bucks, 1784. 




