Artificial Intelligence

Jonathan Buzzard jonathan at buzzard.org.uk
Fri Sep 20 02:13:34 CEST 2002



aotto at aotto.com said:
> 2) Bogofilter really isn't that Bayesian anyway. It's close, but as
> several experts have pointed out, not exactly Bayesian. 

We should leave it in for four reasons. Firstly Microsoft don't appear
to have a patent on Bayesian spam filtering; they have a patent on a
Support Vector Machine classifier for spam filtering that happens to
mention the word Bayesian in it.

Secondly some of the proposals floating around to improve bogofilter
make it more Bayesian. It is likely that the more "Bayesian" bogofilter
becomes, the better it will perform.

Thirdly lots of people have heard of Bayesian and associate it with
clever statistical stuff.

Fourthly the term "Artificial Intelligence" has a poor reputation and
is generally viewed as being a bit "naff".

Oh, and just for the record, in an email about the Microsoft patent
that I sent on the 30th August 2002 to Eric and everyone then on the
list (though that does not appear to include you), I noted that
Bogofilter is strictly speaking not Bayesian :-) What I said at the
time about the patent was:

    Having now read the patent it does not apply on a number of bases, and
    thirdly it is far, far too general.

    Firstly prior art of machine classification of email for spam exists in
    copious amounts. Sure, they are all rule-based classifiers, but the
    principle stands. It is abundantly clear that filtering spam is a
    classification problem. In fact the text of the patent confirms
    this viewpoint. So we are not infringing anything by producing a
    machine classifier of email for spam.

    Secondly although the patent mentions numerous classification techniques
    (just about every common one in existence) it then goes on to describe
    a classifier that is not Bayesian. So if we use a Bayesian classifier
    we are in the clear.

    Thirdly the patent talks about continued teaching of the classifier when
    in use. This is almost standard practice with Bayesian classifiers, and
    many other classifiers as well.

    I really don't think we have anything to fear.


All of which I still stand by.

Really the reason I hate software patents is that 99.99% of them have
absolutely nothing new in them; it has all been done before. When
somebody can point me at a software patent that does not have prior
art I might change my tune.


JAB.

-- 
Jonathan A. Buzzard                 Email: jonathan at buzzard.org.uk
Northumberland, United Kingdom.       Tel: +44(0)1661-832195





