simplicity vs safety with complexity

Boris 'pi' Piwinger 3.14 at piology.org
Tue Jan 25 12:24:11 CET 2005


David Relson <relson at osagesoftware.com> wrote:

>I've got a question for y'all.  Would you rather have 
>
>1) a wordlist that's simple, easy to backup, but vulnerable to software
>and hardware crashes; or
>
>2) a wordlist that offers crash protection but is complex to maintain,
>backup, ... 

You know my answer; I talked about it just a few days ago.
But since you ask, I think it is good to repeat part of it
here, to have all the answers in one place. I'll also add
more to it.

I've been using bogofilter for a long time, in the beginning
on mounted directories, even without locking (my failure to
understand how to do it correctly). I never had any
problem. Soon I started experimenting with different
training methods, which led to the best solution I could
find for my setup: the by now well-known method of
training-to-exhaustion. Clearly, this reduces the risk to
the database, since write access is very limited and
controlled. I also do it on a copy, so that the production
database never holds interim results. So what could happen
to my database? I can only think of hard disk failure, which
would make the protected method useless as well.
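
The copy-then-swap habit described above can be sketched
roughly as follows. This is only an illustration, not how
bogofilter itself works: it assumes the wordlist is a single
file, and both train_on_copy and the train callback are
hypothetical names standing in for whatever training run is
actually used.

```python
import os
import shutil
import tempfile


def train_on_copy(wordlist_path, train):
    """Run train(copy_path) on a private copy, then swap it in.

    `train` is a hypothetical callback that performs all of its
    writes on the path it is given.
    """
    directory = os.path.dirname(os.path.abspath(wordlist_path))
    # Keep the copy in the same directory so the final rename
    # stays on one filesystem.
    fd, copy_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)
    try:
        shutil.copy2(wordlist_path, copy_path)
        train(copy_path)  # interim results only ever touch the copy
        # Replace the production file in a single step.
        os.replace(copy_path, wordlist_path)
    except BaseException:
        # Training failed or was interrupted: drop the copy and
        # leave the production wordlist exactly as it was.
        if os.path.exists(copy_path):
            os.remove(copy_path)
        raise
```

If the training run dies halfway, the production file is
untouched; only a finished copy ever replaces it. The price
is temporary disk space for one extra copy of the wordlist.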

Now I'm not saying this protection is useless. People
working on their live database, in particular with automated
updates, are at risk to some degree. They can either make
automatic backups (problematic if disk/quota space is low),
in which case they would not lose much, or they can rely on
that protection. The protection also costs some disk space,
though as I understand it much less than a full backup.

The worst problem, judging from reading this list without
really trying it out myself, is user experience. Many people
have problems using the new feature. This seems to be mainly
due to the need to understand -- at least to some degree --
the underlying database. This is way too much for simple
users. Even the long documents you are supposed to read and
*understand* before upgrading are a problem. That said, all
the processes need to be fully transparent. I see a lot of
effort to improve this situation, but it still does not seem
to be enough.

What could be the solution? One is already implemented: Make
the feature optional. But this still requires that no
interface exposes the difference; otherwise all software
around bogofilter (all scripts, filter implementations etc.)
will have to treat the two cases differently. Or make it
fully transparent, so people can upgrade without changing
anything. That might not be possible, but maybe it can be
made extremely simple.

>Bogofilter's history as a spam filter is one of success and continued
>improvement.  This is good!

I agree, but I think we still need more popularity. The main
way would be -- beat me! -- to make it available to Windows
users. People could use it with popular local servers like
Hamster; others could embed it into plugins for Eudora or
Outlook, or even Mozilla, which has its own filter.

pi


