TODO for 1.0
David Relson
relson at osagesoftware.com
Sun Jan 12 23:27:19 CET 2003
Greetings,
As you all know, today the MIME processing code was merged into the main
development tree, most documentation files were moved to a central
location, the main README was updated to help new users find the
documentation, etc.
Soon, hopefully this week, I'll be releasing a new bogofilter beta. Much
as I'd like to call it 1.0, other voices have spoken and I have listened,
and it will not be 1.0. Likely it will be 0.10.0 since 10 comes after 9
and there's too much new stuff to merely call it 0.9.2. Alternatively,
since 0.10 sounds like 10% done, it might be 0.9.5 to signify closing
the distance to 1.0.
IF this new release turns out to be stable, in a month or so it may be
promoted to 1.0. However, as the message below points out, there are a
number of things that bogofilter is lacking. Adding them would be a major
step and might make bogofilter complete enough to justify a 1.0 label.
So, please use Greg's message (below) as the starting point for a dialog on
what's needed for 1.0. Other ideas are welcome. Of greatest value would
be people to take on some of the work. Several items relate to
documentation and I wonder if there are any writers out there?
David
At 07:07 AM 1/4/03, Greg Louis wrote:
>TODO for 1.0 should include, I'd say:
>
>- agree on testing methodology that we all trust, so we don't see
> people write "I haven't tested it yet" when others have done so
> extensively (this has been a major virtue of the Spambayes project)
>
>- finalize the algorithm choice (I think everyone who's seriously
> evaluated each would agree that Robinson-Fisher is the best available
> at present, though I suspect Robinson-BayesChain might deserve further
> evaluation). That's not to say we mightn't change it if Gary or
> someone else comes up with an even better scheme, but I'd like to see
> bogofilter officially support just one at a time -- serial monogamy
> if needed, but no more polygamy :)
>
>- agree on what MIME parsing we want to do and how it's to be done, and
> do it, and give it time to prove its worth and settle down
>
>(The classifier and the tokenizer are the crucial elements of the
>program, obviously. They need to be right and they need to be stable.)
>
>- develop a sound and sensible HOWTO that explains what the parameters
> (spam cutoff, nonspam cutoff, minimum deviation, s and x) do, how
> they interact, and how to choose values for them. I think this
> really really matters: we can't claim we're ready for prime time when
> at bottom we don't truly understand what we're doing. Me, I know in
> theory what they do and a little about how they interact, but there
> is more to be thought through and/or learned before I could claim to
> know how bogofilter's classification really works. And I doubt that
> I'm unique in my ignorance. I still see people on the bogofilter
> list doing pure handwaving with x, for example.
>
>That's not by any means a complete list, but I hope it gives the
>flavour of why I don't see it as wise to claim 1.0 readiness at this
>stage in bogofilter's growth. If 0.10.0 sounds like regression to
>infancy, then I'd propose sticking with 0.9.x, letting x go to 99 or
>beyond if need be.
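
For anyone unsure how the parameters Greg lists (spam cutoff, nonspam
cutoff, minimum deviation, s and x) fit together, here is a rough Python
sketch of the general Robinson-Fisher scheme as it is usually described.
It is not bogofilter's actual code. All function names, the token counts
and the default values (s=1.0, x=0.5, min_dev=0.1, cutoffs of 0.95 and
0.1) are assumptions chosen for illustration only.

```python
import math


def chi2q(x2, dof):
    """Upper-tail probability of a chi-square variate; dof must be even."""
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def token_score(spam_count, ham_count, n_spam_msgs, n_ham_msgs, s=1.0, x=0.5):
    """Robinson's f(w): pull the raw estimate p(w) toward x with strength s.

    x is the score assumed for a token never seen before; s says how many
    observations that assumption is worth. (Illustrative defaults.)
    """
    n = spam_count + ham_count
    if n == 0:
        return x
    spam_freq = spam_count / n_spam_msgs
    ham_freq = ham_count / n_ham_msgs
    p = spam_freq / (spam_freq + ham_freq)
    return (s * x + n * p) / (s + n)


def message_score(fw_values, min_dev=0.1):
    """Combine token scores with Fisher's method.

    Tokens whose score lies within min_dev of 0.5 are ignored.
    """
    used = [f for f in fw_values if abs(f - 0.5) >= min_dev]
    if not used:
        return 0.5
    dof = 2 * len(used)
    spam_ind = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - f) for f in used), dof)
    ham_ind = 1.0 - chi2q(-2.0 * sum(math.log(f) for f in used), dof)
    return (1.0 + spam_ind - ham_ind) / 2.0


def classify(score, spam_cutoff=0.95, nonspam_cutoff=0.1):
    """Map a message score onto spam / nonspam / unsure."""
    if score >= spam_cutoff:
        return "spam"
    if score <= nonspam_cutoff:
        return "nonspam"
    return "unsure"


if __name__ == "__main__":
    # Hypothetical counts: (spam_count, ham_count) per token,
    # out of 100 spam and 100 nonspam training messages.
    counts = [(40, 1), (25, 0), (3, 3), (0, 0), (12, 2)]
    scores = [token_score(sc, hc, 100, 100) for sc, hc in counts]
    total = message_score(scores)
    print(scores, total, classify(total))
```

The sketch shows the interactions: a larger s keeps rarely seen tokens
close to x, a larger minimum deviation drops more near-neutral tokens
before combining, and the two cutoffs only decide how the final score is
labelled.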