comments from a new user

David Relson relson at osagesoftware.com
Sat May 10 03:08:02 CEST 2003


Welcome Andrew,

'Tis always good to hear from a new user.  Newbies bring in a fresh 
outlook.  Often they stub their toes on details that we experienced users 
have come to take for granted and have forgotten about.  Reported incidents 
of "toe stubbing" lead to improvements.

More succinctly, your feedback is appreciated.

David

At 08:34 PM 5/9/03, Andrew Pimlott wrote:

>I just started using bogofilter and wanted to drop a few notes on my initial
>experience.
>
>- When I first started experimenting, I wanted to know whether it
>   was "safe" to register mails more than once.  The man page gives a
>   hint in the paragraphs on -S and -N that there is no detection of
>   duplicate registrations, but it would be clearer to state it
>   up-front at the top of REGISTRATION OPTIONS.  (I realize that
>   duplicates probably aren't a big deal in actual usage, but when
>   getting familiar with bogofilter it can be confusing if one
>   expects them to be ignored but they are not.)

I'll update the FAQ.
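
To show why this matters while experimenting, here is a toy sketch.  It is 
not bogofilter's code: the register() helper, the whitespace tokenizer and 
the in-memory count table are all hypothetical.  With no duplicate 
detection, registering the same message twice simply counts every token 
twice.

```python
from collections import Counter

def register(wordlist: Counter, message: str) -> None:
    """Count each distinct token of `message` once (toy tokenizer, an assumption)."""
    for token in set(message.lower().split()):
        wordlist[token] += 1

spam_counts = Counter()
msg = "cheap pills cheap offer"

register(spam_counts, msg)
register(spam_counts, msg)  # same message again: nothing detects the duplicate

# Each token now looks as if it had been seen in two different spam messages.
print(spam_counts["cheap"], spam_counts["offer"])
```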

>- The man page mentions a -t (terse) option, but its behavior is not
>   specified.

The default terse format is "%1.1c %f", which bogofilter expands to Y or N 
(to indicate "yes - it's spam" or "no - it's not spam") followed by the 
spamicity, i.e. a value in the range 0.0 to 1.0.
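
As a rough illustration of what that format produces, here is a small 
sketch.  The format_terse() helper is hypothetical, and it assumes %f 
behaves like printf's %f (six decimal places); bogofilter's actual output 
may differ in precision.

```python
def format_terse(classification: str, spamicity: float) -> str:
    """Mimic "%1.1c %f": one classification letter, then the spamicity."""
    letter = classification[:1]        # %1.1c keeps a single character
    return f"{letter} {spamicity:f}"   # assumption: printf-style %f, six decimals

print(format_terse("Y", 0.987654))
print(format_terse("N", 0.012345))
```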

>- The FAQ has an obviously wrong explanation of pgood and pbad.  It calls
>   pgood the "likelihood that a message containing this token is non-spam"
>   when (I think) it means the "likelihood that a non-spam message contains
>   this token".

This is debatable.  What's presently in the FAQ isn't great, but the 
correct wording isn't obvious.

Consider a token with a pgood score of 0.1.  That means it appears in 10% 
of the good messages that have been registered.  It says nothing certain 
about the gazillion other messages in the world, so your statement isn't 
quite right.
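
The 10% example can be written out as a short sketch.  This is only an 
illustration of the definition above, not bogofilter's implementation; the 
pgood() function and the sample data are made up for the purpose.

```python
def pgood(token: str, good_messages: list[set[str]]) -> float:
    """Fraction of the registered good messages that contain `token`."""
    if not good_messages:
        return 0.0
    hits = sum(1 for tokens in good_messages if token in tokens)
    return hits / len(good_messages)

# Ten registered good messages; only the first one contains "lunch".
registered_good = [{"meeting", "lunch"}] + [{"report"}] * 9

print(pgood("lunch", registered_good))
```

The value is a statement about the registered sample only, which is why 
wording it as a likelihood for messages in general is tricky.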

Perhaps you can suggest a different wording ...

>- It took me a while to figure out why -vv sometimes doesn't print
>   statistics.  I see that there is a -F option to force it, but this seems
>   like a poor default.  I imagine that most people expect the statistics,
>   period.

When bogofilter is in tri-state mode and classifies the message as Unsure, 
I want to know "why".  The histogram provides information for that.  When 
the message is clearly ham or spam, the histogram is much less 
interesting.  Sometimes you might want to see the histogram _anyway_, so 
the force (-F) option is available for that.
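
In sketch form, the decision looks roughly like this.  The function names 
and the two cutoff values are placeholders chosen for illustration, not 
bogofilter's real defaults or code.

```python
def classify(spamicity: float,
             ham_cutoff: float = 0.45,    # placeholder value, an assumption
             spam_cutoff: float = 0.99    # placeholder value, an assumption
             ) -> str:
    """Tri-state classification: Spam, Ham, or Unsure in between."""
    if spamicity >= spam_cutoff:
        return "Spam"
    if spamicity <= ham_cutoff:
        return "Ham"
    return "Unsure"

def show_histogram(classification: str, force: bool = False) -> bool:
    """Show the histogram for Unsure messages, or whenever forced (-F)."""
    return force or classification == "Unsure"

for score in (0.02, 0.70, 0.999):
    result = classify(score)
    print(score, result, show_histogram(result), show_histogram(result, force=True))
```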


>I am using bogofilter 0.12.2 from Debian GNU/Linux.
>
>Andrew

Keep the feedback coming.  Now, I've got to make some changes.

Hope this has helped :-)

David




