Testing if it works

Tom Allison tallison at tacocat.net
Sat Jul 17 14:02:18 CEST 2004


Barsalou wrote:
> After reading the man page again, I started thinking that the following
> command would help me identify good -o values:
> 
> cat /home/mike/spamfile | bogofilter -e -p -M -o 0.8,0.2 | grep -e ogosity
>

That command-line option can easily be replaced by a setting in a 
bogofilter configuration file (bogofilter.cf).  This is generally 
preferred since it simplifies the command line.

More on bogofilter.cf is in the man pages, though the file itself is 
self-documenting.
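
As a rough sketch of what that looks like, the following writes a 
two-line config carrying the same cutoffs as -o 0.8,0.2.  The file name 
bogofilter.cf.sample is just a placeholder I made up, and I'm assuming 
the spam_cutoff/ham_cutoff names from the stock example config, so 
check them against your own bogofilter.cf before relying on this.

```shell
# Hypothetical file name; your real config location may differ.
cf=./bogofilter.cf.sample

cat > "$cf" <<'EOF'
# Same thresholds as "-o 0.8,0.2" (assumed option names, see bogofilter.cf)
spam_cutoff=0.8
ham_cutoff=0.2
EOF

# Something like this should then behave like the -o version:
#   bogofilter -c "$cf" -e -p -M < /home/mike/spamfile | grep -e ogosity
```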

> If an entry comes back as Unsure, then my values need to be changed. 
> This assumes that all the mail in /home/mike/spamfile is in fact spam.
> 
> I could do the reverse for ham.

You may lose your mind doing this.

Bogofilter has to learn about spam, and that will probably take a 
minimum of 100 messages each of ham and spam before it starts to catch 
even the most basic spam with any regularity.

It's a process of continuous evolution, but once you reach 2,000 each 
of ham and spam the changes get much slower, almost to the point of 
zero maintenance.

I would suggest starting with something like 0.8/0.2 and leaving it 
there for a month.  If your Unsure messages consistently turn out to be 
only spam or only ham, then you can review the scores and adjust 
accordingly.  But go slowly.
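
To make that review less tedious, here is a minimal sketch that tallies 
the verdicts in passthrough output.  It assumes the header lines look 
like "X-Bogosity: Unsure, tests=bogofilter, ..." with the verdict as the 
first word after the header name.  tally_bogosity is a name I made up; 
it is not part of bogofilter.

```shell
# Count Spam/Ham/Unsure verdicts from X-Bogosity header lines on stdin.
# Assumes the verdict is the first field after "X-Bogosity:".
tally_bogosity() {
    awk '
        /^X-Bogosity:/ {
            verdict = $2
            sub(/,$/, "", verdict)      # strip the trailing comma
            count[verdict]++
        }
        END {
            printf "Spam=%d Ham=%d Unsure=%d\n",
                   count["Spam"], count["Ham"], count["Unsure"]
        }
    '
}

# Possible use against a known-spam mbox:
#   bogofilter -e -p -M < /home/mike/spamfile | tally_bogosity
```

If the Unsure count stays high on a file you know is all spam, that is 
the hint to look at the scores before touching the cutoffs.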

bogotune is supposed to automate a lot of this for you so all you have 
to do is fire it off and go do something else for a bit.

> I also believe that this would not reclassify any the spam...because
> there is the missing -s, -n, or -u.
> 
> Is this even close to right?

Close.

You need to do something in order for bogofilter to learn and store the 
words it is seeing.

-u will do this for you automatically on every email it reads, storing 
the words according to its own guess of ham/spam, but you'll have to 
correct any wrong guesses with the -Ns / -nS options.

-s/-n will do this for you on the assumption that you already know what 
the email message is (spam/ham) and store the words accordingly.
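
Roughly, the training and correction steps could be wrapped like this. 
The function names and the BOGOFILTER variable are my own invention for 
the sketch; only the -s, -n, -Ns and -nS flags come from bogofilter, and 
each function expects a file holding a single message.

```shell
# Allow the binary to be overridden; defaults to whatever is on PATH.
BOGOFILTER="${BOGOFILTER:-bogofilter}"

# Register a message you already know is spam or ham.
train_spam() { "$BOGOFILTER" -s < "$1"; }
train_ham()  { "$BOGOFILTER" -n < "$1"; }

# Correct a wrong guess made under -u:
# -Ns removes the message from ham and registers it as spam,
# -nS registers it as ham and removes it from spam.
fix_to_spam() { "$BOGOFILTER" -Ns < "$1"; }
fix_to_ham()  { "$BOGOFILTER" -nS < "$1"; }

# Possible use (hypothetical file name):
#   fix_to_spam ./missed-spam.msg
```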

You could replace your command line with:

       bogofilter -Mv < /home/mike/spamfile

and get (approximately) the same results.
(Approximately, because I don't have any mbox files to test with.)


