Training frustration

Anne Wilson cannewilson at googlemail.com
Mon Feb 11 14:23:48 CET 2008


On Monday 11 February 2008 11:32, Pavel Kankovsky wrote:
> On Mon, 11 Feb 2008, Anne Wilson wrote:
> > Total   messages: 164
> >
> > Total        ham: 3
> > Misdetected  ham: 0
> >     retrain fail: 0
> >
> > Total       spam: 19
> > Misdetected spam: 92
> >     retrain fail: 92
>
> You've got 19 messages in the spam directory. This number is ok, according
> to the output of ls. The number of messages in the ham directory, i.e. 3,
> is probably ok as well. The other numbers are pretty suspicious. 92 failed
> messages out of 19? 164 total messages?
>
> It seems you removed old messages from the spam folder after training on
> them:
> > I have made sure that the folders were compacted after deleting old,
> > trained, messages.
>
> but did not tell trainbogo.sh to clean its stats dir ("stats.tmp" in the
> current working directory, probably your home dir, by default) and it
> still remembers them and makes hopeless attempts to recheck them every
> time you run it. Try running trainbogo.sh -c (if you really want to delete
> messages that have already been used for training).
>
> > ls -l /home/anne/Maildir/.INBOX.bogotrain_spam/cur/
> > total 160
> >
> > Total 160?
>
> 160 blocks occupied by files in the directory. It is probably ok.
>
That makes sense, and fits with what I suspected, but didn't know how to 
handle.

You seem surprised that I was cleaning out the 'trained' messages.  I thought 
it was a bad idea to keep running the same messages through the training.  Am 
I wrong?

I'll keep the -c parameter in the command in future.  What is the reason for 
it not being the default?

Anne


