problem in training

Matthias Andree matthias.andree at gmx.de
Tue Sep 28 21:22:43 CEST 2004


Yair Zohar <yair at ard.huji.ac.il> writes:

> I've tried these:
> #> ./randomtrain -s ~/junk/Maildir/tmp/ -n ~/junk/Maildir/cur/
>
> where dir 'tmp' contains all the spam mails and dir 'cur' contains the 
> ham in a Maildir format.

Maildir doesn't work that way. You specify just ~/junk/Maildir/ and
bogofilter itself will then traverse cur/ and new/ for mail, as it
should for a Maildir. tmp/ only holds messages that are still being
delivered, so it is not a place to store mail.
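
To see which files that covers, here is a rough shell sketch of the
Maildir convention. The function name maildir_messages is made up for
illustration and this is not bogofilter's actual code; it only lists the
regular files under cur/ and new/ and leaves tmp/ alone.

```shell
# Hypothetical helper, not part of bogofilter: print the message files
# a Maildir-aware reader would look at, i.e. those in cur/ and new/.
maildir_messages() {
    for sub in cur new; do
        # Skip a subdirectory that does not exist.
        [ -d "$1/$sub" ] || continue
        find "$1/$sub" -type f
    done
}
```

Something like "maildir_messages ~/junk/Maildir | wc -l" should then give
a rough count of the messages available for training.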

> spam  reg   good  reg
>   96   12    360    8
>
> (I don't know what reg means)

I don't know how randomtrain works beyond its help message, but...

> I checked some of the unwanted mails by :
>
> #> cat 'unwanted mail that was in the dir tmp'  | bogofilter -vvv
>
> I saw bogofilter is not familiar with the words of this mail so I 
> realized the training didn't work as I expected.

...the description says that bogofilter is trained on a message only
if its classification was wrong or unsure, so not every message ends up
registered.

As you're also trying bogominitrain.pl, I'd suggest:

1. erase your database:  rm -f ~/.bogofilter/{*.db,__db*,log.*}
2. register spam: bogofilter -sMB ~/junk/Maildir/ 
3. register ham: bogofilter -nMB ~/Maildir/

Adjust paths as appropriate; you can also pass in mbox files if you have
any.
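
Since the mix-up above came from pointing at a subdirectory, a small
guard in front of steps 2 and 3 may help. This is only a sketch under
assumptions: is_maildir and register_maildirs are made-up names, and the
bogofilter options are exactly the ones used in the steps above.

```shell
# Hypothetical guard: succeeds if $1 has the cur/ and new/ subdirectories
# that a Maildir is expected to have.
is_maildir() {
    [ -d "$1/cur" ] && [ -d "$1/new" ]
}

# Hypothetical wrapper around steps 2 and 3.
# $1 = spam Maildir, $2 = ham Maildir.
register_maildirs() {
    is_maildir "$1" || { echo "not a Maildir: $1" >&2; return 1; }
    is_maildir "$2" || { echo "not a Maildir: $2" >&2; return 1; }
    bogofilter -sMB "$1" &&   # register spam
    bogofilter -nMB "$2"      # register ham
}
```

With this, "register_maildirs ~/junk/Maildir ~/Maildir" would run both
registrations, while passing ~/junk/Maildir/tmp/ is rejected before
bogofilter is called. The guard is for Maildirs only; mbox files would
have to be passed to bogofilter directly.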

-- 
Matthias Andree

Encrypted mail welcome: my GnuPG key ID is 0x052E7D95 (PGP/MIME preferred)


