contributing datasets, was: Is bogotune helpful?

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Wed Dec 3 13:20:59 CET 2003


Greg Louis wrote:

>> I've got 40,000 spam and hams roughly 50/50. Let me know how I can
>> create and get the datasets to you.
> 
> I'm not sure if any message-count converter is supplied with bogofilter
> these days, but running a command of the form
> 
>     formail -s bogol dbdir <mboxfile >messagecountfile

[...]

Questions:

1) Is it important how training was done?

2) Do you need those messagecountfiles separated for training
and testing? (That would mean a new training run is needed
on only a part of the messages.)

3) Is it OK to do that in chunks of 5,000 messages or do you
want them all together?

pi



