randomtrain vs bogotrain.pl
David Relson
relson at osagesoftware.com
Fri Jul 4 16:30:13 CEST 2003
At 09:36 AM 7/4/03, Boris 'pi' Piwinger wrote:
>David Relson wrote:
>
> > Today I've run another test set with 75 Nigerian spam and 585 non-spam.
> > As test 1, I put _all_ the messages in spamlist and goodlist. Test 2 used
> > randomtrain. Test 3 used bogotrain.pl. Each of the scripts was run 4
> > times and "extinction" was achieved.
>
>Using -f you only need to start my script once.
>
> > After training, I scored 101 spam and
> > 616 non-spam.
> >
> >                   training            scoring
> >                 spam   good      spam          good
> > bogofilter       75    585    101 (100%)    616 (100%)
> > randomtrain      24     27    101 (100%)    616 (100%)
> > bogotrain.pl      7      4     94 ( 93%)    605 ( 98%)
> >
> > While these results are in no way definitive, they indicate that the very
> > small wordlists produced by bogotrain.pl are inadequate.
>
>That might be a consequence of your very small and special
>training set:
As I said, the messages used in scoring were different from the messages used
in training. With all training messages going into the wordlists (the
"bogofilter" experiment), the scoring was 100% correct. With randomtrain's
51 messages in the wordlists (the "randomtrain" experiment), the scoring
was 100% correct. With bogotrain.pl's 11 messages, the scoring was still
good (93% and 98%) but not perfect.
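
For anyone who wants to re-check the percentages, here is a small sketch that recomputes them from the counts in the table. The dictionary layout and the function name are made up for illustration; only the counts come from the test.

```python
# Counts copied from the scoring columns of the table: (correct, total).
# The data layout and names here are illustrative, not part of any script.
scoring = {
    "bogofilter":   {"spam": (101, 101), "good": (616, 616)},
    "randomtrain":  {"spam": (101, 101), "good": (616, 616)},
    "bogotrain.pl": {"spam": (94, 101),  "good": (605, 616)},
}

def percent_correct(correct, total):
    """Return correctly scored messages as a whole percent (fraction dropped)."""
    return 100 * correct // total

for name, columns in scoring.items():
    spam_pct = percent_correct(*columns["spam"])
    good_pct = percent_correct(*columns["good"])
    print(f"{name:12s} spam {spam_pct:3d}%  good {good_pct:3d}%")
```

The counts 94 of 101 and 605 of 616 are what stand behind the two bogotrain.pl figures.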
> > 1) The spam used in the scoring is the usual UCE spam, not Nigerian
> > spam. They should all be scored as ham.
>
>Actually, most were scored as spam above. So all fail badly;-)
Sorry, I didn't explain well. The "spam" and "good" columns under scoring
indicate the origins of the messages. The scores are how many were
correctly scored. Since there was no Nigerian spam in the messages in the
"spam" and "good" columns, all messages should have scored as ham. The
very small wordlists generated by bogotrain.pl did not do as well in this
test as did the other (larger) wordlists.
In conclusion: while bogotrain.pl creates very small wordlists to distinguish
Nigerian spam from normal ham, those small wordlists didn't have enough
information for classifying other messages. The larger wordlists from the
other two tests did better.
> > 2) I expected to see some incorrect classifications in all the tests and
> > am surprised that 2 tests didn't have any.
>
>That indeed looks like an error.
No incorrect classifications is exceptionally good. It's not an error.
>But most important I think is that your training set is way
>too small. I suggest the following experiment:
The three tests used identical data sets and gave different
results. That's all I was trying to show.
>Take all your training sets (probably several thousand each).
>Create three databases as above. Run your real mail through
>them for some days. See where they disagree. The server load
>should not be too bad with three calls of bogofilter instead
>of one.
Sorry, but I don't have enough Nigerian spam to create such a large
training set.
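
For someone who does have enough mail, the comparison pi describes could be sketched roughly as below. This is only an outline under assumptions: that each of the three databases lives in its own directory, that bogofilter is called with -d to select that directory and reads the message on stdin, and that it exits with 0 for spam and 1 for non-spam. The function names and directory names are hypothetical.

```python
import subprocess

def classify_with_bogofilter(message_path, wordlist_dir):
    """Score one message against the wordlists in one directory.

    Assumption: bogofilter takes the wordlist directory via -d, reads the
    message on stdin, and exits 0 for spam and 1 for non-spam. Any other
    exit status is reported as "other" rather than interpreted.
    """
    with open(message_path, "rb") as msg:
        status = subprocess.run(["bogofilter", "-d", wordlist_dir], stdin=msg).returncode
    return {0: "spam", 1: "ham"}.get(status, "other")

def find_disagreements(message_paths, wordlist_dirs, classify=classify_with_bogofilter):
    """Return {message: {database name: verdict}} for messages the databases split on."""
    disagreements = {}
    for path in message_paths:
        verdicts = {name: classify(path, d) for name, d in wordlist_dirs.items()}
        if len(set(verdicts.values())) > 1:
            disagreements[path] = verdicts
    return disagreements
```

Calling find_disagreements with a list of message files and a mapping such as {"full": "db-full", "randomtrain": "db-random", "bogotrain": "db-bogo"} (hypothetical directory names) would list only the messages on which the three databases give different answers.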
>pi
>
>PS: I'm about to leave for a week.
Have a good week! I'll be gone as well - the Relson family heads out of
town tomorrow for 7 days in a cabin on Paradise Lake, which is at the
northern tip of Michigan's lower peninsula. We'll talk more when I get back.
David