multiple types of spam
Matt Garretson
mattg at assembly.state.ny.us
Thu Jul 3 20:03:52 CEST 2003
> "419 mail" ??? I'm not familiar with that label.
That's the common general name for the Nigerian-type scams.
http://home.rica.net/alphae/419coal/
> I haven't been keeping close track on Nigerian scam messages. I know that
> bogofilter is catching some and missing some. It _would_ make an
> interesting experiment to create wordlists to see if that's detectable.
At my site, these spams get by bogofilter more often than any other
type. In an effort to remedy this, I have started training bogofilter
on *every* single copy of these spams (as opposed to other types of
spam, for which I try to weed out most duplicates before training).
I'm hoping this repetition will skew the statistics appropriately.
I think this is sort of similar to what Pi and Elijah are discussing
in another thread.
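
For anyone wanting to try the same thing, here is a rough sketch of the selection step: keep every copy of a 419 message, but only the first copy of any other spam. This is not part of bogofilter. The function names, the `is_419` callback, and the idea of treating messages with identical bodies as duplicates are all assumptions made for illustration.

```python
import hashlib


def body_fingerprint(raw_message: str) -> str:
    """Hash only the body (text after the first blank line), so copies that
    differ just in their headers count as duplicates. (Assumed definition of
    "duplicate"; a real setup may need something fuzzier.)"""
    normalized = raw_message.replace("\r\n", "\n")
    _headers, separator, body = normalized.partition("\n\n")
    if not separator:
        # No header/body separator found: fall back to the whole message.
        body = normalized
    return hashlib.sha256(body.strip().encode("utf-8")).hexdigest()


def select_for_training(messages, is_419):
    """Return the messages to train on, in their original order.

    messages: iterable of raw message strings.
    is_419:   hypothetical callback that returns True for a 419 spam.
    """
    seen = set()
    selected = []
    for message in messages:
        if is_419(message):
            # Deliberately keep every copy so repetition skews the counts.
            selected.append(message)
            continue
        fingerprint = body_fingerprint(message)
        if fingerprint not in seen:
            seen.add(fingerprint)
            selected.append(message)
    return selected
```

Each selected message could then be registered as spam, for example by piping it to `bogofilter -s`.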
Also, I now keep all the 419 spams in a separate training corpus,
just in case a use for a separate wordlist ever comes up.
BTW, the "brute force" 419 training does seem to be helping somewhat.
-Matt