multiple types of spam

Matt Garretson mattg at assembly.state.ny.us
Thu Jul 3 20:03:52 CEST 2003


> "419 mail" ???  I'm not familiar with that label.


That's the common general name for the Nigerian-type scams.
http://home.rica.net/alphae/419coal/


> I haven't been keeping close track on Nigerian scam messages.  I know that 
> bogofilter is catching some and missing some.  It _would_ make an 
> interesting experiment to create wordlists to see if that's detectable.


At my site, these spams get past bogofilter more often than any other
type.  In an effort to remedy this, I have started training bogofilter
on *every* single copy of these spams (as opposed to other types of
spam, for which I try to weed out most duplicates before training).
I'm hoping this repetition will skew the statistics appropriately.
I think this is sort of similar to what Pi and Elijah are discussing
in another thread.
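
To illustrate why the repetition might help, here is a rough sketch
in Python. It is not bogofilter's actual code: it assumes a simplified
Robinson-style per-token estimate, and the names train() and
spam_prob(), the sample messages, and the values s=1.0 and x=0.5 are
all made up for the example.

```python
from collections import Counter


def train(messages):
    """Count, for each token, how many training messages contain it."""
    counts = Counter()
    for msg in messages:
        counts.update(set(msg.lower().split()))
    return counts


def spam_prob(token, spam_counts, ham_counts, n_spam, n_ham, s=1.0, x=0.5):
    """Simplified Robinson-style estimate that a token indicates spam.

    s is the weight given to the prior x (both are assumed values).
    """
    b = spam_counts[token] / n_spam if n_spam else 0.0  # spam frequency
    g = ham_counts[token] / n_ham if n_ham else 0.0     # ham frequency
    n = spam_counts[token] + ham_counts[token]          # total sightings
    p = b / (b + g) if (b + g) > 0 else x
    return (s * x + n * p) / (s + n)


# Hypothetical toy corpora.
ham = [
    "please confirm the bank transfer for the invoice",
    "meeting notes attached",
    "lunch on friday",
    "budget review on monday",
]
other_spam = ["cheap pills online", "win a free prize now"]
scam = "urgent bank transfer of funds from lagos"

ham_counts = train(ham)

# Case 1: duplicates weeded out, so the 419 message is trained once.
spam_once = other_spam + [scam]
deduped = spam_prob("transfer", train(spam_once), ham_counts,
                    len(spam_once), len(ham))

# Case 2: every one of five received copies is trained.
spam_all = other_spam + [scam] * 5
every_copy = spam_prob("transfer", train(spam_all), ham_counts,
                       len(spam_all), len(ham))

print(f"trained once:   {deduped:.3f}")
print(f"trained 5 times: {every_copy:.3f}")
```

In this toy setup the score for "transfer", a word that also appears
in legitimate mail, moves further toward spam when every copy is
trained. Real results will depend on the actual wordlists and on
bogofilter's own parameters.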

Also, I now keep all the 419 spams in a separate training corpus,
just in case a use for a separate wordlist ever comes up.

BTW, the "brute force" 419 training does seem to be helping somewhat.

-Matt




