My wordlist doesn't detect spam very well anymore

Teemu Likonen tlikonen at iki.fi
Sun Feb 9 21:23:48 CET 2020


Jonathan Kamens [2020-02-09T14:42:57-05] wrote:

> With the numbers you mentioned in your .MSG_COUNT, I doubt "several 
> hundred" messages of either ham or spam is going to be enough to 
> generate accurate training.
>
> The rolling corpus of messages that I save to do my monthly training 
> currently has 16126 ham and 5519 spam messages in it.

My total number of saved messages is about 422000, but only about 100 of
them are spam. I keep spam for only 30 days. It will take years to
collect thousands of spam messages. Maybe bogotune is not useful for me
yet.

For now I will continue with my big wordlist and try some other settings
manually. I'll also start keeping more spam messages around so that I
have a bigger training set if I choose to start from scratch later.


-- 
///  OpenPGP key: 4E1055DC84E9DFF613D78557719D69D324539450
//  https://keys.openpgp.org/search?q=tlikonen@iki.fi
/  https://keybase.io/tlikonen  https://github.com/tlikonen

