speaking of random words

Tom Anderson tanderso at oac-design.com
Thu Mar 18 14:22:10 CET 2004


On Wed, 2004-03-17 at 10:02, Boris 'pi' Piwinger wrote:
> So the problem seems to be that your wordlist has very high
> inertia. It has probably seen so much that it no longer
> takes up new information (with significance). Kind of like
> information overload: you get tired after many hours of
> TV and someone suddenly asks you about details from a minute ago.

I really don't see this "high inertia", as you put it, as a problem.  In
fact, I like that my wordlist is "stable" and does not swing wildly
based on a single registration.  However, at the same time, it does
require a lot of nudging to get ham tokens back toward more neutral
territory.  What this means is that I can probably set my recursion max
for exhaustive training slightly higher without adverse effects.

The main reason I posted the spam was as an illustration of a spammer
who "got it right".  They chose a group of words which were
overwhelmingly hammy.  So far, this is my worst-case scenario for this
class of spam.  I think bogofilter will handle it, but only after
several registrations bring those ham tokens away from 0 a little bit
and fortify the spammy tokens closer to 1.
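
To put rough numbers on "several registrations", here is a minimal
sketch of the effect. It is not bogofilter's actual scoring code: it
assumes a plain relative-frequency estimate per token, and the function
names, counts, and the 0.1 target are all made up for illustration.

```python
# Illustrative only: a simplified per-token estimate, NOT bogofilter's code.
# All names and numbers below are hypothetical.

def token_spamicity(spam_count, ham_count, spam_msgs, ham_msgs):
    """Spam probability of one token from its relative frequencies."""
    spam_freq = spam_count / spam_msgs
    ham_freq = ham_count / ham_msgs
    return spam_freq / (spam_freq + ham_freq)

def registrations_to_reach(target, spam_count, ham_count, spam_msgs, ham_msgs):
    """Count how many spam registrations containing the token are needed
    before its estimate rises to at least `target` (assumes ham_count > 0)."""
    registrations = 0
    while token_spamicity(spam_count, ham_count, spam_msgs, ham_msgs) < target:
        # Each registration adds one spam message that contains the token.
        spam_count += 1
        spam_msgs += 1
        registrations += 1
    return registrations

# A hammy token in a large, well-trained wordlist (assumed counts):
# seen in 500 of 5000 hams and in none of 5000 spams.
large = registrations_to_reach(0.1, 0, 500, 5000, 5000)

# The same token with the same ham frequency in a small wordlist:
# seen in 5 of 50 hams and in none of 50 spams.
small = registrations_to_reach(0.1, 0, 5, 50, 50)

print("large wordlist:", large, "registrations")
print("small wordlist:", small, "registrations")
```

Under these assumptions the large wordlist needs many more registrations
than the small one to move the same token the same distance from 0,
which is the stability (or inertia) discussed above.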

Tom
