token pairs [was: Algorithm limitations]

Tom Anderson tanderso at oac-design.com
Wed Apr 14 14:15:30 CEST 2004


On Wed, 2004-04-14 at 07:39, David Relson wrote:
> True.  The odds are low, but it can happen.  One could use a leading
> colon or a pair of colons, as in ":token:pair" and "token::pair", to
> avoid the problem.  As there are only a few tags in use, I think the
> risk of a collision is pretty low.  Also, since most tokens have fairly
> neutral scores, the chance of getting an incorrect result because of a
> collision is even smaller.

The odds would be low if emails were just random collections of
characters, but they are not.  Once spammers discover this loophole,
they will craft their message bodies specifically so that bogofilter
misclassifies a token pair as an important header tag.  Either of the
methods you suggested, or "+" or some other separator character, would
be preferable because it closes the loophole.
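
To make the collision concrete, here is a rough Python sketch.  It is
not bogofilter's actual tokenizer: the "tag:word" header format, the
tag name "subj", and the helper names are all assumptions made for
illustration, and it assumes individual words never contain ":" or "+".

```python
# Hypothetical sketch, not bogofilter code. Assumes header tokens are
# stored as "tag:word" and that plain words never contain ":" or "+".

def header_token(tag, word):
    """Build a tagged header token, e.g. "subj:free" (assumed format)."""
    return f"{tag}:{word}"

def body_pairs(words, sep):
    """Join each adjacent pair of body words with the given separator."""
    return [f"{a}{sep}{b}" for a, b in zip(words, words[1:])]

# A spammer deliberately puts the tag name in the body text.
body = "subj free offer".split()
header = header_token("subj", "free")

colliding = body_pairs(body, ":")  # pairs share the header separator
safe = body_pairs(body, "+")       # pairs use their own separator

print(header in colliding)  # True: body pair looks like a header token
print(header in safe)       # False: the two namespaces stay apart
```

With ":" as the pair separator, the body pair is indistinguishable from
the header token, so the spammer controls its score.  Any separator
that cannot appear in a header tag removes that overlap.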

Tom


