Tricky
Eric Wood
eric at interplas.com
Sun Feb 1 04:02:12 CET 2004
From: "David Relson"
> It uses fonts to add a few
> letters in the middle of a word thus mucking with dictionary lookups.
I run the procmail recipes below before passing the message to bogofilter,
just in case I can chunk the email ahead of time.
-eric wood
# Strip useless tabs and HTML comments, which are used to split up words
:0 HB
* < 20000
* text/html
{
:0fwb
| expand | sed -e :a -e 's/<!--[^-]*-->//g;/</N;//ba'
}
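
To see what the sed filter in the recipe above does, here is a small shell
sketch. It assumes GNU sed; the function name strip_html_comments and the
sample inputs are made up for illustration.

```shell
# Same pipeline as the recipe's filter, wrapped in a function for testing.
strip_html_comments() {
    expand | sed -e :a -e 's/<!--[^-]*-->//g;/</N;//ba'
}

# A comment dropped into the middle of a word, all on one line:
printf 'Vi<!-- x -->agra\n' | strip_html_comments

# The same trick with the comment split across two lines:
printf 'Vi<!-- x\ny -->agra\n' | strip_html_comments
```

Note that [^-]* stops at the first hyphen, so a comment with a hyphen inside
it (for example <!-- b-c -->) is left in place by this expression.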
# Use lynx to strip the HTML, then search the text for /etc/vmail/spam_words entries
:0 HB
* !^From:.*spam@intgrp\.com
* !^X-Loop:.*adult-trap
* !^TOspam@intgrp\.com
* < 20000
* text/html
* ? lynx -dump -stdin | grep -i -f /etc/vmail/spam_words
{
:0 fwh
| formail -A"X-Loop: spam-trap lynx"
:0
! spam@intgrp.com
}
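
The "* ?" condition in the recipe above succeeds when the command exits with
status 0, which for grep means at least one line matched. A rough sketch of
just the word-list check, using a temporary word list in place of
/etc/vmail/spam_words and plain text in place of real lynx output (the
has_spam_word function and the sample words are hypothetical):

```shell
# Hypothetical stand-in for /etc/vmail/spam_words: one pattern per line.
words=$(mktemp)
trap 'rm -f "$words"' EXIT
printf 'viagra\nmortgage\n' > "$words"

# Exit status 0 when any line on stdin contains a listed word, ignoring case.
# -q suppresses output, since only the exit status matters for the condition.
has_spam_word() {
    grep -i -q -f "$words"
}

if printf 'Cheap VIAGRA today\n' | has_spam_word; then
    echo "match: the recipe's action block would run"
fi

if ! printf 'Lunch at noon?\n' | has_spam_word; then
    echo "no match: the recipe is skipped"
fi
```

One thing to watch for: an empty line in the word file is an empty pattern,
and grep matches it against every line, so every message would be caught.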