Tricky

Eric Wood eric at interplas.com
Sun Feb 1 04:02:12 CET 2004


From: "David Relson"
> It uses fonts to add a few
> letters in the middle of a word thus mucking with dictionary lookups.

I run these procmail recipes before passing the message to bogofilter, just in
case I can chuck the email ahead of time.
-eric wood

# Expand tabs and strip HTML comments, which are used to split up words
:0 HB
* < 20000
* text/html
{
  :0fwb
  | expand | sed -e :a -e 's/<!--[^-]*-->//g;/</N;//ba'
}
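
As a rough illustration of what the sed expression in the recipe above does,
here is a small shell sketch. The sample text is made up for the example, and
it assumes GNU sed, where a negated bracket expression also matches the newline
that N appends to the pattern space.

```shell
# Hypothetical sample: one word split by an HTML comment that spans two lines.
# The sed expression is copied from the recipe; the sample text is an assumption.
cleaned=$(printf 'Vi<!-- filler\ntext -->agra\n' \
  | expand \
  | sed -e :a -e 's/<!--[^-]*-->//g;/</N;//ba')

# The comment is removed and the two halves of the word are joined.
printf '%s\n' "$cleaned"
```

Note that because the pattern is [^-]*, a comment whose body contains a hyphen
is left in place.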

# Use lynx to render the HTML as plain text, then search that text for any
# pattern listed in /etc/vmail/spam_words
:0 HB
* !^From:.*spam@intgrp\.com
* !^X-Loop:.*adult-trap
* !^TOspam@intgrp\.com
* < 20000
* text/html
* ? lynx -dump -stdin | grep -i -f /etc/vmail/spam_words
{
  :0 fwh
  | formail -A"X-Loop: spam-trap lynx"
  :0
  ! spam@intgrp.com
}
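
The "?" condition above succeeds when the pipeline exits with status 0, that
is, when grep finds at least one pattern from the word list. A minimal sketch
of that check outside procmail follows. The word list and message text are
invented for the example, and the lynx step is replaced by already-rendered
text so that it runs without lynx installed.

```shell
# Hypothetical word list standing in for /etc/vmail/spam_words (one pattern per line).
words=$(mktemp)
printf '%s\n' 'viagra' 'refinance' > "$words"

# Hypothetical text, standing in for what lynx -dump might produce from an HTML part.
rendered='Cheap VIAGRA shipped overnight'

# grep -i ignores case, -f reads patterns from the file, -q suppresses output.
if printf '%s\n' "$rendered" | grep -q -i -f "$words"; then
  verdict=match
else
  verdict=clean
fi
rm -f "$words"

printf '%s\n' "$verdict"
```

One thing to watch for: an empty line in the word list is an empty pattern,
which matches every line, so the list should not contain blank lines.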




