mass processing with mutt and Fcc
Jesse Meyer
meyer at btinet.net
Wed Apr 2 18:49:47 CEST 2003
On Tue, Apr 01, 2003 at 10:23:32PM +0200, Boris 'pi' Piwinger wrote:
> David Relson <relson at osagesoftware.com> wrote:
>
> >Actually it takes extra work to recognize html tags (and comments) and
> >throw them away. When processing normal text, pretty much all that's kept
> >is letters and digits and a few special characters like period, hyphen,
> >underscore, apostrophe, etc. It's trivially easy to apply the normal text
> >mode to html.
>
> The problem is that we need HTML processing to avoid the
> spammers' tricks with tags in the middle of words. So it
> would be nice to do that and also evaluate the content of
> the tags.
Wouldn't it be rather easy (although probably not very elegant) to
write a short script that runs any HTML message through lynx -dump
first, hands the result to bogofilter to analyse, and, if that
succeeds, passes the original message through?
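
A minimal sketch of such a wrapper, in Python, might look like the
following. It is only an illustration under several assumptions: the
function names (lynx_dump, run_bogofilter, text_for_filter,
check_message) are made up for this example, lynx is assumed to accept
HTML on standard input via -stdin and -force_html, bogofilter is assumed
to read the text on standard input and report its verdict through its
exit status, and "if that succeeds" is read as "bogofilter returned a
verdict", so the original message is passed through unchanged alongside
that status. Only the Subject header and the text parts are handed to
the filter here, which is a simplification.

```python
import subprocess
import sys
from email import message_from_bytes, policy


def lynx_dump(html):
    """Render HTML to plain text via `lynx -dump` (lynx must be installed)."""
    proc = subprocess.run(
        ["lynx", "-dump", "-stdin", "-force_html"],
        input=html.encode("utf-8", "replace"),
        stdout=subprocess.PIPE,
        check=True,
    )
    return proc.stdout.decode("utf-8", "replace")


def run_bogofilter(text):
    """Feed text to bogofilter on stdin and return its exit status."""
    proc = subprocess.run(["bogofilter"], input=text.encode("utf-8", "replace"))
    return proc.returncode


def text_for_filter(raw, render=lynx_dump):
    """Build the text to classify: Subject line plus all text parts,
    with every text/html part replaced by its rendered form."""
    msg = message_from_bytes(raw, policy=policy.default)
    chunks = [f"Subject: {msg.get('Subject', '')}", ""]
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype == "text/html":
            chunks.append(render(part.get_content()))
        elif ctype == "text/plain":
            chunks.append(part.get_content())
    return "\n".join(chunks)


def check_message(raw, render=lynx_dump, classify=run_bogofilter):
    """Classify the rendered text; return the untouched original message
    together with the classifier's exit status."""
    status = classify(text_for_filter(raw, render))
    return raw, status


def main():
    """Read a message on stdin, write it unchanged to stdout, and
    return the classifier's status for use as the exit code."""
    raw = sys.stdin.buffer.read()
    original, status = check_message(raw)
    sys.stdout.buffer.write(original)
    return status
```

To use it as a filter, one would end the script with sys.exit(main())
and pipe each message through it. The render and classify parameters
exist only so the logic can be tried without lynx or bogofilter
installed.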
--
...crying "Tekeli-li! Tekeli-li!"... ~ HPL
icq : 34583382              | === ascii ribbon campaign ===
msn : dasunt at hotmail.com | ()  - against html mail
yim : tsunad                | /\  - against proprietary attachments
More information about the Bogofilter mailing list