invalid html warfare

John McCain jmccain at layer3al.com
Wed May 28 06:53:15 CEST 2003


imho, the days of using code-level html to identify spam are gone.  I 
think the only way statistical filters are going to stay effective is if 
they see the message exactly as a human would.  I've seen a great deal of 
statistical filter evasion along the lines of the examples I cited.  The 
best-case scenario right now seems to be that the filter still catches the 
message, but our databases gradually degrade with garbage tokens such as 
<gmurfoophead>, assuming we are maintaining the training database.
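
To make "see the message as a human would" concrete, here is a rough sketch in Python (not bogofilter code; the class name, function name and sample string are made up for illustration). It keeps only the text a renderer would display, so an invented tag like <gmurfoophead> never reaches the tokenizer and the word it was splitting is rejoined.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect character data only; tags and comments are dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0  # depth inside <script>/<style>, which are never displayed

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def visible_text(html):
    """Return roughly what a human reader would see (hypothetical helper)."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


# Hypothetical sample: a bogus tag and a comment used to break up a word.
sample = "Buy V<gmurfoophead>iagra<!-- xyz --> now"
print(visible_text(sample).split())
```

This ignores layout tricks (invisible fonts, zero-width tables and so on), so it is only a first approximation of what a rendering MUA shows.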

I am beginning to wonder how practical it would be to ban html e-mail for my 
domain.

On Tuesday 27 May 2003 09:06 pm, David Relson wrote:
> At 09:38 PM 5/27/03, Peter Jaques wrote:
> >silly me, the message wasn't text/html...
>
> Peter,
>
> That would certainly explain it :-)  The possibility had crossed my
> mind.  Do you use text MUA or one that will render html?  If the latter,
> did it treat the unlabeled html as html?
>
> In text mode, it would be possible for bogofilter to detect various html
> tags and switch to text/html mode.  Of course just because it could be done
> doesn't mean it should be done.
>
> Any thoughts?
>
> David
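
The detection David describes might look something like the sketch below. It is written in Python for brevity and is not how bogofilter does it; the tag list, the threshold and the function name are all assumptions picked for illustration.

```python
import re

# Assumed short list of tags that rarely show up in genuine plain text.
KNOWN_TAGS = ("html", "body", "table", "tr", "td", "font", "a", "img",
              "br", "p", "div", "span", "b", "i", "center")

# Match an opening or closing tag whose name is one of KNOWN_TAGS.
TAG_RE = re.compile(r"</?(?:%s)\b[^<>]*>" % "|".join(KNOWN_TAGS),
                    re.IGNORECASE)


def looks_like_html(body, min_tags=3):
    """Guess whether an unlabeled body is really html.

    min_tags is an arbitrary threshold so that a stray <b> or two in a
    plain-text message does not flip the parser into text/html mode.
    """
    count = sum(1 for _ in TAG_RE.finditer(body))
    return count >= min_tags


print(looks_like_html("<html><body><p>hi</p></body></html>"))
print(looks_like_html("if a < b and b > c then reply"))
```

Only known tag names are counted, so invented tags such as <gmurfoophead> would not trigger the switch by themselves; whether that is the right trade-off is exactly the "should it be done" question.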