mass processing with mutt and Fcc

David Relson relson at osagesoftware.com
Wed Apr 2 19:28:21 CEST 2003


At 12:01 PM 4/2/03, Mark Stosberg wrote:

>On Wed, Apr 02, 2003 at 10:49:47AM -0600, Jesse Meyer wrote:
> >
> > Wouldn't it be rather easy (although probably not very elegant) to
> > make a short script that runs any html message through lynx -dump
> > first, then gives it to bogofilter to analyse, and, if that succeeds,
> > then passing the original message through?
>
>Makes sense to me.
>
>         Mark

Mark,

Don't forget that bogofilter already removes much (if not all) of the 
html from a message.  A quick test indicates that there's little 
difference in spam score with or without lynx.

Here's a sample test:

f="msg.d/msg.rc.0123.04.txt" ; bogofilter -v < $f ; lynx -dump $f | bogofilter -v
X-Bogosity: Yes, tests=bogofilter, spamicity=0.589417, version=0.11.1.6
X-Bogosity: Yes, tests=bogofilter, spamicity=0.583682, version=0.11.1.6

More generally, to test it I'd use:

for f in msg.html* ; do echo "$f" ; bogofilter -v < "$f" ; lynx -dump "$f" | bogofilter -v ; done
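
If eyeballing pairs of X-Bogosity lines gets tedious, something like the
following might help. It is only a rough sketch: the function names
(spamicity, score_diff, compare_msg) are made up for illustration, and it
assumes that "bogofilter -v" prints an X-Bogosity line in the format shown
above and that lynx is on the PATH.

```shell
#!/bin/sh
# Hypothetical helpers, not part of bogofilter itself.

# Read X-Bogosity line(s) on stdin and print just the spamicity value.
spamicity() {
    sed -n 's/.*spamicity=\([0-9.]*\).*/\1/p'
}

# Print the absolute difference between two scores, to six decimals.
score_diff() {
    awk -v a="$1" -v b="$2" \
        'BEGIN { d = a - b; if (d < 0) d = -d; printf "%.6f\n", d }'
}

# Score one message raw and after "lynx -dump", then print:
#   file  raw-score  lynx-score  difference
# Assumes bogofilter and lynx behave as in the sample test above.
compare_msg() {
    raw=$(bogofilter -v < "$1" | spamicity)
    dumped=$(lynx -dump "$1" | bogofilter -v | spamicity)
    echo "$1 $raw $dumped $(score_diff "$raw" "$dumped")"
}
```

With those defined, the loop becomes:

for f in msg.html* ; do compare_msg "$f" ; done

For the sample message above, score_diff 0.589417 0.583682 prints 0.005735.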

If you do run tests and find any significant differences, let me know :-)

David




