mass processing with mutt and Fcc
David Relson
relson at osagesoftware.com
Wed Apr 2 19:28:21 CEST 2003
At 12:01 PM 4/2/03, Mark Stosberg wrote:
>On Wed, Apr 02, 2003 at 10:49:47AM -0600, Jesse Meyer wrote:
> >
> > Wouldn't it be rather easy (although probably not very elegant) to
> > make a short script that runs any html message through lynx -dump
> > first, then gives it to bogofilter to analyse, and, if that succeeds,
> > then passing the original message through?
>
>Makes sense to me.
>
> Mark
Mark,
Don't forget that bogofilter already removes much (if not all) of the
HTML from a message. A quick test indicates that there's little
difference in spam score with or without lynx.
Here's a sample test:
f="msg.d/msg.rc.0123.04.txt" ; bogofilter -v < "$f" ; lynx -dump "$f" | bogofilter -v
X-Bogosity: Yes, tests=bogofilter, spamicity=0.589417, version=0.11.1.6
X-Bogosity: Yes, tests=bogofilter, spamicity=0.583682, version=0.11.1.6
More generally, to test it I'd use:
for f in msg.html* ; do echo "$f" ; bogofilter -v < "$f" ; lynx -dump "$f" |
bogofilter -v ; done
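To make differences easier to spot, the same loop can be wrapped in a small
script that prints both scores and their difference on one line per message.
This is only a sketch: spamicity, compare_one, and the BOGOFILTER and LYNX
variables are hypothetical names chosen for illustration, and it assumes that
bogofilter -v prints an X-Bogosity line in the format shown above.

```shell
#!/bin/sh
# Sketch: compare bogofilter scores for raw vs. lynx-rendered messages.
# BOGOFILTER and LYNX are illustrative variables so that the commands
# can be swapped for stand-ins when trying the script out.
BOGOFILTER=${BOGOFILTER:-bogofilter}
LYNX=${LYNX:-lynx}

# Read bogofilter -v output on stdin and print the spamicity value
# from the first X-Bogosity line (assumes the format shown above).
spamicity() {
    sed -n 's/.*spamicity=\([0-9.]*\).*/\1/p' | head -n 1
}

# Print "file raw-score lynx-score difference" for one message file.
compare_one() {
    f=$1
    raw=$("$BOGOFILTER" -v < "$f" | spamicity)
    txt=$("$LYNX" -dump "$f" | "$BOGOFILTER" -v | spamicity)
    awk -v f="$f" -v a="$raw" -v b="$txt" \
        'BEGIN { printf "%s %s %s %.6f\n", f, a, b, a - b }'
}

# Process every file named on the command line.
for f in "$@"; do
    compare_one "$f"
done
```

Saved under a name such as cmpscore.sh (again hypothetical), it could be run
as: sh cmpscore.sh msg.html*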
If you do run tests and find any significant differences, let me know :-)
David