[cvs] bogofilter/contrib randomtrain,1.2,1.3

Matthias Andree matthias.andree at gmx.de
Sun Dec 8 19:04:52 CET 2002


relson at users.sourceforge.net writes:

> Change grep options to allow NUL bytes in the text body.
>
>
> Index: randomtrain
> ===================================================================
> RCS file: /cvsroot/bogofilter/bogofilter/contrib/randomtrain,v
> retrieving revision 1.2
> retrieving revision 1.3
> diff -u -d -r1.2 -r1.3
> --- randomtrain	4 Dec 2002 15:24:58 -0000	1.2
> +++ randomtrain	8 Dec 2002 17:18:54 -0000	1.3
> @@ -1,10 +1,12 @@
>  #! /bin/bash
>  #
> +# $Id$ #
> +#
>  #  randomtrain -- bogofilter messages from files in random order
>  #                 and train if the result is wrong or uncertain
>  #  needs:    bash basename rm grep awk wc perl dd bogofilter
>  #  usage:    see function usage() starting on line 10 of this file
> -#  version:  0.5 (Greg Louis <glouis at dynamicro.on.ca>)
> +#  author:   (Greg Louis <glouis at dynamicro.on.ca>)
>  
>  pid=$$
>  BOGOFILTER="../bogofilter"
> @@ -69,7 +71,7 @@
>      test "$indic" != "s" -a "$indic" != "n" && usage
>      file=$1 ; shift
>      if [ ! -r $file ]; then echo "$file not found"; usage; fi
> -    grep -b '^From ' $file | \
> +    grep -a -b '^From ' $file | \
>  	awk "BEGIN {FS=\":\"} {print \"$indic $file \"\$1}" >>list.$pid
>      wc -c $file | awk "{print \"$indic $file \"\$1}" >>list.$pid
>  done

This breaks on Solaris, so we need a different approach. I presume the
stock tools that ship with commercial unices are not up to the tasks we
expect of them: they expect text, we feed them binary, so this must
fail. I've already had much "fun" with sed and head on Solaris, which is
why I wrote tests/dumbhead.c -- I couldn't find a tool that would handle
\0 properly.

I fear we now need a dumbgrep...

-- 
Matthias Andree



