Ignore lists
Greg McCann
greg at cambria.com
Fri Mar 5 04:56:50 CET 2004
On 3/4/2004 at 10:33 PM David Relson <relson at osagesoftware.com> wrote:
>With a properly formatted list, "egrep -v -f ignore.list" can be used
>for deleting tokens. Use a caret "^" at the beginning of each line and
>a space " " at the end of each line. That way only complete tokens will
>be matched and deleted.
I thought fgrep was necessary when using a wordlist and couldn't get that to work with regexps. But you're right - using egrep instead seems to do the trick.
I don't know about the "usability" factor of doing this as opposed to simply having an ignore.db, especially for new users, but it does seem to work.
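For anyone wondering why the caret and the trailing space both matter, here is a small sketch. It assumes "bogoutil -d" prints one "token spamcount goodcount date" record per line; the sample records and counts below are made up for illustration. It uses "grep -E", the current spelling of egrep. fgrep ("grep -F") treats each pattern as a fixed string, so a leading "^" would be taken literally, which is probably why the regexps did not work there.

```shell
# Made-up sample records in the layout "bogoutil -d" is assumed to print:
# token, spam count, good count, date.
dump='rcvd:Jan 12 199 20040304
rcvd:January 3 1 20040304
xrcvd:Jan 5 0 20040304'

# Caret only: anchored at the start of the line, but the longer token
# rcvd:January is deleted as well because it begins with rcvd:Jan.
caret_only=$(printf '%s\n' "$dump" | grep -E -v -e '^rcvd:Jan')

# Caret plus trailing space: only the complete token rcvd:Jan is deleted.
exact=$(printf '%s\n' "$dump" | grep -E -v -e '^rcvd:Jan ')

printf 'caret only:\n%s\n\n' "$caret_only"
printf 'caret and trailing space:\n%s\n' "$exact"
```

Tokens that contain regex metacharacters (a "." for instance) would also need escaping in the pattern file, since grep -E interprets them.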
# bogoutil -p ./ rcvd:Jan rcvd:Feb rcvd:Mar rcvd:Apr rcvd:May rcvd:Jun rcvd:Jul rcvd:Aug rcvd:Sep rcvd:Oct rcvd:Nov rcvd:Dec
           spam   good  Fisher
rcvd:Jan     12    199  0.011211
rcvd:Feb  14625    130  0.954782
rcvd:Mar   5428     26  0.975113
rcvd:Apr      0      0  0.415000
rcvd:May     18      0  0.999675
rcvd:Jun     96      0  0.999939
rcvd:Jul      3      2  0.220076
rcvd:Aug      0      0  0.415000
rcvd:Sep      0      0  0.415000
rcvd:Oct      7      3  0.304674
rcvd:Nov      0      0  0.415000
rcvd:Dec      7     99  0.013135
# cat ignore.txt
^rcvd:Jan
^rcvd:Feb
^rcvd:Mar
^rcvd:Apr
^rcvd:May
^rcvd:Jun
^rcvd:Jul
^rcvd:Aug
^rcvd:Sep
^rcvd:Oct
^rcvd:Nov
^rcvd:Dec
# cat bogorebuild_filter.sh
bogoutil -d wordlist.db | egrep -v -f ignore.txt | bogoutil -l wordlist.new.db
mv wordlist.new.db wordlist.db
chmod 666 wordlist.db
# ./bogorebuild_filter.sh
# bogoutil -p ./ rcvd:Jan rcvd:Feb rcvd:Mar rcvd:Apr rcvd:May rcvd:Jun rcvd:Jul rcvd:Aug rcvd:Sep rcvd:Oct rcvd:Nov rcvd:Dec
           spam   good  Fisher
rcvd:Jan      0      0  0.415000
rcvd:Feb      0      0  0.415000
rcvd:Mar      0      0  0.415000
rcvd:Apr      0      0  0.415000
rcvd:May      0      0  0.415000
rcvd:Jun      0      0  0.415000
rcvd:Jul      0      0  0.415000
rcvd:Aug      0      0  0.415000
rcvd:Sep      0      0  0.415000
rcvd:Oct      0      0  0.415000
rcvd:Nov      0      0  0.415000
rcvd:Dec      0      0  0.415000
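A slightly more cautious variant of the rebuild script might look like the sketch below. It is only a sketch: the file names ignore.tokens, wordlist.dump, wordlist.filtered and wordlist.db.bak and the function names make_patterns and rebuild_wordlist are hypothetical, and the only bogoutil calls used are the -d and -l ones from the transcript. The idea is to keep a plain list of tokens, generate the anchored patterns from it, and replace wordlist.db only if every step succeeded.

```shell
# make_patterns: read plain tokens (one per line) on stdin and write anchored
# grep -E patterns on stdout: regex metacharacters are backslash-escaped,
# a "^" is put in front and a single space is appended.
make_patterns() {
    sed -e 's/[][\.*^$+?(){}|]/\\&/g' -e 's/^/^/' -e 's/$/ /'
}

# rebuild_wordlist: dump wordlist.db, drop the ignored tokens, load the rest
# into a new database, and swap it in only if all earlier steps succeeded.
# ignore.tokens (hypothetical name) holds the plain token list.
rebuild_wordlist() {
    make_patterns < ignore.tokens > ignore.txt &&
    bogoutil -d wordlist.db > wordlist.dump &&
    # grep exits with status 1 when it outputs no lines; treat that as success.
    { grep -E -v -f ignore.txt wordlist.dump > wordlist.filtered || [ $? -eq 1 ]; } &&
    bogoutil -l wordlist.new.db < wordlist.filtered &&
    cp -p wordlist.db wordlist.db.bak &&
    mv wordlist.new.db wordlist.db
}

# Show the patterns generated for two sample tokens.
printf 'rcvd:Jan\nrcvd:Feb\n' | make_patterns
```

Running rebuild_wordlist in the directory that holds wordlist.db would then do the filtering; the chmod from the original script can be appended if the permissions need resetting.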
Greg