Bogofilter and reclassifying

Nathaniel nate at nate37.net
Fri Dec 5 13:31:42 CET 2003


Thanks for the replies (both on and off the list).

Attached is a shell script I came up with that reclassifies only the mail that 
must be reclassified.  The one real caveat is that it can leave the X-Bogosity 
header out of sync with what the message would actually score (the headers 
are already out of sync anyway, though).  Setting the header this way keeps 
the script from reclassifying a message multiple times, and it also makes 
train to exhaustion easy to implement, if wanted.  Train to exhaustion should 
probably be a separate script anyway, since things would be incorrectly 
weighted if only reclassified messages were trained to exhaustion.

Please let me know of any problems/ideas/suggestions.

-Nathaniel


On Friday 05 December 2003 01:44 am, Nathaniel wrote:
> Hello,
>
> I'm just wondering what is the "correct"/best way to reclassify a message.
>
> Currently I have a script that will grep through spam/ham dirs for an
> incorrect X-Bogosity header.  It then rescores it, checks to see if it is
> still incorrect and if so, it reclassifies it as spam/ham (if it's the first
> pass it will -N or -S appropriately) until it correctly scores.  It does this
> all while overwriting the file with the new X-Bogosity header.
>
> Is this correct?  Most howtos I've seen either recreate a wordlist or just
> mark as spam/ham the entire corpus, but I didn't want to maintain large
> corpuses and wanted something fairly efficient...
>
> Thanks.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: bogofilter_train.sh
Type: application/x-shellscript
Size: 889 bytes
Desc: not available
URL: <http://www.bogofilter.org/pipermail/bogofilter/attachments/20031205/056b9d51/attachment.bin>

