[bogofilter] using block_on_subnets
Tom Allison
tallison at tacocat.net
Fri Apr 30 12:39:27 CEST 2004
David Relson wrote:
> On 29 Apr 2004 07:54:23 -0400
> Tom Anderson wrote:
>
>
>>On Thu, 2004-04-29 at 07:36, David Relson wrote:
>>
>>>"Any" covers a lot of territory:-) Doing all the work in bogoutil
>>>would require reading the whole wordlist and applying the wildcard.
>>
>>Would it be difficult to insert a regular expression at the datastore
>>level? I browsed through the code, but don't feel qualified to try to
>>patch something in myself.
>>
>>
>>>A bit more complex, but using the command line's capabilities, is to
>>>use "bogoutil -d | egrep | awk print $1 | bogoutil -p".
>>
>>Ok, at which point is the wildcard token declared here? I'm guessing
>>as an argument to egrep, but I ended up with a broken pipe :(
>>
>>Tom
>
>
> Tom,
>
> I don't know if BerkeleyDB has builtin support for wildcarding... My
> guess would be that it doesn't, but I might be wrong.
>
> If memory serves, the command sequence is:
>
> bogoutil -d $path/wordlist.db \
> | egrep "expr" \
> | awk '{print $1}' \
> | bogoutil -p $path/wordlist.db
>
> Enjoy!
>
> David
>
I was playing a bit with this today.
I'm amazed at how many URL entries are just invalid IP addresses.
url:0         66  252  0.318316
url:0.0        2   25  0.125084
url:0.0.0      0   25  0.000370
url:0.0.0.0    0   25  0.000370
url:0.0.160    1    0  0.991605
I'm pretty certain that these are all invalid URLs.
I'm just surprised at how many of them are also "good".
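
For anyone who wants to pick these out of their own wordlist, here is a rough sketch of a filter that could stand in for the egrep/awk stages of the pipeline quoted above. It assumes whitespace-separated lines with the token in the first column, as in the listing above. The function name numeric_url_tokens is made up for illustration, and the pattern is only my guess at what counts as a "bare numeric" url: token (digits and dots, nothing else).

```shell
# Hypothetical helper: print only the url: tokens that consist
# solely of digits and dots (e.g. url:0, url:0.0.160).
# Reads whitespace-separated lines on stdin, token in column 1.
numeric_url_tokens() {
    awk '$1 ~ /^url:[0-9]+(\.[0-9]+)*$/ { print $1 }'
}
```

It should slot in between the two bogoutil calls from David's message, along the lines of "bogoutil -d $path/wordlist.db | numeric_url_tokens | bogoutil -p $path/wordlist.db". Tokens such as url:example.com, or anything without the url: prefix, are left out.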