spam that bogofilter is not catching

David Relson relson at osagesoftware.com
Mon Mar 17 17:30:55 CET 2003


At 11:12 AM 3/17/03, Terry Todd wrote:

>I have been getting some nasty spam that bogofilter is not catching.
>It is a very short message from a different source every time but
>always has a url that points to the same IP address.  The url
>changes but the IP doesn't.  Does anyone have any ideas on how to
>deal with this type of thing?
>
>I could send some examples.  I have to warn you though they are bad.
>
>Thanks,
>Terry Todd

Hi Terry,

For bogofilter to classify a message, it needs to identify the tokens in the
message and match them against its wordlists.  Do the token counts for these
messages seem reasonable to you?  The easiest way to count the unique tokens
in a message is to run the command "bogolexer < message | sort -u | wc -l".

Do you have "block_on_subnets=yes" in your config file, i.e.
/etc/bogofilter.cf or ~/.bogofilter.cf?  When it's enabled, bogofilter will
take IP address 12.34.56.78 and create 4 tokens: "url:12.34.56.78",
"url:12.34.56", "url:12.34", and "url:12".  Having the 3 additional tokens
will help bogofilter classify the message.  If the message has a lot of
tokens, though, this will help only a little.
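
To make the token expansion concrete, here is a minimal Python sketch of the
idea.  It is only an illustration of the 4 tokens listed above, not
bogofilter's actual code; the function name subnet_tokens and its prefix
parameter are made up for this example.

```python
def subnet_tokens(ip, prefix="url:"):
    """Return the full address token plus one token per shorter dotted prefix.

    Hypothetical helper for illustration; it mirrors the example in the text
    and does not validate that `ip` is a well-formed address.
    """
    octets = ip.split(".")
    # Longest form first: 4 octets, then 3, then 2, then 1.
    return [prefix + ".".join(octets[:n]) for n in range(len(octets), 0, -1)]


for token in subnet_tokens("12.34.56.78"):
    print(token)
```

Each of these tokens gets looked up in the wordlists like any other token, so
an address seen in earlier spam adds up to 4 matching tokens instead of 1.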

Have you run bogofilter with "-vv" to generate a histogram, or with "-vvv"
to generate the detailed table of tokens with their counts and scores?
The information shown should help you understand what's going on.

Another idea (not necessarily a great one) would be to blacklist that IP 
address via procmail (or maildrop...).
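
As a rough sketch of what such a blacklist check could look like, the Python
below tests whether a message contains an http or https URL whose host is a
listed IP address.  Everything in it is an assumption for illustration: the
names BLOCKED_IPS and has_blocked_url are made up, 12.34.56.78 is just the
placeholder address used above, and how you would call it from procmail or
maildrop depends on your setup.

```python
import re

# Placeholder list; substitute the address the spam URLs actually point to.
BLOCKED_IPS = {"12.34.56.78"}

# An http(s) URL whose host is a dotted quad.  The negative lookahead rejects
# hosts that merely start with digits, such as 12.34.56.78.example.com.
URL_IP = re.compile(
    r"https?://(\d{1,3}(?:\.\d{1,3}){3})(?![\w-]|\.\w)",
    re.IGNORECASE,
)


def has_blocked_url(text, blocked=BLOCKED_IPS):
    """Return True if any URL in `text` points directly at a blocked IP."""
    return any(m.group(1) in blocked for m in URL_IP.finditer(text))


print(has_blocked_url("Click http://12.34.56.78/offer now"))
print(has_blocked_url("See http://www.example.com/ for details"))
```

This only catches the address when it appears literally in the URL; a
hostname that resolves to the same address would need a DNS lookup, which
this sketch does not attempt.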

It may be that the messages don't have enough in common with one another 
(or with other spam) for bogofilter to do a good job.  In that case, you 
may be out of luck.




