cannot filter virus letters

Dmitry vdb at mail.ru
Tue Feb 10 11:27:47 CET 2009


On Thursday 29 January 2009, David Relson wrote:
> On Thu, 29 Jan 2009 15:03:37 +0300
>
> Dmitry wrote:
> > Sorry, exhaustive training doesn't change anything in my case.
> > Spamicity value is still less than 0.52. Tuning robx/robs gives me
> > strange results. Some good letters become spammy after that. I think
> > the algorithm has to be changed somehow for small letters with a few
> > words in the message body. Otherwise, hammy headers always get a
> > greater value and never let the spamicity score get high enough.
>
> Hello Dmitry,
>
> In its earliest days, bogofilter scored a message based on the 15 tokens
> with the most extreme scores, i.e. scores furthest from 0.5.  Years ago
> bogofilter's scoring algorithm was changed and it now uses tokens
> whose scores lie further than 0.375 from 0.5 (0.375 being the default
> value of the 'min_dev' parameter).
>
> I've known for quite a while that using min_dev can cause the spamicity
> score to be computed based on very few tokens.  It seems that your
> message is being scored based on a single token.
>
> I see two things that can be done.
>
> First, you can change the value of min_dev in your bogofilter
> configuration file.
>

Hello David,

I get the best results with these values:

min_dev=0.1
robx=0.8
robs=1.0
ham_cutoff=0.45
spam_cutoff=0.82
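
To make the effect of these settings concrete, here is a minimal Python sketch; it is not bogofilter's actual code. It assumes Robinson's smoothing formula f(w) = (robs*robx + n*p) / (robs + n) for a token seen n times with raw spam probability p, and it assumes a token takes part in scoring only when its score lies further than min_dev from 0.5. The token scores and the helper names `robinson_f` and `tokens_used` are hypothetical.

```python
def robinson_f(p, n, robx, robs):
    """Smoothed token score: blends the raw estimate p with the prior robx.

    Assumption: Robinson's formula f(w) = (robs*robx + n*p) / (robs + n),
    where n is how many times the token has been seen in training.
    """
    return (robs * robx + n * p) / (robs + n)


def tokens_used(scores, min_dev):
    """Keep only scores lying further than min_dev from the neutral 0.5.

    Assumption: a strict comparison; bogofilter's boundary handling may differ.
    """
    return [s for s in scores if abs(s - 0.5) > min_dev]


# Hypothetical token scores for a short message (made up for illustration).
example_scores = [0.95, 0.62, 0.41, 0.55, 0.30, 0.48]

for min_dev in (0.375, 0.1):
    used = tokens_used(example_scores, min_dev)
    print(f"min_dev={min_dev}: {len(used)} of {len(example_scores)} tokens used: {used}")

# An unseen token (n = 0) scores exactly robx, whatever p is.
print(robinson_f(p=0.5, n=0, robx=0.8, robs=1.0))
# A token seen once, only in spam (p = 1.0), is pulled halfway toward robx.
print(robinson_f(p=1.0, n=1, robx=0.8, robs=1.0))
```

In this sketch only one of the six made-up scores qualifies with min_dev=0.375, while min_dev=0.1 lets three through. Also, with robx=0.8 a never-seen token gets the score 0.8, which is 0.3 away from 0.5 and therefore passes a min_dev of 0.1 under the assumptions above.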

Now everything seems to be OK. No more virus letters. But there is another
problem letter sent to my mailbox that makes me unhappy. When I pass it
to `bogofilter -s`, it becomes more hammy!

This is the output of a series of "bogofilter -s ; bogofilter -t" commands:

U 0.517247
U 0.513562
U 0.510004
U 0.507321

What a strange result! It is the opposite of what I expect. The content of
this letter is commercial spam with all the words concatenated without
spaces. Unfortunately, I can't quote the letter here because of its
non-Latin charset. When I switch back to the default bogofilter.cf with
default values, the spamicity of this letter always stays at "U 0.500000".
Exhaustive training does not change anything. What can be done in such a
situation?

Another small problem -- I get errors when I define custom wordlists like
this:
wordlist i,ignore,~/.bogofilter/ignorelist.db,1
wordlist r,wordlist,~/.bogofilter/wordlist.db,2

The error message:
bogofilter[16452]: Can't open database ~/.bogofilter/ignorelist.db: unable to open database file_
bogofilter[16452]: Error on database ~/.bogofilter/ignorelist.db: unable to open database file_
Can't open file 'ignorelist.db' in directory '~/.bogofilter'.
error #2 - No such file or directory.

It works if I give the full path to the wordlists:
wordlist i,ignore,/home/user/.bogofilter/ignorelist.db,1
wordlist r,ignore,/home/user/.bogofilter/wordlist.db,2

Is this expected behaviour? Am I missing something? Should I define the
$HOME variable before running the bogofilter command from the console?
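
For context on the `~` question: tilde expansion is normally done by the shell on command-line words, so a path read from a configuration file reaches the program with the `~` still in it unless the program expands it itself. The error above shows the literal `~` in the path being opened, which is consistent with that. The following Python sketch only illustrates the general mechanism on a POSIX system; `resolve_wordlist_path` is a hypothetical helper, not part of bogofilter.

```python
import os


def resolve_wordlist_path(path):
    """Expand a leading '~' and any $VARS, then return an absolute path.

    This mimics what a shell does before a program ever sees the argument;
    a program reading the raw string from a config file gets no such help.
    """
    expanded = os.path.expandvars(os.path.expanduser(path))
    return os.path.abspath(expanded)


raw = "~/.bogofilter/ignorelist.db"

# The raw string is treated as a relative path starting with a directory
# literally named '~', which normally does not exist.
print("raw path exists:     ", os.path.exists(raw))

# After explicit expansion the path points into the user's home directory.
print("expanded path:       ", resolve_wordlist_path(raw))
```

Until the behaviour is confirmed, writing the full path in the wordlist lines, as shown above, avoids relying on any expansion.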

-- 
Dmitry


