Some questions about bogofilter 0.13.6 & 0.15.7
Mike Lykov
combr at vesna.ru
Tue Nov 4 12:24:10 CET 2003
On Tuesday 04 November 2003 14:19, Boris 'pi' Piwinger wrote:
> > How i can re-create wordlists for using it ?
> Build them from scratch with your mail collection.
Hmm. I think the best choice is to use the spam I have already caught (I put
it in a special mailbox on the server).
But rebuilding the database would throw away all the learning done so far
(by hand or automatic) %((
> > Or i can use -H option - it will be same as above ?
> That won't be enough I believe.
With a new (re-created) database I will see the same errors that I can fix in
the current one: with attachments, for example.
> > PARSING OPTIONS
> > Where i must use it?
> Either on the command line or in the config file.
On the command line together with the "classification" or the "registration"
options? When updating the wordlists or when classifying a message?
> It is
> strongly recommended that you leave them alone, they might
> go soon and the defaults will most likely work best.
When I run 'bogofilter -Q' I do not see the defaults for these "parsing
options" %(
They are also missing from the default config.
Do you suggest not using them?
I wanted to run bogofilter -Ph -Pt ...
> > --cqxox3fnlpmgjstp-- 1 20030526
> > --TB36FDmn/VVEgNH/ 2 20030708
> > --TB36FDmn/VVEgNH/-- 2 20030708
> I guess the rationale is that those can be specific to spam
> software.
I think that base64 sequences of letters are mostly random, aren't they?
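
To illustrate what I mean about boundary lines, here is a minimal sketch in
Python. It is not bogofilter code and says nothing about how bogofilter's
lexer works; the sample message is made up (it only reuses a boundary string
from the wordlist lines quoted above). It reads the boundary from the
Content-Type header and drops the delimiter lines before anything would be
tokenized:

```python
from email import message_from_string

# Made-up multipart message using one of the boundary strings quoted above.
RAW = "\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="TB36FDmn/VVEgNH/"',
    "",
    "--TB36FDmn/VVEgNH/",
    "Content-Type: text/plain",
    "",
    "hello world",
    "--TB36FDmn/VVEgNH/--",
    "",
])

msg = message_from_string(RAW)
boundary = msg.get_boundary()  # boundary parameter without the quotes

# Keep every line except the MIME delimiter lines ("--" + boundary ...).
lines_without_delimiters = [
    line for line in RAW.splitlines()
    if not line.startswith("--" + boundary)
]
```

With the delimiter lines removed, strings such as --TB36FDmn/VVEgNH/ would
never reach the wordlist in the first place.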
What about ver 0.15.8, where
" * Modified handling of mime attachments to decode rfc822 and
to discard applications and images." ?
What does that mean? Why discard images?
For the kind of email I get, it would be better to discard documents such as
Microsoft Office files rather than images: some spam messages contain an image
in which the spammer writes his information (to avoid content filtering).
> > Often i see that the letter with attached file and a little piece if text
> > (two-three words) classified as spam, but attach can't be spam!
> Why not? I often get spam with attachments.
See above. I get real spam with small images, but I have never seen spam
with big office documents ;)
But I often see false positives on such office documents!
> > I think bogofilter must not classify attaches at all! (same about headers
> > .. %)
> This has been discussed. It shows that it is useful. Like
> getting all those viruses, images etc.
Right, I subscribed to this mailing list to discuss it ;)
Who needs to tokenize headers? In spam, Received:, Subject:, From: and the
others are mostly random - spammers try to evade the filters in the MTA, but
the MTA is the best place for filtering on headers! ;)
If bogofilter is a content filter, it must rely on the content of messages,
not on random information like headers ;)
--
Mike
registered linux user #315334
jabber id: combr at jabber.ru