Anyone having problems training bogofilter with this stuff?

Nigel Henry cave.dnb at tiscali.fr
Sat Dec 23 17:04:57 CET 2006


I'm receiving some stuff with what looks like Chinese or Japanese 
characters, interspersed with lower- and upper-case Latin characters. I've 
run a few training sessions with it, but bogofilter is struggling.

I've pasted a sample below, as trying to save it as a .txt file in a text 
editor results in some very weird characters.

[Pasted spam sample: seven lines of mis-encoded East Asian text mixed with 
Latin letters; the original bytes are not recoverable from this archive.]
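One way to see what such a message really contains, rather than pasting it through a text editor, is to decode the raw body bytes explicitly. The following is only a minimal sketch using Python's standard email module: the function name decode_body and the list of candidate codecs are assumptions made for illustration, and the guess can be wrong when a message declares no charset, because some byte sequences are valid in more than one encoding.

```python
import email
from email import policy

# Assumption: a few common Japanese and Chinese encodings to try when the
# message does not declare a usable charset. The order is a guess.
CANDIDATES = ["iso2022_jp", "shift_jis", "euc_jp", "gb2312", "big5"]


def decode_body(raw, candidates=CANDIDATES):
    """Return (codec, text) for the first text part of a raw message."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    # Prefer a text/plain part, then text/html; fall back to the message.
    part = msg.get_body(preferencelist=("plain", "html")) or msg
    # decode=True undoes base64 / quoted-printable and returns bytes.
    payload = part.get_payload(decode=True) or b""
    declared = part.get_content_charset()
    to_try = ([declared] if declared else []) + list(candidates)
    for codec in to_try:
        try:
            return codec, payload.decode(codec)
        except (LookupError, UnicodeDecodeError):
            continue  # unknown codec name or invalid bytes: try the next
    # latin-1 maps every byte to a character, so this never fails.
    return "latin-1", payload.decode("latin-1")


if __name__ == "__main__":
    sample = (b"Content-Type: text/plain; charset=Shift_JIS\n\n"
              + "こんにちは".encode("shift_jis") + b"\n")
    codec, text = decode_body(sample)
    print(codec, text.strip())
```

Writing the returned text out as UTF-8 should give a file that a text editor can open without garbling, provided the codec guess was right.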

I'm still using bogofilter-1.0.2 at the moment, which has been working very 
well up to now. It's processing mail downloaded directly to KMail.

I haven't upgraded bogofilter because I was concerned about how it might 
affect the wordlist.db. I do have 2 maildir mailboxes in KMail, though, where 
I save some of the ham and all of the spam (apart from the spam that is 
already being correctly identified), so it wouldn't be a big problem to 
recreate the wordlist.db.
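For reference, recreating the wordlist from two maildirs could be scripted roughly as below. This is a sketch under assumptions, not a tested recipe: the directory paths are hypothetical, the helper names training_commands and retrain are made up for illustration, and it assumes the installed bogofilter supports the -B (bulk) option with a maildir argument alongside -n (register as ham), -s (register as spam), and -d (database directory), as described in the bogofilter man page.

```python
import subprocess


def training_commands(ham_dir, spam_dir, db_dir):
    """Build the two bogofilter invocations; nothing is executed here."""
    return [
        # -n registers the messages as non-spam (ham).
        ["bogofilter", "-d", db_dir, "-n", "-B", ham_dir],
        # -s registers the messages as spam.
        ["bogofilter", "-d", db_dir, "-s", "-B", spam_dir],
    ]


def retrain(ham_dir, spam_dir, db_dir, run=subprocess.run):
    """Run both training commands, stopping on the first failure."""
    for cmd in training_commands(ham_dir, spam_dir, db_dir):
        run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths: adjust to the real maildirs and a fresh,
    # empty database directory so the old wordlist.db is left untouched.
    for cmd in training_commands("/home/user/Mail/ham",
                                 "/home/user/Mail/spam",
                                 "/home/user/.bogofilter-new"):
        print(" ".join(cmd))
```

Building the database in a separate directory first makes it possible to compare results before replacing the existing wordlist.db.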

Any suggestions on how to deal with this spam that bogofilter is having 
problems with?

BTW, it is ending up in the unsure mailbox, so bogofilter obviously thinks 
there is something dodgy about it.

Nigel.


