BASE64 [was: various]
David Relson
relson at osagesoftware.com
Wed Oct 23 01:00:48 CEST 2002
At 06:56 PM 10/22/02, Matthias Andree wrote:
> > Base64 parsing is a problem. Not only will 'solitary' on a
> > line be ignored, but a legitimate b64 line, such as
> > 'c29saXRhcnkgd29yZAo=', will be tokenized as 'c29saxrhcnkgd29yzao'.
> > Unfortunately, 'xrhcnk' on a line by itself is also legal base64.
> > While this is permissible, I don't think it's common; I think we're
> > more likely to find base64 lines to be longer than the maximum token
> > length, except when they end in '='. I haven't thought this through,
> > but I think we may get better results if the base64 regex were modified
> > to catch only '='-terminated strings.
>
>That won't work: the '=' termination only appears when "padding" is
>necessary, i.e. when the length of the data being encoded is not
>divisible by 3 but leaves a remainder of 1 or 2.
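The padding rule can be checked with a short sketch. This is only an illustration in Python using the standard base64 module; the helper name padding_count is hypothetical and is not part of bogofilter.

```python
import base64


def padding_count(data: bytes) -> int:
    """Return how many '=' characters end the base64 encoding of data."""
    encoded = base64.b64encode(data).decode("ascii")
    return len(encoded) - len(encoded.rstrip("="))


if __name__ == "__main__":
    # Print input length, length mod 3, the encoding, and the '=' count.
    for n in range(1, 7):
        data = b"x" * n
        encoded = base64.b64encode(data).decode("ascii")
        print(n, n % 3, encoded, padding_count(data))
```

Inputs whose length is a multiple of 3 encode with no trailing '=', so a pattern that only matches '='-terminated lines would miss them.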
I've got some code that does the 4x6-bit to 3x8-bit conversion for
base64. However, the Content-xxx headers and related messages aren't
handled yet, and I'm wondering whether I should tackle that task.
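For readers unfamiliar with the 4x6-bit to 3x8-bit conversion, here is a minimal sketch of the idea. It is written in Python for illustration and is not the code referred to in this message; the names B64_VALUE, decode_quad and decode_line are hypothetical, and it assumes the standard base64 alphabet with '=' padding.

```python
import string

# Standard base64 alphabet: each character stands for a 6-bit value 0..63.
B64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)
B64_VALUE = {ch: i for i, ch in enumerate(B64_ALPHABET)}


def decode_quad(quad: str) -> bytes:
    """Decode one 4-character group (4 x 6 bits) into up to 3 bytes (3 x 8 bits)."""
    if len(quad) != 4:
        raise ValueError("a base64 group must be exactly 4 characters")
    pad = len(quad) - len(quad.rstrip("="))
    if pad > 2:
        raise ValueError("at most two '=' padding characters are allowed")
    bits = 0
    for ch in quad[: 4 - pad]:
        # A character outside the alphabet raises KeyError here.
        bits = (bits << 6) | B64_VALUE[ch]
    # Padding positions contribute zero bits.
    bits <<= 6 * pad
    raw = bytes([(bits >> 16) & 0xFF, (bits >> 8) & 0xFF, bits & 0xFF])
    # Each '=' means one fewer byte of real data in this group.
    return raw[: 3 - pad]


def decode_line(line: str) -> bytes:
    """Decode a whole base64 line by splitting it into 4-character groups.

    Simplification: padding is not checked to be in the last group only.
    """
    line = line.strip()
    if len(line) % 4 != 0:
        raise ValueError("base64 line length must be a multiple of 4")
    return b"".join(decode_quad(line[i : i + 4]) for i in range(0, len(line), 4))
```

With this sketch, decode_line("c29saXRhcnkgd29yZAo=") gives the bytes of "solitary word" followed by a newline, which is the example line quoted above.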