html comment processing

Emmanuel Seyman seyman at acticiel.com
Mon Mar 31 23:56:50 CEST 2003


On Mon, Mar 31, 2003 at 01:06:53PM -0500, David Relson wrote:
>
> I _like_ your interpretation!  It fits well with what we actually
> see.  However, I don't think the purists would agree with you.

Then the purists are wrong, I'm afraid.
I find the specs clear enough to understand, and
I think Hermann is right.

> Personally, I find the wording to be odd.  It's hard to understand.  Having
> the "comment declaration" separate from the comment allows "<!>" to be used
> as empty comment - but don't ask me how that's useful.  Having "--"

You can use it as a placeholder for a parser to find. My school had a CGI
script that added job offers to a web page. The script basically replaced this:

<td>job offer description 1</td>
<td>job offer description 2</td>
<td>job offer description 3</td>
<!>

with this:

<td>job offer description 1</td>
<td>job offer description 2</td>
<td>job offer description 3</td>
<td>job offer description 4</td>
<!>
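
A minimal sketch of that kind of marker-based insertion, in Python rather
than whatever the original CGI script was written in. The names MARKER and
add_offer are hypothetical, and the escaping step is an assumption of mine,
not something the original script is known to have done:

```python
import html

# The empty comment declaration used as an insertion point.
MARKER = "<!>"


def add_offer(page: str, description: str) -> str:
    """Insert a new <td> line immediately before the first marker."""
    head, sep, tail = page.partition(MARKER)
    if not sep:
        # No marker in the page: refuse rather than guess where to insert.
        raise ValueError("marker not found in page")
    # Escape the description so it cannot inject markup (assumed behaviour).
    new_row = f"<td>{html.escape(description)}</td>\n"
    return head + new_row + sep + tail


if __name__ == "__main__":
    page = (
        "<td>job offer description 1</td>\n"
        "<td>job offer description 2</td>\n"
        "<td>job offer description 3</td>\n"
        "<!>\n"
    )
    print(add_offer(page, "job offer description 4"), end="")
```

Because the marker is left in place after the new line, the same call can be
repeated for each new offer.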


> <br>one tw<!--this is a comment-->o three

The comment is "--this is a comment--".

> <br>single dou<!--this is a comment-->ble triple

Same here.

> <br><!first> <!--second--> <!-->third<-->

"<!first>" is a comment declaration with data characters inside
but no comment.

The second comment declaration contains the comment "--second--".

> <!-->third<-->

Again "<!-->" is a comment declaration with data characters inside.
"third" is part of the text. It needs to be counted.

"<-->" is an illegal tag and should be ignored.

> <br>Please vis<! FF3FFi?FS$s0,sz>it our web<! FF3FFi?FS$s0,sz>si<!
> FF3FFi?FS$s0,sz>te
> <br>Please vis<!-- FF3FFi?FS$s0,sz>it our web<! FF3FFi?FS$s0,sz>si<!
> FF3FFi?FS$s0,sz>te

All six comment declarations contain data characters but no comments.
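
To make this reading concrete, here is a rough sketch in Python. It is not
bogofilter's code and not a conforming SGML parser. It simply assumes, as
above, that a comment declaration runs from "<!" to the next ">" and that
any other "<...>" run is a tag to be dropped. The names COMMENT_DECL, TAG
and visible_text are hypothetical:

```python
import re

# A comment declaration under this reading: "<!" up to the next ">".
# The negated class also matches newlines, so a declaration may wrap lines.
COMMENT_DECL = re.compile(r"<![^>]*>")

# Any remaining "<...>" run, legal or not (this covers "<br>" and "<-->").
TAG = re.compile(r"<[^>]*>")


def visible_text(markup: str) -> str:
    """Return the text left after dropping comment declarations and tags."""
    without_decls = COMMENT_DECL.sub("", markup)
    return TAG.sub("", without_decls)


if __name__ == "__main__":
    samples = [
        "<br>one tw<!--this is a comment-->o three",
        "<br><!first> <!--second--> <!-->third<-->",
        "<br>Please vis<! FF3FFi?FS$s0,sz>it our web<! FF3FFi?FS$s0,sz>si<!\n"
        "FF3FFi?FS$s0,sz>te",
    ]
    for sample in samples:
        print(repr(visible_text(sample)))
```

Under these assumptions the split words are rejoined ("two", "website") and
"third" survives as text, which matches the analysis above. A comment that
itself contains ">" would end early under this reading.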

Emmanuel



