ntg-context - mailing list for ConTeXt users
* xml comments
From: Patrick Gundlach @ 2001-11-26 14:20 UTC


Hi,

xml comments have a bug (!?): spaces are critical at the 
beginning of a comment.

\starttext
\startbuffer
<!-- ignored comment -->
<one>one</one>
<!--not ignored -->
<two>two</two>
\stopbuffer
\processXMLbuffer
\stoptext

This should (if I understand the spec correctly) output only 
"one two", and not the second comment (<!--not ignored -->).

it seems that the missing space between the -- and the n is 
causing this effect. 

That's what the XML Recommendation says:
[15]   Comment   ::=   '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'

But probably Taco knows for sure if there is another spec which 
states otherwise ;-)
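
Read literally, production [15] places no requirement on whitespace after
'<!--'. The following is a minimal Python sketch of that production, not
anything taken from ConTeXt. It assumes that Char can be approximated as
"any character" (the real Char production excludes some control
characters), and the names COMMENT and comment_body are made up for
illustration.

```python
import re

# Production [15]: '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
# Assumption: Char is approximated as "any character".
COMMENT = re.compile(r"<!--((?:[^-]|-[^-])*)-->")

def comment_body(text):
    """Return the comment content if text is exactly one XML comment, else None."""
    m = COMMENT.fullmatch(text)
    return m.group(1) if m else None

for sample in ("<!-- ignored comment -->", "<!--not ignored -->", "<!-- a -- b -->"):
    print(repr(sample), "->", repr(comment_body(sample)))
```

Under that assumption both comments from the buffer above are accepted,
with or without the leading space, while a comment containing "--" in its
body is rejected.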

I hope you don't misunderstand me: with all my bug reports 
I don't mean to complain 
about ConTeXt, which is in my eyes a great piece of software. 

Best regards,

  Patrick Gundlach

- I TeX, therefore I am -



* Re: xml comments
From: Hans Hagen @ 2001-11-26 15:03 UTC
  Cc: ntg-context

At 03:20 PM 11/26/2001 +0100, Patrick Gundlach wrote:

>\starttext
>\startbuffer
><!-- ignored comment -->
><one>one</one>
><!--not ignored -->
><two>two</two>
>\stopbuffer
>\processXMLbuffer
>\stoptext
>
>this should (as far as I understand the spec right) only output
>one two and not the second comment (<!--not ignored -->)

what exactly do you mean by 'output'? Should I treat <!--crap the same as 
<!-- crap?

>it seems that the missing space between the -- and the n is
>causing this effect.
>
>Thats what the xml rec says:
>[15]   Comment   ::=   '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'

ah, i instantly get a headache when i see this line of the spec

>I hope you don't understand me wrong: with all my bug reports
>I don't want to complain
>about context, which is in my eyes a great piece of software.

a bug report <> a comment :-)

there should not be bugs; also i know rather well which pieces of context 
are coded ok (apart from bugs) and which parts are crap (which i have to 
work on), so keep on mailing

Hans
-------------------------------------------------------------------------
                                   Hans Hagen | PRAGMA ADE | pragma@wxs.nl
                       Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
  tel: +31 (0)38 477 53 69 | fax: +31 (0)38 477 53 74 | www.pragma-ade.com
-------------------------------------------------------------------------



* Re: xml comments
From: Hans Hagen @ 2001-11-26 15:07 UTC
  Cc: ntg-context

At 03:20 PM 11/26/2001 +0100, Patrick Gundlach wrote:
>Hi,
>
>xml comments have a bug (!?): spaces are critical at the
>beginning of a comment.

you may try (and play with):

\long\def\xparseXMLescape !#1#2#3{\parseXMLescape{#1#2}#3}

(don't define this under the \unprotect regime!)

here i assume that we can expect at least a couple of chars after the <! 
and the #3 nicely eats up the space

Hans



* Re: xml comments
From: Taco Hoekwater @ 2001-11-27  8:37 UTC
  Cc: gundlach, ntg-context

"Hans Hagen" <pragma@wxs.nl> wrote:
> At 03:20 PM 11/26/2001 +0100, Patrick Gundlach wrote:
> >Hi,
> >
> >xml comments have a bug (!?): spaces are critical at the
> >beginning of a comment.
> 
> you may try (and play with):
> 
> \long\def\xparseXMLescape !#1#2#3{\parseXMLescape{#1#2}#3}

Adding #3 is safe, sort of. <! is always followed by one of the language's
keywords, and <!-- is by far the shortest of those. Just watch out
that <!----> is a valid comment (although very unlikely to appear).

The string "<!-- crap -->" is actually a comment that starts with
a space. The comment itself in this case is: " crap ".

It is possible to take advantage of the fact that '<!--' is the only
keyword that starts with the same character twice, so the 'best' thing 
to do is something along these lines:

\long\def\xparseXMLescape!#1#2%
  {\if#1#2%
     \@EA\handleXMLcomment
    \else
     \@EA\doxparseXMLescape
    \fi#1#2}

\def\handleXMLcomment--#1-->{}

\def\doxparseXMLescape#1 {\parseXMLescape{#1}}

-- 
Regards,

Taco



* Re: xml comments
From: Hans Hagen @ 2001-11-27  9:43 UTC
  Cc: gundlach, ntg-context

At 09:37 AM 11/27/2001 +0100, Taco Hoekwater wrote:
>"Hans Hagen" <pragma@wxs.nl> wrote:
> > At 03:20 PM 11/26/2001 +0100, Patrick Gundlach wrote:
> > >Hi,
> > >
> > >xml comments have a bug (!?): spaces are critical at the
> > >beginning of a comment.
> >
> > you may try (and play with):
> >
> > \long\def\xparseXMLescape !#1#2#3{\parseXMLescape{#1#2}#3}
>
>Adding #3 is safe, sort of. <! is always followed by one of the language's
>keywords, and <!-- is by far the shortest of those. Just watch out
>that <!----> is a valid comment (although very unlikely to appear).
>
>The string "<!-- crap -->" is actually a comment that starts with
>a space. The comment itself in this case is: " crap ".
>
>It is possible to take advantage of the fact that '<!--' is the only
>keyword that starts with the same character twice, so the 'best' thing
>to do is something along these lines:
>
>\long\def\xparseXMLescape!#1#2%
>   {\if#1#2%
>      \@EA\handleXMLcomment
>     \else
>      \@EA\doxparseXMLescape
>     \fi#1#2}
>
>\def\handleXMLcomment--#1-->{}
>
>\def\doxparseXMLescape#1 {\parseXMLescape{#1}}

yesterday i found out that i needed both alternatives, so i came up with:

\long\def\xparseXMLescape !#1#2%
   {\if#1-%
      \if#2-%
        \expandafter\expandafter\expandafter\xxparseXMLescape
      \else
        \expandafter\expandafter\expandafter\xyparseXMLescape
      \fi
    \else
      \expandafter\xyparseXMLescape
    \fi#1#2}

\long\def\xxparseXMLescape --#1{\parseXMLescape{--}#1}
\long\def\xyparseXMLescape #1  {\parseXMLescape{#1}}

so far this one works ok.

indeed, if we assume that #1 and #2 are the same and both are -, we can have a 
faster alternative, but we may want to play safe. How does this keyword 
mechanism work? Are there only officially registered ones?

Hans



* Re: xml comments
From: Taco Hoekwater @ 2001-11-27 13:21 UTC
  Cc: gundlach, ntg-context

> indeed, if we assume that #1 and #2 are the same and both are -, we can have a 
> faster alternative, but we may want to play safe. How does this keyword 
> mechanism work? Are there only officially registered ones?

Here is the full list. It is case-sensitive, and not extensible in any way.

Within any XML instance document:

<![CDATA[
<!--
<!DOCTYPE

Only within the bracketed part of DOCTYPE or in an external DTD:

<!ELEMENT
<!ATTLIST
<!ENTITY
<!NOTATION

Only in DTDs:

<![INCLUDE
<![IGNORE
<![%....;

A space character is allowed after the [ in these three.

These last ones are used in building/parameterising the DTD, like this:

<!ENTITY % draft 'INCLUDE' >
<!ENTITY % final 'IGNORE' >
<![%draft;[
<!ELEMENT book (comments*, title, body, supplements?)>
]]>
<![%final;[
<!ELEMENT book (title, body, supplements?)>
]]>

Nothing else that starts with <! is allowed to appear anywhere, except
within a <![CDATA section (where it doesn't count as markup).
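
To make the list concrete, here is a small Python sketch (not ConTeXt
code) that classifies a '<!' construct by the keywords given above and
checks that '<!--' is the only one whose first two characters are equal.
The helper names classify and starts_doubled are hypothetical, and the
handling of "a space after the [" as optional whitespace is an
assumption.

```python
# Keywords that may follow '<!' (case-sensitive, closed list).
PLAIN = ("--", "DOCTYPE", "ELEMENT", "ATTLIST", "ENTITY", "NOTATION")

def classify(markup):
    """Name the '<!' construct that markup starts with, or None if it is not allowed."""
    if not markup.startswith("<!"):
        return None
    rest = markup[2:]
    if rest.startswith("[CDATA["):
        return "CDATA"
    if rest.startswith("["):
        # Assumption: optional whitespace may follow '[' in the three DTD-only forms.
        inner = rest[1:].lstrip()
        for name in ("INCLUDE", "IGNORE"):
            if inner.startswith(name):
                return name
        return "parameter entity" if inner.startswith("%") else None
    for name in PLAIN:
        if rest.startswith(name):
            return "comment" if name == "--" else name
    return None

def starts_doubled(markup):
    """True if the two characters right after '<!' are identical."""
    return len(markup) >= 4 and markup[2] == markup[3]

samples = ["<![CDATA[", "<!--", "<!DOCTYPE", "<!ELEMENT", "<!ATTLIST",
           "<!ENTITY", "<!NOTATION", "<![INCLUDE", "<![IGNORE", "<![%draft;"]
for s in samples:
    print(s, "->", classify(s), "| doubled:", starts_doubled(s))
```

Because the list is closed, testing the two characters after '<!' for
equality is enough to pick out comments, which is the property the
\if#1#2 test earlier in the thread relies on.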

-- 
Regards,

Taco

