ntg-context - mailing list for ConTeXt users
From: Kate F <kate@elide.org>
To: mailing list for ConTeXt users <ntg-context@ntg.nl>
Subject: Re: TeX in \xmlsetentity and DTDs in DOCTYPEs
Date: Tue, 19 Jan 2016 02:39:50 +0000
Message-ID: <CAA36g0UZ=GT166hXvhGOmoQRJU4KWP8fE4cCcd99HD=8YLo8ww@mail.gmail.com>
In-Reply-To: <CAA36g0UQoErAawNdSd3i+UhDnb4cRdq-aWaNbMu0WJv2jioUMg@mail.gmail.com>

On 19 January 2016 at 02:16, Kate F <kate@elide.org> wrote:
> On 18 January 2016 at 21:16, Hans Hagen <pragma@wxs.nl> wrote:
>> On 1/18/2016 9:49 PM, Kate F wrote:
>>>
>>> On 18 January 2016 at 19:13, Hans Hagen <pragma@wxs.nl> wrote:
>>>>
>>>> On 1/18/2016 5:22 PM, Kate F wrote:
>>>>>
>>>>>
>>>>> On 18 January 2016 at 13:30, Thomas A. Schmitz
>>>>> <thomas.schmitz@uni-bonn.de> wrote:
>>>>>>
>>>>>>
>>>>>> On 01/17/2016 07:24 PM, Hans Hagen wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> it should work in the beta now
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi Hans,
>>>>>>
>>>>>> now I have a problem :-) What should take precedence if an entity is
>>>>>> both defined in the DTD and as a \xmltexentity? The way I see it, the
>>>>>> latter: e.g., in the DTD, I might declare something for use in a browser
>>>>>> but require a different solution when typesetting with ConTeXt. The
>>>>>> latest and greatest now takes my DTD definitions instead of the
>>>>>> \xmltexentities, which did not happen before. Is that an unwanted side
>>>>>> effect or the new default?
>>>>>>
>>>>>
>>>>> Ah, there's a bug:
>>>>>
>>>>>       <!ENTITY i.opt "<option>-i</option>">
>>>>>
>>>>> This should produce an <option> node in the DOM tree, just as if you'd
>>>>> typed that out where the entity is used. Currently ConTeXt takes that
>>>>> as literal text, as if you'd typed "&lt;option&gt;-i&lt;/option&gt;"
>>>>>
>>>>> Often I wish XML weren't so complex...
>>>>
>>>>
>>>>
>>>> are you sure? i've never seen that
>>>>
>>>> Hans
>>>
>>>
>>> Yep!
>>>
>>> These are called "internal parsed entities". "Parsed" requires that
>>> any tags *inside* the entity must be balanced, unlike in SGML
>>> entities.
>>>
>>> Sorry I can't find a clear explanation in the XML spec; it's a pretty
>>> confusing document.
>>> But here's some random person's slide illustrating an example:
>>> http://images.slideplayer.com/23/6622270/slides/slide_47.jpg
>>>
>>> libxml2 deals with these correctly, which is what I've been using
>>> (xsltproc and friends) for my documents which use them. I generally
>>> trust libxml2 to get things right.
>>>
>>> I use these entities to centralise often-repeated fragments between
>>> documents, kind of like how you might use a primitive macro in TeX.
>>>
>>> So for example in one external DTD I have some general things:
>>>
>>>      <!ENTITY macro.arg  "<replaceable>macro</replaceable>">
>>>      <!ENTITY equal.lit      "<literal>=</literal>">
>>>
>>> And then in one specific document's internal entities, something which
>>> uses them:
>>>
>>>          <!ENTITY D2.opt
>>> "<option>-D</option>&macro.arg;&equal.lit;<replaceable>defn</replaceable>">
>>>
>>> Then if I change my mind about how I want to mark up "=", for example,
>>> I only have one place to change it. This makes life with XML a little
>>> bit less painful.
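
To make the expected behaviour concrete, here is a minimal sketch using
Python's xml.etree.ElementTree (which sits on top of expat), not ConTeXt
or libxml2. As an assumption for the sake of a self-contained example, all
three entities are declared in the internal subset rather than split
between an external DTD and the document; the <para> wrapper and the
child_summary helper are hypothetical names.

```python
import xml.etree.ElementTree as ET

# All entities live in the internal subset here (an assumption; in the
# thread, macro.arg and equal.lit come from an external DTD).
DOC = """<!DOCTYPE para [
  <!ENTITY macro.arg "<replaceable>macro</replaceable>">
  <!ENTITY equal.lit "<literal>=</literal>">
  <!ENTITY D2.opt
    "<option>-D</option>&macro.arg;&equal.lit;<replaceable>defn</replaceable>">
]>
<para>Use &D2.opt; here.</para>"""


def child_summary(xml_text):
    """Return (tag, text) for each element child of the document root."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]


if __name__ == "__main__":
    # Each tag inside the entity should show up as a real element node,
    # not as literal "<option>..." text.
    for tag, text in child_summary(DOC):
        print(tag, text)
```

With a parser that treats internal parsed entities as markup, the single
&D2.opt; reference should yield four element children under <para>,
including the ones pulled in through the nested &macro.arg; and
&equal.lit; references.
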
>>
>>
>> well, i've learned not to trust all these docs on the web too much and
>> applications can do what they want (and thereby even influence standards)
>>
>> xml pocket reference:
>>
>> - parsed entity: replacement text that can be referenced
>>
>>   internal: literal string to be injected (then the example
>>             shows only text and entities)
>>
>> in your example you use a (decimal) character entity ... the link you give
>> says that you cannot use & and % as part of the entity's value so that would
>> mean your example is wrong
>>
>
> Indeed I did not mean that somebody's presentation slides are normative.
>
> This is normative for XML 1.0:
> https://www.w3.org/TR/1998/REC-xml-19980210#wf-entities
>
> Which has the following productions:
>
>     extParsedEnt ::= TextDecl? content
>     content ::= (element | CharData | Reference | CDSect | PI | Comment)*
>
> So you can see that both elements (by "element") and entity references
> (by "Reference") are permitted in the grammar. The latter includes:
>
>     Reference ::= EntityRef | CharRef
>     EntityRef ::= '&' Name ';'

Sorry, I pointed to an outdated version of the XML 1.0 spec there.

The current version (fifth revision) has the grammar written slightly
differently, but still permits both well-formed <xyz>...</xyz> tags
and &xyz; entities inside entity declarations:

https://www.w3.org/TR/xml/#sec-entity-decl

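    [70]    EntityDecl  ::= GEDecl | PEDecl
    [71]    GEDecl      ::= '<!ENTITY' S Name S EntityDef S? '>'
    [72]    PEDecl      ::= '<!ENTITY' S '%' S Name S PEDef S? '>'
    [73]    EntityDef   ::= EntityValue | (ExternalID NDataDecl?)
    [74]    PEDef       ::= EntityValue | ExternalID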
    [9]     EntityValue ::= '"' ([^%&"] | PEReference | Reference)* '"'
                         |  "'" ([^%&'] | PEReference | Reference)* "'"
    [67]    Reference   ::= EntityRef | CharRef
    [68]    EntityRef   ::= '&' Name ';'

Thus &xyz; is permitted by productions applying ultimately to
EntityRef, and <xyz>...</xyz> is permitted by productions applying
through the * closure of EntityValue's [^%&"] text (where [^...] means
"not").
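
As a rough check of that reading, production [9] can be transcribed into
a regular expression. This is only a sketch: Name is simplified to an
ASCII subset (the real production allows many more characters), and
is_entity_value is a hypothetical helper name. It tests the quoted
literal only, not whether the tags inside it are balanced.

```python
import re

# Simplified ASCII-only stand-in for the spec's Name production.
NAME = r"[A-Za-z_:][A-Za-z0-9._:-]*"
# Reference ::= EntityRef | CharRef
REFERENCE = rf"&(?:{NAME}|#[0-9]+|#x[0-9a-fA-F]+);"
# PEReference ::= '%' Name ';'
PE_REFERENCE = rf"%{NAME};"

# EntityValue, production [9]: either quote style, any character except
# '%', '&' and the closing quote, or a complete reference.
ENTITY_VALUE = re.compile(
    rf'"(?:[^%&"]|{PE_REFERENCE}|{REFERENCE})*"'
    rf"|'(?:[^%&']|{PE_REFERENCE}|{REFERENCE})*'"
)


def is_entity_value(literal):
    """True if the quoted literal matches the simplified EntityValue."""
    return ENTITY_VALUE.fullmatch(literal) is not None


if __name__ == "__main__":
    samples = [
        '"<option>-i</option>"',
        '"<option>-D</option>&macro.arg;&equal.lit;"',
        '"fish & chips"',
        '"100%"',
    ]
    for sample in samples:
        print(sample, is_entity_value(sample))
```

The '<' and '>' characters fall under [^%&"], so markup is accepted; a
bare '&' or '%' that does not start a complete reference is not.
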


> Per this, all my examples are correct.
>
> O'Reilly XML in a Nutshell has an example:
> http://docstore.mik.ua/orelly/xml/xmlnut/ch03_04.htm
>
>
>> of course we can consider an option to parse the entity as xml
>>
>> (we can consider < as a trigger for parsing thereby kind of automatically
>> adapting)
>>
>
> I do not think this is a good idea - per the XML spec, these entities
> should always be taken as well-formed fragments of XML. So treating
> them otherwise would be incorrect.
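
For illustration, a small sketch of how one existing parser enforces
this, again assuming Python's xml.etree.ElementTree (expat) rather than
ConTeXt. The i.open entity and the parses helper are hypothetical. The
second document opens <option> inside the entity and closes it outside,
which SGML would tolerate but XML does not.

```python
import xml.etree.ElementTree as ET

# Entity whose replacement text is a balanced fragment.
BALANCED = """<!DOCTYPE cmd [
  <!ENTITY i.opt "<option>-i</option>">
]>
<cmd>&i.opt;</cmd>"""

# Entity that opens a tag which is only closed outside the entity.
UNBALANCED = """<!DOCTYPE cmd [
  <!ENTITY i.open "<option>-i">
]>
<cmd>&i.open;</option></cmd>"""


def parses(xml_text):
    """Return True if the document parses, False on a well-formedness error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print("balanced:", parses(BALANCED))
    print("unbalanced:", parses(UNBALANCED))
```
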
>
> Thanks,
>
> --
> Kate
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________
