public inbox archive for pandoc-discuss@googlegroups.com
* Reference IDs in XML output
From: Albert Krewinkel @ 2021-03-12 20:22 UTC
  To: pandoc-discuss

I recently noticed a small problem: citation keys are used as part of
the id of the respective reference item. E.g., if the bibliography
contains `@misc{foo, ...}`, then the generated bibliography entry has
id="ref-foo". This can be a problem when generating XML output, as
citation keys may contain characters which are not allowed in XML names.
E.g., BibTeX allows slashes as part of the identifier, but those are
illegal in an `id` attribute, so the resulting XML documents are
invalid. As far as I can see, this affects JATS, TEI, HTML4, and EPUB2.
The HTML5 standard is less restrictive, so EPUB3 is unaffected.
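
To make the failure concrete, here is a rough Python sketch (not pandoc
code) that builds ids the way described above and tests them against a
deliberately conservative, ASCII-only approximation of an XML name. The
function names `reference_id` and `is_safe_xml_id` are made up for
illustration, and the real XML grammar also admits many non-ASCII
characters that this pattern rejects.

```python
import re

# Assumption: a conservative ASCII-only subset of what XML allows in a name.
# It must start with a letter or underscore; letters, digits, '.', '_' and
# '-' may follow. The full XML grammar permits more characters than this.
_ASCII_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def reference_id(citation_key: str) -> str:
    """Build the id as described in the message: "ref-" plus the key."""
    return "ref-" + citation_key


def is_safe_xml_id(identifier: str) -> bool:
    """Return True if the whole identifier matches the conservative pattern."""
    return _ASCII_NAME.fullmatch(identifier) is not None


if __name__ == "__main__":
    for key in ["foo", "doe/2020"]:
        ident = reference_id(key)
        print(ident, is_safe_xml_id(ident))
```

A key such as `doe/2020` is accepted by BibTeX, but the resulting
`ref-doe/2020` fails the check because of the slash.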

I'd like to fix the problem, but am not sure where and how.

- Where: in each affected writer, or in citeproc?
- How: by removing the offending characters, or by using a different
  scheme to generate reference identifiers? Numbering, hashing, …?
  Do we check for duplicates, or can we assume that identifiers with
  prefix "ref-" are reserved for pandoc?
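
For discussion, the "how" options could be sketched roughly as follows.
This is Python for brevity, not a proposal for the actual
implementation; all function names are hypothetical, and the set of
"safe" characters is an ASCII-only assumption rather than the full XML
name grammar.

```python
import hashlib
import re

# Assumption: anything outside this ASCII set counts as "offending".
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def id_by_stripping(key: str) -> str:
    """Option 1: remove the offending characters from the key."""
    return "ref-" + _UNSAFE.sub("", key)


def id_by_hashing(key: str) -> str:
    """Option 2: derive the id from a hash of the key (truncated SHA-1)."""
    return "ref-" + hashlib.sha1(key.encode("utf-8")).hexdigest()[:8]


def ids_by_numbering(keys: list[str]) -> dict[str, str]:
    """Option 3: number the references in order of appearance."""
    return {key: f"ref-{i}" for i, key in enumerate(keys, start=1)}


def make_unique(candidate: str, taken: set[str]) -> str:
    """Duplicate check: append -2, -3, ... until the id is unused."""
    unique = candidate
    n = 1
    while unique in taken:
        n += 1
        unique = f"{candidate}-{n}"
    taken.add(unique)
    return unique


if __name__ == "__main__":
    taken: set[str] = set()
    for key in ["a/b", "ab"]:
        # Both keys strip to "ref-ab", so the second one needs a suffix.
        print(key, make_unique(id_by_stripping(key), taken))
```

Stripping keeps ids readable but can map distinct keys to the same id
(`a/b` and `ab` above), which is why the duplicate question comes up.
Hashing and numbering avoid that at the cost of ids that no longer
resemble the citation key, and truncated hashes can still collide in
principle.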

The more I think about this, the more questions I have, and by now I'm
overthinking it. Any help bringing me back down to earth is appreciated.


--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124


