Just following up on my own question in case anybody else is interested. The logic of my preprocessor code wasn't the problem; it was my ignorance of Lua and character escaping. Both ? and * are magic characters in Lua patterns, which is what string.gsub uses, so to match them literally they need to be escaped with the Lua escape character, which is % (which sends my editor haywire because it thinks I am putting in a TeX comment, but never mind!).

So my now working code to translate historic processing instructions into directives without a separate pass is:

\startluacode
function lxml.preprocessor(data)
 data = string.gsub(data, '<%?capdm %*force_line_stop%?>', '<?context-directive injector newline ?>')
 return data
end
\stopluacode

\startsetups xml:directive:injector:newline
 \crlf
\stopsetups
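
For anyone who wants to check the escaping outside ConTeXt, here is a minimal sketch in plain Lua. The helper name to_directive and the two sample strings are made up for illustration; they are not part of ConTeXt or of my real data. One thing the sketch brings out: with %* the pattern matches a literal asterisk, so it assumes the instruction in the data really is written as <?capdm *force_line_stop?>. For the form without the asterisk (as in the MWE quoted below), leaving the * unescaped after the space, so that it means "zero or more spaces", should be what is wanted.

```lua
-- Plain Lua sketch; to_directive is a hypothetical helper, not a ConTeXt function.
local directive = '<?context-directive injector newline ?>'

local function to_directive(data, pattern)
  -- string.gsub returns the new string and the number of substitutions made
  return string.gsub(data, pattern, directive)
end

-- Made-up sample data: one instruction with a literal asterisk, one without
local with_star    = 'A paragraph with<?capdm *force_line_stop?>a processing instruction.'
local without_star = 'A paragraph with<?capdm force_line_stop?>a processing instruction.'

-- Unescaped: ? and * act as quantifiers, so the literal instruction is not matched
local _, n1 = to_directive(without_star, '<?capdm *force_line_stop?>')

-- Escaped: %? and %* are literal, so only the form with an asterisk matches
local _, n2 = to_directive(with_star, '<%?capdm %*force_line_stop%?>')
local _, n3 = to_directive(without_star, '<%?capdm %*force_line_stop%?>')

-- Escaped ? but unescaped " *" (zero or more spaces): matches the form without an asterisk
local out, n4 = to_directive(without_star, '<%?capdm *force_line_stop%?>')

print(n1, n2, n3, n4)
print(out)
```

The counts show which patterns actually fire, which is a quicker check than running a whole document through ConTeXt.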

Duncan

On Mon, 2 May 2022 at 09:24, Duncan Hothersall <dh@capdm.com> wrote:
Many thanks Denis. Very useful tip on the injectors and generalised command usage. That will definitely come in useful.

My problem is that I already have a lot of XML data with the existing processing instructions in it. I know I could use an external preprocessing step in something like Python to change them into injectors before they are fed into ConTeXt, but I was hoping there was a way to handle them directly as part of the ConTeXt / Lua process.

Thanks again though.

Duncan




On Mon, 2 May 2022, 08:48, <denis.maier@unibe.ch> wrote:

That was too quick, sorry.

 

Hi Duncan,

 

I have used ConTeXt's own injectors for this:

 

<?context-directive injector addlinetopage ?>

 

\startsetups xml:directive:injector:addlinetopage
  \adaptlayout[lines=+1]
\stopsetups

 

Or, for your line break example:

 

<?context-directive injector newline ?>

 

\startsetups xml:directive:injector:newline
  \crlf
\stopsetups

 

Also, I have learned that you can just use arbitrary ConTeXt code in XML:

 

\def\xmltexdirective#1#2{\doif{#1}{command}{#2}}

 

\xmlinstalldirective{tex}{xmltexdirective}

 

<?context-directive tex command \inframed{xxx} ?>
<?context-directive tex command \page ?>
<?context-directive tex command \crlf ?>

 

Best,

Denis

 

From: Maier, Denis Christian (UB)
Sent: Monday, 2 May 2022 09:45
To: 'mailing list for ConTeXt users' <ntg-context@ntg.nl>
Cc: Duncan Hothersall <dh@capdm.com>
Subject: Re: [NTG-context] XML processing instructions

 

 

From: ntg-context <ntg-context-bounces@ntg.nl> On Behalf Of Duncan Hothersall via ntg-context
Sent: Monday, 2 May 2022 09:20
To: mailing list for ConTeXt users <ntg-context@ntg.nl>
Cc: Duncan Hothersall <dh@capdm.com>
Subject: [NTG-context] XML processing instructions

 

I have a big set of existing XML books (held in a derivative of DocBook) which I'm looking to start processing directly with ConTeXt. (Up to now I have a system which converts the XML into ConTeXt code which is then processed, but this is inefficient and lots of the code is now unsupported.)

 

I've had some success producing output, but my first real sticking point has come with processing instructions. The existing XML contains lots of processing instructions of the form <?capdm whatever?>, some of which can be conditional and introduce new data etc. But I'd be happy at this stage if I could just process the most basic one of them, which is used to introduce a line stop in a running paragraph of text.

 

My best guess at how to do this was to use the lxml.preprocessor function to convert the processing instruction into an element, and then process the element as normal. But (a) my attempt didn't work, and (b) there may well be a better way.

 

Minimal working example below, except that obviously the processing instruction bit doesn't work!

 

Thanks for any help or insights.

 

Duncan

 

 

MWE:

------

 

\startbuffer[demo]
<book>
  <para>A paragraph with<?capdm force_line_stop?>a processing instruction.</para>
</book>
\stopbuffer

\startxmlsetups xml:demo:base
 \xmlsetsetup{#1}{*}{xml:demo:*}
\stopxmlsetups
\xmlregisterdocumentsetup{demo}{xml:demo:base}

\startxmlsetups xml:demo:book
 \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:demo:para
 \xmlflush{#1}\endgraf
\stopxmlsetups

\startluacode
 function lxml.preprocessor(data,settings)
  return string.find(data,"<?capdm *force_line_stop?>")
   and string.gsub(data,"<?capdm *force_line_stop?>","<capdmlinestop></capdmlinestop/>")
   or data
 end
\stopluacode

\startxmlsetups xml:demo:capdmlinestop
 \crlf
 \xmlflush{#1}
\stopxmlsetups

\setupbodyfont[modern]
\starttext
\xmlprocessbuffer{demo}{demo}{}
\stoptext

 

------



--
Duncan Hothersall, Operations Director
CAPDM Limited - Online Program Enablers
0131 677 2400  www.capdm.com
Registered in Scotland: SC168970       VAT: 682 846 983
Registered address: 20 Forth Street Edinburgh EH1 3LH UK

Capture, author, publish, deliver and manage your learning materials.