From: Duncan Hothersall via ntg-context
Newsgroups: gmane.comp.tex.context
Subject: Re: XML processing instructions
Date: Tue, 3 May 2022 08:30:33 +0100

Just following up on my own question in case anybody else is interested: I realised that the logic of my preprocessor code wasn't the problem; it was just my ignorance of Lua and character escaping. Both ? and * are special characters in string.gsub patterns, so they need to be escaped with the Lua escape character, which is % (which sends my editor haywire because it thinks I am putting in a TeX comment, but never mind!).
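
To see the escaping issue in isolation, here is a minimal plain-Lua sketch that runs outside ConTeXt. The sample strings are made up for illustration, and the sketch assumes the processing instruction itself contains no literal asterisk.

```lua
-- Hypothetical sample input, for illustration only.
local sample = "with<?capdm force_line_stop?>a"

-- Unescaped: "<?" means "an optional <" and "p?" means "an optional p",
-- so the literal question marks in the input are never matched.
local _, n_plain = string.gsub(sample, "<?capdm force_line_stop?>", "X")

-- Escaped with %: "%?" matches a literal question mark.
local out, n_escaped = string.gsub(sample, "<%?capdm force_line_stop%?>", "X")

-- "%*" matches a literal asterisk; an unescaped "*" after a space
-- would instead mean "zero or more spaces".
local _, n_star = string.gsub("a * b", " %* ", "-")

print(n_plain, n_escaped, out, n_star)
```

Only the escaped pattern reports a replacement. Note that `%*` matches a literal asterisk, so a pattern written with `%*` only matches input that actually contains `*` at that position; where optional whitespace is meant, `%s*` is the usual spelling.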

So my now working code to translate historic processing instructions into directives without a separate pass is:

\startluacode
function lxml.preprocessor(data)
 data = string.gsub(data, '<%?capdm %*force_line_stop%?>', '<?context-directive injector newline ?>')
 return data
end
\stopluacode

\startsetups xml:directive:injector:newline
 \crlf
\stopsetups
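
If more than one legacy instruction needs translating, the same idea might be extended with a lookup table. This is only a sketch under assumptions: the `injectors` table and its `force_page_stop` entry are hypothetical, each injector name would still need its own `xml:directive:injector:...` setup, and the pattern assumes instructions of the form `<?capdm name?>` without a literal asterisk.

```lua
-- Use ConTeXt's lxml table when present; fall back to an empty table so the
-- sketch also runs in a plain Lua interpreter.
local lxml = lxml or {}

-- Hypothetical mapping from legacy instruction names to injector names.
local injectors = {
    force_line_stop = "newline",
    force_page_stop = "page", -- made-up example entry
}

function lxml.preprocessor(data)
    -- Capture the instruction name; %s* tolerates optional whitespace.
    -- The outer parentheses drop gsub's second result (the match count).
    return (string.gsub(data, "<%?capdm%s+([%w_]+)%s*%?>", function(name)
        local injector = injectors[name]
        if injector then
            return "<?context-directive injector " .. injector .. " ?>"
        end
        -- Returning nothing leaves unknown instructions untouched.
    end))
end
```

Instructions that are not in the table pass through unchanged, so they can be migrated one at a time.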

Duncan

On Mon, 2 May 2022 at 09:24, Duncan Hothersall <dh@capdm.com> wrote:
Many thanks Denis. Very useful tip on the injectors and generalised command usage. That will definitely come in useful.

My problem is that I already have a lot of XML data with the existing processing instructions in it. I know I could use an external preprocessing step in something like Python to change them into injectors before they are fed into ConTeXt, but I was hoping there was a way to handle them directly as part of the ConTeXt / Lua process.

Thanks again though.

Duncan




On Mon, 2 May 2022, 08:48, <denis.maier@unibe.ch> wrote:

That was too quick, sorry.

Hi Duncan,

I have used ConTeXt's own injectors for this:

<?context-directive injector addlinetopage ?>

\startsetups xml:directive:injector:addlinetopage
  \adaptlayout[lines=+1]
\stopsetups

Or, for your line break example:

<?context-directive injector newline ?>

\startsetups xml:directive:injector:newline
  \crlf
\stopsetups

Also, I have learned that you can just use arbitrary ConTeXt code in XML:

\def\xmltexdirective#1#2{\doif{#1}{command}{#2}}

\xmlinstalldirective{tex}{xmltexdirective}

        <?context-directive tex command \inframed{xxx} ?>

        <?context-directive tex command \page ?>

        <?context-directive tex command \crlf ?>

Best,

Denis

From: Maier, Denis Christian (UB)
Sent: Monday, 2 May 2022 09:45
To: 'mailing list for ConTeXt users' <ntg-context@ntg.nl>
Cc: Duncan Hothersall <dh@capdm.com>
Subject: Re: [NTG-context] XML processing instructions

From: ntg-context <ntg-context-bounces@ntg.nl> On behalf of Duncan Hothersall via ntg-context
Sent: Monday, 2 May 2022 09:20
To: mailing list for ConTeXt users <ntg-context@ntg.nl>
Cc: Duncan Hothersall <dh@capdm.com>
Subject: [NTG-context] XML processing instructions

I have a big set of existing XML books (held in a derivative of DocBook) which I'm looking to start processing directly with ConTeXt. (Up to now I have a system which converts the XML into ConTeXt code which is then processed, but this is inefficient and lots of the code is now unsupported.)

I've had some success producing output, but my first real sticking point has come with processing instructions. The existing XML contains lots of processing instructions of the form <?capdm whatever?>, some of which can be conditional and introduce new data etc. But I'd be happy at this stage if I could just process the most basic one of them, which is used to introduce a line stop in a running paragraph of text.

My best guess at how to do this was to use the lxml.preprocessor function to convert the processing instruction into an element, and then process the element as normal. But (a) my attempt didn't work, and (b) there may well be a better way.

Minimal working example below, except that obviously the processing instruction bit doesn't work!

Thanks for any help or insights.

Duncan

MWE:

------

\startbuffer[demo]
<book>
  <para>A paragraph with<?capdm force_line_stop?>a processing instruction.</para>
</book>
\stopbuffer

\startxmlsetups xml:demo:base
 \xmlsetsetup{#1}{*}{xml:demo:*}
\stopxmlsetups
\xmlregisterdocumentsetup{demo}{xml:demo:base}

\startxmlsetups xml:demo:book
 \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:demo:para
 \xmlflush{#1}\endgraf
\stopxmlsetups

\startluacode
 function lxml.preprocessor(data,settings)
  return string.find(data,"<?capdm *force_line_stop?>")
   and string.gsub(data,"<?capdm *force_line_stop?>","<capdmlinestop></capdmlinestop/>")
   or data
 end
\stopluacode

\startxmlsetups xml:demo:capdmlinestop
 \crlf
 \xmlflush{#1}
\stopxmlsetups

\setupbodyfont[modern]
\starttext
\xmlprocessbuffer{demo}{demo}{}
\stoptext

------
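
For readers who prefer the element-based route attempted in the MWE above, the substitution might look like the following plain-Lua sketch. It assumes the instruction contains no literal asterisk and that a setup such as xml:demo:capdmlinestop handles the resulting element; the sample string is hypothetical, and the sketch has not been run inside ConTeXt.

```lua
-- Hypothetical sample input, for illustration only.
local sample = "<para>A paragraph with<?capdm force_line_stop?>a processing instruction.</para>"

-- Escape the question marks with % and emit a self-closing, well-formed element.
-- Assigning to a single local keeps only the resulting string, not the match count.
local converted = string.gsub(sample, "<%?capdm%s+force_line_stop%s*%?>", "<capdmlinestop/>")

print(converted)
```

The replacement is written as a self-closing tag so that the XML stays well-formed after preprocessing.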



--
Duncan Hothersall, Operations Director
CAPDM Limited - Online Program Enablers
0131 677 2400  www.capdm.com
Registered in Scotland: SC168970        VAT: 682 846 983
Registered address: 20 Forth Street Edinburgh EH1 3LH UK

Capture, author, publish, deliver and manage your learning materials.