From: Duncan Hothersall via ntg-context
Newsgroups: gmane.comp.tex.context
Subject: XML processing instructions
Date: Mon, 2 May 2022 08:19:33 +0100
To: mailing list for ConTeXt users

I have a big set of existing XML books (held in a derivative of DocBook) which I'm looking to start processing directly with ConTeXt. (Up to now I have had a system which converts the XML into ConTeXt code which is then processed, but this is inefficient and lots of the code is now unsupported.)

I've had some success producing output, but my first real sticking point has come with processing instructions. The existing XML contains lots of processing instructions of the form <?capdm whatever?>, some of which can be conditional and introduce new data etc. But I'd be happy at this stage if I could just process the most basic one of them, which is used to introduce a line stop in a running paragraph of text.

My best guess at how to do this was to use the lxml.preprocessor function to convert the processing instruction into an element, and then process the element as normal. But (a) my attempt didn't work, and (b) there may well be a better way.

Minimal working example below, except that obviously the processing instruction bit doesn't work!

Thanks for any help or insights.

Duncan


MWE:
------

\startbuffer[demo]
<book>
  <para>A paragraph with<?capdm force_line_stop?>a processing instruction.</para>
</book>
\stopbuffer

\startxmlsetups xml:demo:base
  \xmlsetsetup{#1}{*}{xml:demo:*}
\stopxmlsetups
\xmlregisterdocumentsetup{demo}{xml:demo:base}

\startxmlsetups xml:demo:book
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:demo:para
  \xmlflush{#1}\endgraf
\stopxmlsetups

\startluacode
  function lxml.preprocessor(data,settings)
    return string.find(data,"<?capdm *force_line_stop?>")
      and string.gsub(data,"<?capdm *force_line_stop?>","<capdmlinestop></capdmlinestop/>")
      or data
  end
\stopluacode

\startxmlsetups xml:demo:capdmlinestop
  \crlf
  \xmlflush{#1}
\stopxmlsetups

\setupbodyfont[modern]
\starttext
\xmlprocessbuffer{demo}{demo}{}
\stoptext

------
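
A note on the string rewrite in the \startluacode block above, as a plain Lua sketch that runs outside ConTeXt. In Lua patterns `?` is a quantifier (zero or one of the preceding character), so a pattern containing a bare `<?` or `?>` does not match the literal processing instruction; the question marks need to be written as `%?`. The helper name pi_to_element is hypothetical, the self-closing <capdmlinestop/> replacement is an assumption about the intended element, and none of this says whether lxml.preprocessor is the right hook for buffers.

```lua
-- Standalone Lua sketch (no ConTeXt needed). The function name
-- pi_to_element is hypothetical; the element name follows the MWE.

-- "%?" matches a literal question mark; "%s+" requires the whitespace
-- that separates the PI target from its content.
local pattern = "<%?capdm%s+force_line_stop%s*%?>"

local function pi_to_element(data)
    -- gsub returns the new string plus a replacement count;
    -- the outer parentheses keep only the string.
    return (string.gsub(data, pattern, "<capdmlinestop/>"))
end

local sample = "<para>A paragraph with<?capdm force_line_stop?>a processing instruction.</para>"

-- The unescaped pattern from the MWE finds nothing in the sample.
print(string.find(sample, "<?capdm *force_line_stop?>"))

-- The escaped pattern rewrites the processing instruction into an element.
print(pi_to_element(sample))
```

If the hook is called at all, the body of pi_to_element could stand in for the find/gsub pair inside lxml.preprocessor(data,settings), since gsub hands back the string unchanged when nothing matches.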