From: Duncan Hothersall via ntg-context
To: mailing list for ConTeXt users
Subject: Re: XML processing instructions
Date: Mon, 2 May 2022 08:33:01 +0100

Apologies, there are two rogue * in the lxml.preprocessor code, but even when they are removed it doesn't work.

On Mon, 2 May 2022 at 08:19, Duncan Hothersall <dh@capdm.com> wrote:
> I have a big set of existing XML books (held in a derivative of DocBook)
> which I'm looking to start processing directly with ConTeXt. (Up to now I
> have had a system which converts the XML into ConTeXt code which is then
> processed, but this is inefficient and lots of the code is now unsupported.)
>
> I've had some success producing output, but my first real sticking point
> has come with processing instructions. The existing XML contains lots of
> processing instructions of the form <?capdm whatever?>, some of which can
> be conditional and introduce new data etc. But I'd be happy at this stage
> if I could just process the most basic one of them, which is used to
> introduce a line stop in a running paragraph of text.
>
> My best guess at how to do this was to use the lxml.preprocessor function
> to convert the processing instruction into an element, and then process the
> element as normal. But (a) my attempt didn't work, and (b) there may well
> be a better way.
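
A rough sketch of that conversion step in plain Lua, for processing instructions that carry nothing but a name. The helper name capdm_pi_to_element and the capdm-<name> element naming are invented here for illustration; they are not part of ConTeXt or of the existing setup, and instructions that carry extra data would need more than this. Note the %? in the pattern: a bare ? is a quantifier in Lua patterns, so a literal question mark has to be escaped.

```lua
-- Hypothetical helper (not a ConTeXt function): rewrite every
-- <?capdm name?> into an empty element <capdm-name/>.
function capdm_pi_to_element(data)
    -- %? matches a literal question mark; ([%w_]+) captures the name.
    -- The outer parentheses drop gsub's second return value (the count).
    return (string.gsub(data, "<%?capdm%s+([%w_]+)%s*%?>", "<capdm-%1/>"))
end

print(capdm_pi_to_element("with<?capdm force_line_stop?>a"))
-- prints: with<capdm-force_line_stop/>a
```

Input without any matching instruction comes back unchanged, so no separate string.find check is needed in this form.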

> Minimal working example below, except that obviously the processing
> instruction bit doesn't work!
>
> Thanks for any help or insights.
>
> Duncan
>
> MWE:
> ------
>
> \startbuffer[demo]
> <book>
>   <para>A paragraph with<?capdm force_line_stop?>a processing instruction.</para>
> </book>
> \stopbuffer
>
> \startxmlsetups xml:demo:base
>  \xmlsetsetup{#1}{*}{xml:demo:*}
> \stopxmlsetups
> \xmlregisterdocumentsetup{demo}{xml:demo:base}
>
> \startxmlsetups xml:demo:book
>  \xmlflush{#1}
> \stopxmlsetups
>
> \startxmlsetups xml:demo:para
>  \xmlflush{#1}\endgraf
> \stopxmlsetups
>
> \startluacode
>  function lxml.preprocessor(data,settings)
>   return string.find(data,"<?capdm *force_line_stop?>")
>    and string.gsub(data,"<?capdm *force_line_stop?>","<capdmlinestop></capdmlinestop/>")
>    or data
>  end
> \stopluacode
>
> \startxmlsetups xml:demo:capdmlinestop
>  \crlf
>  \xmlflush{#1}
> \stopxmlsetups
>
> \setupbodyfont[modern]
> \starttext
> \xmlprocessbuffer{demo}{demo}{}
> \stoptext
>
> ------
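
For what it is worth, string.find and string.gsub read the search string as a Lua pattern by default, and in a pattern each unescaped ? makes the preceding character optional. The search string in the example therefore cannot match the literal <?capdm force_line_stop?>, with or without the *. The replacement <capdmlinestop></capdmlinestop/> is also not well-formed XML. Below is a minimal sketch with both points addressed, written as plain Lua so it can be checked outside ConTeXt. The "lxml = lxml or { }" line is only a stand-in for that purpose, and whether lxml.preprocessor is the right hook for this job is an assumption carried over from the example, not something confirmed here.

```lua
-- Stand-in table so the snippet also runs in a standalone Lua
-- interpreter (assumption; inside ConTeXt the lxml table already exists).
lxml = lxml or { }

-- %? matches a literal question mark, %s+ / %s* allow for whitespace.
local pattern = "<%?capdm%s+force_line_stop%s*%?>"

function lxml.preprocessor(data, settings)
    -- gsub returns the new string plus a match count; the parentheses
    -- keep only the string. Unmatched input is returned unchanged.
    return (string.gsub(data, pattern, "<capdmlinestop/>"))
end

local sample = "<para>A paragraph with<?capdm force_line_stop?>a processing instruction.</para>"

-- The unescaped search string from the example finds nothing:
print(string.find(sample, "<?capdm force_line_stop?>"))
-- prints: nil

print(lxml.preprocessor(sample))
-- prints: <para>A paragraph with<capdmlinestop/>a processing instruction.</para>
```

If the preprocessor does run on the buffer, the resulting capdmlinestop element should then be picked up by the xml:demo:capdmlinestop setup through the xml:demo:* mapping.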

--
Duncan Hothersall, Operations Director
CAPDM Limited - Online Program Enablers
0131 677 2400  www.capdm.com
Registered in Scotland: SC168970    VAT: 682 846 983
Registered address: 20 Forth Street Edinburgh EH1 3LH UK

Capture, author, publish, deliver and manage your learning materials.