From: Jano Kula
Newsgroups: gmane.comp.tex.context
Subject: nbsp in XML (S01E01)
Date: Wed, 21 Apr 2021 20:17:53 +0200
To: mailing list for ConTeXt users

Dear list,

This is the first episode of a series on nbsp in XML in LMTX.
Unfortunately, it is not as catchy as Netflix.

The XML input uses two kinds of non-breaking space:
  • the Unicode character (U+00A0)
  • the HTML entity, written as &amp;nbsp; (in fact the ugly output of an HTML editor)
The HTML entity is handled by the ConTeXt preprocessor (great feature!), which replaces it with either the Unicode nbsp character or a tilde.

The MWE shows that the Unicode nbsp spaces are non-breaking (see the ends of the first lines), but they are not stretchable (see the second line of each paragraph).

Does the Unicode nbsp have a fixed width in ConTeXt?

When a tilde is used as the replacement (uncomment the first replacement in the preprocessor), \xmlflush typesets a literal tilde (which, as a character, is non-breaking and unstretchable, no surprise).
Why is the tilde displayed?

Replacing or adding nbsp (tilde) with finalizers gives different results; see the next episode, once this one is understood.

Thank you,
Jano

MWE (better to use the attached file, so as not to lose the invisible characters):

\startbuffer[doc]
<?xml version="1.0"?>
<document>
        <p>Temperature 20 °C 20 °C 20 °C 20 °C average.</p>
        <p>Altitude 6000&amp;nbsp;m 6000&amp;nbsp;m 6000&amp;nbsp;m 6000&amp;nbsp;m average.</p>
</document>
\stopbuffer

\startluacode
function lxml.preprocessor(data)
    -- data = string.gsub(data, "&amp;nbsp;", "~")
    -- replacement nbsp invisible in luacode
    data = string.gsub(data, "&amp;nbsp;", " ")
    return data
end
\stopluacode

\startxmlsetups xml:name
    \xmlsetsetup{\xmldocument}{*}{-}
    \xmlsetsetup{\xmldocument}{document|p}{xml:name:*}
\stopxmlsetups
\xmlregistersetup{xml:name}

\startxmlsetups xml:name:document
\xmlflush{#1}\par
\stopxmlsetups

\startxmlsetups xml:name:p
\parfillskip0pt\xmlflush{#1}\par
\stopxmlsetups

\startTEXpage[offset=5mm,width=60mm]
\xmlprocessbuffer{xml:name}{doc}{}
\stopTEXpage
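
The nbsp in the second replacement of the preprocessor above is invisible in the source and easily lost in transit. A minimal sketch of the same substitution with the character built from its code point, assuming the standard Lua 5.3+ utf8 library is available; replace_nbsp is a hypothetical helper name used only for illustration, not part of ConTeXt:

```lua
-- Hypothetical helper (not part of ConTeXt): same substitution as in the
-- MWE's preprocessor, but with U+00A0 built from its code point so the
-- character stays visible in the source.
local nbsp = utf8.char(0x00A0) -- assumes the Lua 5.3+ utf8 library

local function replace_nbsp(data)
    -- "&amp;nbsp;" contains no Lua pattern magic characters, so it matches
    -- literally. The extra parentheses drop gsub's second return value
    -- (the number of substitutions).
    return (string.gsub(data, "&amp;nbsp;", nbsp))
end

print(replace_nbsp("6000&amp;nbsp;m") == "6000" .. nbsp .. "m") -- prints true
```

In the MWE this would presumably be called from lxml.preprocessor, for example as return replace_nbsp(data), in place of the literal character.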

[Attachment: xml-and-space-preprocessor.tex]