From: Rik Kabel
Newsgroups: gmane.comp.tex.context
Subject: Re: Unexpected space after hyphen in xml/html export
Date: Wed, 10 Oct 2018 15:11:34 -0400
Message-ID: <606c6f51-3fa7-157d-4ebd-b151472fc4fa@rik.users.panix.com>
To: Hans Hagen, mailing list for ConTeXt users

On 10/10/2018 14:50, Rik Kabel wrote:
> On 10/8/2018 18:32, Hans Hagen wrote:
>>> Alas, it is fixed for that particular occurrence, but it still occurs 29 times in the document (using today's beta).
>>>
>>> A more extended search shows that there are also spaces after en-dashes (in "Press|–|Citizen" and in "Miniatur|–|Bibliothek der Deutschen Classiker"), but none after em-dashes. Unfortunately, my attempts to reproduce this in a smaller document have so far failed.
>>
>> well, you know: no mwe, no solution
>
> And here is the mwe. The culprit, it appears, is bidi. I have tried all documented options (but not all combinations) for \setupdirections, and the only one under which there is no problem is "off". With bidi active, there is a spurious space wherever a linebreak is introduced. As the example demonstrates, this is not a function of the compounds, but of hyphenation in general.
>
> \setupbackend     [export=yes]
> \setupdirections  [bidi=on]
> \starttext
> abraca% adjust to cause hyphenation with your textwidth
> abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
> abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
> abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
> abra-cadabra abra-cadabra abra-cadabra abra-cadabra
> abra-cadabra abra-cadabra abra-cadabra abra-cadabra
> abra-cadabra abra-cadabra abra-cadabra abra-cadabra
> \stoptext
>
> (The problem appears in the export html/xml file, not in the pdf.)

Correction to the above: this is not a function of explicit compounds (||) but of hyphenation of compounds, however they are entered. Normal hyphenation does not bring about the problem.
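
One way to check this is a pair of runs that differ only in the bidi setting. The following is a sketch, not a tested file: the narrow \setuplayout width is an assumption added here to make line breaks at the compound hyphens more likely, and its value may need adjusting. The run of ordinary words is included only as a control for normal hyphenation.

```tex
% Sketch, assuming a narrow text width forces breaks at the hyphens.
% \setuplayout[width=...] is not part of the original MWE.
\setupbackend     [export=yes]
\setupdirections  [bidi=on]    % second run: bidi=off, then compare the exports
\setuplayout      [width=6cm]  % assumed value, adjust as needed
\starttext
% compounds, entered explicitly and with a plain hyphen
abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
abra-cadabra abra-cadabra abra-cadabra abra-cadabra
\par
% ordinary words, to compare against normal hyphenation
\dorecurse{8}{abracadabra }
\stoptext
```

Comparing the export files of the two runs should show whether the extra space follows the bidi setting and whether it is limited to the compounds.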

--
Rik
