* Unexpected space after hyphen in xml/html export
@ 2018-10-06 22:19 Rik Kabel
2018-10-06 23:28 ` Hans Hagen
0 siblings, 1 reply; 7+ messages in thread
From: Rik Kabel @ 2018-10-06 22:19 UTC (permalink / raw)
To: mailing list for ConTeXt users
List,
Occasionally an unexpected and unwanted space is inserted following the
hyphen of a compound word in html/xml exports. In a document with about
500 such compounds, this occurs 30 times.
The following input:
\setupbackend [export=yes,xhtml=yes]
\starttext
Theocracy, the priest power; monarchy, the one|-|man power; and
oligarchy, the few|-|men power|—|are three forms of vicarious
government over the people, perhaps for them, not by them. Democracy is
direct self|-|government over all the people, for all the people, by
all the people. Our institutions are democratic: theocratic, monarchic,
oligarchic vicariousness is all gone. We have no Divine vicar who is
responsible to God for our politics and religion; only a human attorney,
answerable to the people for his official work. The axis of rotation has
changed: the equator of the old civilization passes through the poles
of the new. This makes some change in the geography of both Church and
State.
\stoptext
Produces, in relevant part, the following xml (wrapped for convenience):
Theocracy, the priest power; monarchy, the one-man power; and oligarchy,
the few- men power—are three forms of vicarious government over
the people, perhaps for them, not by them. Democracy is direct
self-government over all the people, for all the people, by all the
people. Our institutions are democratic: theocratic, monarchic,
oligarchic vicariousness is all gone. We have no Divine vicar who is
responsible to God for our politics and religion; only a human attorney,
answerable to the people for his official work. The axis of rotation has
changed: the equator of the old civilization passes through the poles
of the new. This makes some change in the geography of both Church and
State.</document>
Note the space after "few-" in the second line of the output text.
(The paragraph is a quotation from Theodore Parker's sermon "The Effect
of Slavery on the American People," delivered on July 4, 1858. It is
thought by many to be the inspiration for part of Lincoln's Gettysburg
Address.)
--
Rik
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!
maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage : http://www.pragma-ade.nl / http://context.aanhet.net
archive : https://bitbucket.org/phg/context-mirror/commits/
wiki : http://contextgarden.net
___________________________________________________________________________________
* Re: Unexpected space after hyphen in xml/html export
2018-10-06 22:19 Unexpected space after hyphen in xml/html export Rik Kabel
@ 2018-10-06 23:28 ` Hans Hagen
2018-10-08 20:24 ` Rik Kabel
0 siblings, 1 reply; 7+ messages in thread
From: Hans Hagen @ 2018-10-06 23:28 UTC (permalink / raw)
To: mailing list for ConTeXt users, Rik Kabel
On 10/7/2018 12:19 AM, Rik Kabel wrote:
> Note the space after "few-" in the second line of the output text.
>
> (The paragraph is a quotation from Theodore Parker's sermon "The Effect
> of Slavery on the American People," delivered on July 4, 1858. It is
> thought by many to be the inspiration for part of Lincoln's Gettysburg
> Address.)
But that's not what happened: quite a few people in power have medieval
monarchic characteristics, oligarchies are still around, etc. Old institutions
(which probably root deeply in mankind) are just better at pretending to
be different.
Anyway, fixed in the next beta (but you need to keep an eye on disc side
effects).
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
* Re: Unexpected space after hyphen in xml/html export
2018-10-06 23:28 ` Hans Hagen
@ 2018-10-08 20:24 ` Rik Kabel
2018-10-08 22:32 ` Hans Hagen
0 siblings, 1 reply; 7+ messages in thread
From: Rik Kabel @ 2018-10-08 20:24 UTC (permalink / raw)
To: Hans Hagen, mailing list for ConTeXt users
On 10/6/2018 19:28, Hans Hagen wrote:
> But that's not what happened: quite a few people in power have medieval
> monarchic characteristics, oligarchies are still around, etc. Old
> institutions (which probably root deeply in mankind) are just better at
> pretending to be different.
>
> Anyway, fixed in the next beta (but you need to keep an eye on disc side
> effects).
>
> Hans
Alas, it is fixed for that particular occurrence, but it still occurs 29
times in the document (using today's beta).
A more extended search shows that there are also spaces after en-dashes
(in "Press|–|Citizen" and in "Miniatur|–|Bibliothek der Deutschen
Classiker"), but none after em-dashes. Unfortunately, my attempts to
reproduce this in a smaller document have so far failed.
Perhaps this quote, in which the problem also occurs, is in line with
your other comments:
There is only one party in the United States, the Property
Party\nbsp \dots{} and it has two right wings: Republican
and Democrat. Republicans are a bit stupider, more rigid,
more doctrinaire in their laissez|-|faire capitalism than
the Democrats, who are cuter, prettier, a bit more
corrupt—until recently\nbsp \dots{} and more willing than the
Republicans to make small adjustments when the poor, the black,
the anti|-|imperialists get out of hand. But, essentially, there
is no difference between the two parties.
(That is from Gore Vidal in 1975. Plus ça change.) In it, I get a space
after "anti-".
But more like this and folks will complain about politics on the list.
Or worse, encourage it.
--
Rik
* Re: Unexpected space after hyphen in xml/html export
2018-10-08 20:24 ` Rik Kabel
@ 2018-10-08 22:32 ` Hans Hagen
2018-10-10 18:50 ` Rik Kabel
0 siblings, 1 reply; 7+ messages in thread
From: Hans Hagen @ 2018-10-08 22:32 UTC (permalink / raw)
To: Rik Kabel, mailing list for ConTeXt users
> Alas, it is fixed for that particular occurrence, but it still occurs 29
> times in the document (using today's beta).
>
> A more extended search shows that there are also spaces after en-dashes
> (in "Press|–|Citizen" and in "Miniatur|–|Bibliothek der Deutschen
> Classiker"), but none after em-dashes. Unfortunately, my attempts to
> reproduce this in a smaller document have so far failed.
well, you know: no mwe, no solution
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
* Re: Unexpected space after hyphen in xml/html export
2018-10-08 22:32 ` Hans Hagen
@ 2018-10-10 18:50 ` Rik Kabel
2018-10-10 19:11 ` Rik Kabel
0 siblings, 1 reply; 7+ messages in thread
From: Rik Kabel @ 2018-10-10 18:50 UTC (permalink / raw)
To: Hans Hagen, mailing list for ConTeXt users
On 10/8/2018 18:32, Hans Hagen wrote:
> well, you know: no mwe, no solution
And here is the mwe. The culprit, it appears, is bidi. I have tried all
documented options (but not all combinations) for \setupdirections, and
the only one under which there is no problem is "off". With bidi active,
there is a spurious space wherever a linebreak is introduced. As the
example demonstrates, this is not a function of the compounds, but of
hyphenation in general.
\setupbackend [export=yes]
\setupdirections [bidi=on]
\starttext
abraca% adjust to cause hyphenation with your textwidth
abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
abra-cadabra abra-cadabra abra-cadabra abra-cadabra
abra-cadabra abra-cadabra abra-cadabra abra-cadabra
abra-cadabra abra-cadabra abra-cadabra abra-cadabra
\stoptext
(The problem appears in the export html/xml file, not in the pdf.)
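A variant of the example above that may make the line breaks easier to provoke: instead of adjusting a prefix by hand, it narrows the text area so that many breaks fall inside each paragraph. The 6cm width is an arbitrary assumption, not part of the original report, and whether a break actually lands on a compound hyphen still depends on the font and on TeX's line breaking.

```tex
\setupbackend    [export=yes]
\setupdirections [bidi=on]
% Assumption: a narrow text area, chosen only to force frequent line breaks.
\setuplayout     [width=6cm]
\starttext
% explicit compounds
abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra

% plain hyphens
abra-cadabra abra-cadabra abra-cadabra abra-cadabra
abra-cadabra abra-cadabra abra-cadabra abra-cadabra
\stoptext
```

For comparison, the same file can be run with \setupdirections[bidi=off]; going by the description above, the spurious spaces should then be absent from the export.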
--
Rik
* Re: Unexpected space after hyphen in xml/html export
2018-10-10 18:50 ` Rik Kabel
@ 2018-10-10 19:11 ` Rik Kabel
2018-10-11 21:01 ` Rik Kabel
0 siblings, 1 reply; 7+ messages in thread
From: Rik Kabel @ 2018-10-10 19:11 UTC (permalink / raw)
To: Hans Hagen, mailing list for ConTeXt users
On 10/10/2018 14:50, Rik Kabel wrote:
> And here is the mwe. The culprit, it appears, is bidi. I have tried
> all documented options (but not all combinations) for
> \setupdirections, and the only one under which there is no problem is
> "off". With bidi active, there is a spurious space wherever a
> linebreak is introduced. As the example demonstrates, this is not a
> function of the compounds, but of hyphenation in general.
Not a function of explicit compounds (||) but of hyphenation of
compounds. Normal hyphenation does not bring about the problem.
--
Rik
* Re: Unexpected space after hyphen in xml/html export
2018-10-10 19:11 ` Rik Kabel
@ 2018-10-11 21:01 ` Rik Kabel
0 siblings, 0 replies; 7+ messages in thread
From: Rik Kabel @ 2018-10-11 21:01 UTC (permalink / raw)
To: ntg-context, Hans Hagen
On 10/10/2018 15:11, Rik Kabel wrote:
> Not a function of explicit compounds (||) but of hyphenation of
> compounds. Normal hyphenation does not bring about the problem.
>
I also note that \setupdirections, with every option combination I have
tried, has no discernible effect on my export output and can safely be
removed from the export mode of my document, so for me this issue
disappears.
I do not know if this is the general case.
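The workaround described above might look like the following sketch. It assumes the document switches the export on through a mode; the mode name "export" is hypothetical, as is the split into two blocks, and bidi=on merely stands in for whatever direction setup the document really uses.

```tex
% Hypothetical mode name "export"; enable it with: context --mode=export myfile.tex
\startmode[export]
  \setupbackend[export=yes]    % export run: no \setupdirections at all
\stopmode

\startnotmode[export]
  \setupdirections[bidi=on]    % normal pdf run keeps the bidi handling
\stopnotmode

\starttext
oligarchy, the few|-|men power, and the anti|-|imperialists
\stoptext
```

This only helps where \setupdirections has no visible effect on the export; whether that holds for documents with right-to-left text is not established in this thread.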
--
Rik