* Unexpected space after hyphen in xml/html export
@ 2018-10-06 22:19 Rik Kabel
  2018-10-06 23:28 ` Hans Hagen
  0 siblings, 1 reply; 7+ messages in thread
From: Rik Kabel @ 2018-10-06 22:19 UTC
  To: mailing list for ConTeXt users

List,

Occasionally an unexpected and unwanted space is inserted following the
hyphen of a compound word in html/xml exports. In a document with about
500 such compounds, this occurs 30 times.

The following input:

    \setupbackend [export=yes,xhtml=yes]
    \starttext
    Theocracy, the priest power; monarchy, the one|-|man power; and
    oligarchy, the few|-|men power|—|are three forms of vicarious
    government over the people, perhaps for them, not by them. Democracy is
    direct self|-|government over all the people, for all the people, by
    all the people. Our institutions are democratic: theocratic, monarchic,
    oligarchic vicariousness is all gone. We have no Divine vicar who is
    responsible to God for our politics and religion; only a human attorney,
    answerable to the people for his official work. The axis of rotation has
    changed: the equator of the old civilization passes through the poles
    of the new. This makes some change in the geography of both Church and
    State.
    \stoptext

produces, in relevant part, the following xml (wrapped for convenience):

    Theocracy, the priest power; monarchy, the one-man power; and oligarchy,
    the few- men power—are three forms of vicarious government over
    the people, perhaps for them, not by them. Democracy is direct
    self-government over all the people, for all the people, by all the
    people. Our institutions are democratic: theocratic, monarchic,
    oligarchic vicariousness is all gone. We have no Divine vicar who is
    responsible to God for our politics and religion; only a human attorney,
    answerable to the people for his official work. The axis of rotation has
    changed: the equator of the old civilization passes through the poles
    of the new. This makes some change in the geography of both Church and
    State.</document>

Note the space after "few-" in the second line of the output text.

(The paragraph is a quotation from Theodore Parker's sermon "The Effect
of Slavery on the American People," delivered on July 4, 1858. It is
thought by many to be the inspiration for part of Lincoln's Gettysburg
Address.)

--
Rik

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki     : http://contextgarden.net
___________________________________________________________________________________

* Re: Unexpected space after hyphen in xml/html export
  2018-10-06 22:19 Unexpected space after hyphen in xml/html export Rik Kabel
@ 2018-10-06 23:28 ` Hans Hagen
  2018-10-08 20:24   ` Rik Kabel
  0 siblings, 1 reply; 7+ messages in thread
From: Hans Hagen @ 2018-10-06 23:28 UTC
  To: mailing list for ConTeXt users, Rik Kabel

On 10/7/2018 12:19 AM, Rik Kabel wrote:
> Occasionally an unexpected and unwanted space is inserted following the
> hyphen of a compound word in html/xml exports. In a document with about
> 500 such compounds, this occurs 30 times.
>
> [...]
>
> (The paragraph is a quotation from Theodore Parker's sermon "The Effect
> of Slavery on the American People," delivered on July 4, 1858. It is
> thought by many to be the inspiration for part of Lincoln's Gettysburg
> Address.)

But that is not what happened: quite a few folks in power have medieval
monarchic characteristics, oligarchies are still around, etc. Old
institutions (that probably root deeply in mankind) are just better at
pretending to be different.

Anyway, fixed in the next beta (but you need to keep an eye on disc side
effects).

Hans

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
       tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: Unexpected space after hyphen in xml/html export
  2018-10-06 23:28 ` Hans Hagen
@ 2018-10-08 20:24   ` Rik Kabel
  2018-10-08 22:32     ` Hans Hagen
  0 siblings, 1 reply; 7+ messages in thread
From: Rik Kabel @ 2018-10-08 20:24 UTC
  To: Hans Hagen, mailing list for ConTeXt users

On 10/6/2018 19:28, Hans Hagen wrote:
> [...]
>
> But that is not what happened: quite a few folks in power have medieval
> monarchic characteristics, oligarchies are still around, etc. Old
> institutions (that probably root deeply in mankind) are just better at
> pretending to be different.
>
> Anyway, fixed in the next beta (but you need to keep an eye on disc side
> effects).
>
> Hans

Alas, it is fixed for that particular occurrence, but it still occurs 29
times in the document (using today's beta).

A more extended search shows that there are also spaces after en-dashes
(in "Press|–|Citizen" and in "Miniatur|–|Bibliothek der Deutschen
Classiker"), but none after em-dashes. Unfortunately, my attempts to
reproduce this in a smaller document have so far failed.

Perhaps this quote, in which the problem also occurs, is in line with
your other comments:

    There is only one party in the United States, the Property
    Party\nbsp \dots{} and it has two right wings: Republican and
    Democrat. Republicans are a bit stupider, more rigid, more
    doctrinaire in their laissez|-|faire capitalism than the Democrats,
    who are cuter, prettier, a bit more corrupt—until recently\nbsp
    \dots{} and more willing than the Republicans to make small
    adjustments when the poor, the black, the anti|-|imperialists get
    out of hand. But, essentially, there is no difference between the
    two parties.

(That is from Gore Vidal in 1975. Plus ça change.) In it, I get a space
after "anti-".

But more like this and folks will complain about politics on the list.
Or worse, encourage it.

--
Rik

* Re: Unexpected space after hyphen in xml/html export
  2018-10-08 20:24   ` Rik Kabel
@ 2018-10-08 22:32     ` Hans Hagen
  2018-10-10 18:50       ` Rik Kabel
  0 siblings, 1 reply; 7+ messages in thread
From: Hans Hagen @ 2018-10-08 22:32 UTC
  To: Rik Kabel, mailing list for ConTeXt users

> Alas, it is fixed for that particular occurrence, but it still occurs 29
> times in the document (using today's beta).
>
> A more extended search shows that there are also spaces after en-dashes
> (in "Press|–|Citizen" and in "Miniatur|–|Bibliothek der Deutschen
> Classiker"), but none after em-dashes. Unfortunately, my attempts to
> reproduce this in a smaller document have so far failed.

Well, you know: no MWE, no solution.

Hans

* Re: Unexpected space after hyphen in xml/html export
  2018-10-08 22:32     ` Hans Hagen
@ 2018-10-10 18:50       ` Rik Kabel
  2018-10-10 19:11         ` Rik Kabel
  0 siblings, 1 reply; 7+ messages in thread
From: Rik Kabel @ 2018-10-10 18:50 UTC
  To: Hans Hagen, mailing list for ConTeXt users

On 10/8/2018 18:32, Hans Hagen wrote:
> well, you know: no mwe, no solution

And here is the mwe. The culprit, it appears, is bidi. I have tried all
documented options (but not all combinations) for \setupdirections, and
the only one under which there is no problem is "off". With bidi active,
there is a spurious space wherever a linebreak is introduced. As the
example demonstrates, this is not a function of the compounds, but of
hyphenation in general.

    \setupbackend [export=yes]
    \setupdirections [bidi=on]
    \starttext
    abraca% adjust to cause hyphenation with your textwidth
    abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
    abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
    abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
    abra-cadabra abra-cadabra abra-cadabra abra-cadabra
    abra-cadabra abra-cadabra abra-cadabra abra-cadabra
    abra-cadabra abra-cadabra abra-cadabra abra-cadabra
    \stoptext

(The problem appears in the export html/xml file, not in the pdf.)

--
Rik
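
A sketch for comparing the two settings from a single source file. The mode name `bidi` is chosen here purely for illustration and is not part of the original report. Run the file once as `context --mode=bidi <file>` and once without the option, save the export from each run, and compare the text around the compounds.

```tex
\setupbackend [export=yes]

% Test run: context --mode=bidi <file>
\startmode[bidi]
  \setupdirections [bidi=on]
\stopmode

% Control run: context <file>
\startnotmode[bidi]
  \setupdirections [bidi=off]
\stopnotmode

\starttext
abraca% adjust to cause hyphenation with your textwidth
abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
abra|-|cadabra abra|-|cadabra abra|-|cadabra abra|-|cadabra
abra-cadabra abra-cadabra abra-cadabra abra-cadabra
abra-cadabra abra-cadabra abra-cadabra abra-cadabra
abra-cadabra abra-cadabra abra-cadabra abra-cadabra
\stoptext
```

Only the \setupdirections setting differs between the two runs, so a difference in the exported text around the hyphens would point at that setting.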

* Re: Unexpected space after hyphen in xml/html export
  2018-10-10 18:50       ` Rik Kabel
@ 2018-10-10 19:11         ` Rik Kabel
  2018-10-11 21:01           ` Rik Kabel
  0 siblings, 1 reply; 7+ messages in thread
From: Rik Kabel @ 2018-10-10 19:11 UTC
  To: Hans Hagen, mailing list for ConTeXt users

On 10/10/2018 14:50, Rik Kabel wrote:
> With bidi active, there is a spurious space wherever a linebreak is
> introduced. As the example demonstrates, this is not a function of the
> compounds, but of hyphenation in general.
>
> [...]
>
> (The problem appears in the export html/xml file, not in the pdf.)

Correction: it is not a function of explicit compounds (||) but of
hyphenation of compounds. Normal hyphenation does not bring about the
problem.

--
Rik

* Re: Unexpected space after hyphen in xml/html export
  2018-10-10 19:11         ` Rik Kabel
@ 2018-10-11 21:01           ` Rik Kabel
  0 siblings, 0 replies; 7+ messages in thread
From: Rik Kabel @ 2018-10-11 21:01 UTC
  To: ntg-context, Hans Hagen

On 10/10/2018 15:11, Rik Kabel wrote:
> Not a function of explicit compounds (||) but of hyphenation of
> compounds. Normal hyphenation does not bring about the problem.

I also note that \setupdirections, with every option combination I have
tried, has no discernible effect on my export output and can safely be
removed from the export mode of my document, so for me this issue
disappears. I do not know if this is the general case.

--
Rik
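
The workaround described above could be sketched as follows, assuming the export is produced in a run started with `context --mode=export <file>`. The mode name `export` is an assumption; use whatever mode the document already defines. Whether dropping \setupdirections is safe for documents that contain right-to-left text is not established in this thread.

```tex
% Export run: context --mode=export <file>
% The export backend is enabled and \setupdirections is left out.
\startmode[export]
  \setupbackend [export=yes]
\stopmode

% All other runs keep the bidi setup.
\startnotmode[export]
  \setupdirections [bidi=on]
\stopnotmode

\starttext
oligarchy, the few|-|men power
\stoptext
```

Keeping both setups in one file means the pdf and export runs share the same source and differ only in the selected mode.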

end of thread, other threads:[~2018-10-11 21:01 UTC | newest]

Thread overview: 7+ messages
2018-10-06 22:19 Unexpected space after hyphen in xml/html export Rik Kabel
2018-10-06 23:28 ` Hans Hagen
2018-10-08 20:24   ` Rik Kabel
2018-10-08 22:32     ` Hans Hagen
2018-10-10 18:50       ` Rik Kabel
2018-10-10 19:11         ` Rik Kabel
2018-10-11 21:01           ` Rik Kabel