ntg-context - mailing list for ConTeXt users
* xml content and tweaked pdf output
@ 2010-03-10  9:58 Steffen Wolfrum
  2010-03-10 10:25 ` Italic correction is missing Mehdi Omidali
  2010-03-10 10:38 ` xml content and tweaked pdf output Thomas A. Schmitz
  0 siblings, 2 replies; 9+ messages in thread
From: Steffen Wolfrum @ 2010-03-10  9:58 UTC
  To: mailing list for ConTeXt users; +Cc: Taco Hoekwater, Thomas A. Schmitz

Hi,

I am very carefully taking my first steps with XML and ConTeXt (MkIV).

So I enjoyed reading Thomas' MyWay "Getting Web Content and pdf-Output from One Source".

What I kept wondering about is how to keep control over the PDF output when it comes to fine-tuning the actual typesetting.
A quick search in the archive gave me the answer attached below: use XML entities.

But coming back to Thomas' topic, "Getting Web Content and pdf-Output from One Source":
What about the other branch, getting web content?
Doesn't the XML source get "spoiled" by these inserted XML entities, which only make sense for the PDF branch?
Or will these entities be silently ignored when the XML source is fed into a CMS or processed further into web content?

Apologies for asking such basic questions...


Any help or tips on dealing with this hybrid setup will be greatly appreciated.

Steffen




On 13.04.2008 at 11:54, Taco Hoekwater wrote:

> Thomas A. Schmitz wrote:
>> Hi gang,
>> 
>> speaking of xml... I have two easy questions, but can't find an  
>> answer. It's about tweaking the pdf-output I get:
>> 
>> 1. How do you add an additional hyphenation to a word? How would I  
>> enter the equivalent of super\-duper in an xml-file? I tried  
>> super&addhyphen;duper with this definition:
>> \defineXMLentity[addhyphen]{\-}
>> 
>> in my environment, but this doesn't seem to work.
> 
> Needs an example file, because
> 
>   \defineXMLentity[addhyphen]{\-}
>   \starttext
>   \hsize 1in
>   \startXMLdata
>   I tried super&addhyphen;duper
>   \stopXMLdata
>   \stoptext
> 
> works in both mkii and mkiv.
> 
> 
>> 2. Similar question: how to prevent an unwanted ligature, esp. in  
>> German? In TeX, I write Kauf{}laden. What would be a good way to do  
>> this in xml? I was thinking of Kauf&nolig;laden and
>> \defineXMLentity[nolig]{\kern0pt}
> 
> That should work, but this is probably better:
>   \defineXMLentity[nolig]{\prewordbreak\kern0pt\postwordbreak}
> because otherwise the \kern will disable hyphenation.
> 
> In mkii, just
>   \defineXMLentity[nolig]{{}}
> also works (but not in mkiv, and that is a feature).
> 
> Best wishes,
> Taco
> ___________________________________________________________________________________
> If your question is of interest to others as well, please add an entry to the Wiki!
> 
> maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
> webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
> archive  : https://foundry.supelec.fr/projects/contextrev/
> wiki     : http://contextgarden.net
> ___________________________________________________________________________________


* Italic correction is missing.
  2010-03-10  9:58 xml content and tweaked pdf output Steffen Wolfrum
@ 2010-03-10 10:25 ` Mehdi Omidali
  2010-03-10 10:42   ` Taco Hoekwater
  2010-03-10 10:38 ` xml content and tweaked pdf output Thomas A. Schmitz
  1 sibling, 1 reply; 9+ messages in thread
From: Mehdi Omidali @ 2010-03-10 10:25 UTC
  To: mailing list for ConTeXt users

Hi
If you run
\starttext
\startformula
V) V\exists F\exists
\stopformula
\stoptext
the space between V and ), for example, is not correct (luatex 0.50).
MO


* Re: xml content and tweaked pdf output
  2010-03-10  9:58 xml content and tweaked pdf output Steffen Wolfrum
  2010-03-10 10:25 ` Italic correction is missing Mehdi Omidali
@ 2010-03-10 10:38 ` Thomas A. Schmitz
  2010-03-10 10:48   ` luigi scarso
  2010-03-10 11:49   ` Steffen Wolfrum
  1 sibling, 2 replies; 9+ messages in thread
From: Thomas A. Schmitz @ 2010-03-10 10:38 UTC
  To: mailing list for ConTeXt users

On Mar 10, 2010, at 10:58 AM, Steffen Wolfrum wrote:

> Hi,
> 
> I am very carefully taking my first steps with XML and ConTeXt (MkIV).
> 
> So I enjoyed reading Thomas' MyWay "Getting Web Content and pdf-Output from One Source".
> 
> What I kept wondering about is how to keep control over the PDF output when it comes to fine-tuning the actual typesetting.
> A quick search in the archive gave me the answer attached below: use XML entities.
> 
> But coming back to Thomas' topic, "Getting Web Content and pdf-Output from One Source":
> What about the other branch, getting web content?
> Doesn't the XML source get "spoiled" by these inserted XML entities, which only make sense for the PDF branch?
> Or will these entities be silently ignored when the XML source is fed into a CMS or processed further into web content?
> 
> Apologies for asking such basic questions...
> 

I'm not really that advanced in this area myself, but from what I think I understood, you have to distinguish several aspects:

1. The MyWay addressed xhtml and how to map it to ConTeXt output. In html, you have a list of predefined entities (http://www.w3schools.com/tags/ref_entities.asp), and I don't think you can simply define your own entities in html; that is just not how it is meant to work. So in this case, the answer to your question would be: you're using the wrong tool.

2. In xml, on the other hand, there are almost no predefined entities; you can and must define entities yourself. But xml by itself cannot be shown as web content; you will need an xsl file that translates your xml into some sort of html. This lets you define almost anything you want, so you can indeed add all these typographical niceties. You can then either use a tool such as xsltproc or saxon to produce a "clean" html version yourself, or leave the transformation to the browser.

So: if you're primarily thinking of web content that should also be typeset, use html and be aware that you probably won't be able to use all the power of ConTeXt. If you're thinking of content that will be typeset but which you also want to use in other forms (web content being just one of them), use xml. In that case, you will have to learn at least some xslt as well...
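
A minimal sketch of what such an xsl file could look like, assuming a made-up source format: the element names (menu, item, name, description) and the file name menu.xsl are invented for illustration and do not come from the MyWay.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- menu.xsl: hypothetical stylesheet; all element names are assumptions -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Serialize the result as html rather than xml -->
  <xsl:output method="html" encoding="UTF-8"/>

  <!-- The root element becomes the html skeleton -->
  <xsl:template match="/menu">
    <html>
      <head><title>Menu</title></head>
      <body>
        <xsl:apply-templates select="item"/>
      </body>
    </html>
  </xsl:template>

  <!-- Each item becomes a heading plus a paragraph -->
  <xsl:template match="item">
    <h2><xsl:value-of select="name"/></h2>
    <p><xsl:value-of select="description"/></p>
  </xsl:template>

</xsl:stylesheet>
```

Assuming xsltproc is installed and the source is saved as menu.xml (also a hypothetical name), "xsltproc menu.xsl menu.xml > menu.html" would write the html version.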

Btw, the thread you quoted refers to mkii entities; you know that the definitions in mkiv are somewhat different, right?

Thomas

* Re: Italic correction is missing.
  2010-03-10 10:25 ` Italic correction is missing Mehdi Omidali
@ 2010-03-10 10:42   ` Taco Hoekwater
  0 siblings, 0 replies; 9+ messages in thread
From: Taco Hoekwater @ 2010-03-10 10:42 UTC
  To: mailing list for ConTeXt users

Mehdi Omidali wrote:
> Hi
> If you run
> \starttext
> \startformula
> V) V\exists F\exists
> \stopformula
> \stoptext
> the space between V and ), for example, is not correct (luatex 0.50).

There is no italic correction because luatex
sees a simple list of mathords, and it does not apply italic
corrections between those when a 'new math font' is being used.
(if it did, operators like "sin" would come out badly).
Instead, it relies on ordinary inter-glyph kerning.

I am not at all sure what the best solution for this is, as I have no
idea how MS Word differentiates between the formula above and
'multi-letter identifiers'. Perhaps luatex should look at the \catcode
of the characters in question? Or perhaps these inter-char kerns
actually exist in Cambria?

Best wishes,
Taco

* Re: xml content and tweaked pdf output
  2010-03-10 10:38 ` xml content and tweaked pdf output Thomas A. Schmitz
@ 2010-03-10 10:48   ` luigi scarso
  2010-03-10 11:49   ` Steffen Wolfrum
  1 sibling, 0 replies; 9+ messages in thread
From: luigi scarso @ 2010-03-10 10:48 UTC
  To: mailing list for ConTeXt users

On Wed, Mar 10, 2010 at 11:38 AM, Thomas A. Schmitz
<thomas.schmitz@uni-bonn.de> wrote:
> 2. In xml, on the other hand, there are almost no predefined entities; you can and must define entities yourself. But xml by itself cannot be shown as web content; you will need an xsl file that translates your xml into some sort of html. This lets you define almost anything you want, so you can indeed add all these typographical niceties. You can then either use a tool such as xsltproc or saxon to produce a "clean" html version yourself, or leave the transformation to the browser.

You can also use css to display an xml file:
http://www.w3schools.com/Xml/xml_display.asp

but xslt is the main way.
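
Both "leave it to the browser" variants are usually wired up with an xml-stylesheet processing instruction at the top of the xml file. A small sketch; the file names menu.css and menu.xsl and the element names are hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical file names; use one of the two processing instructions -->
<?xml-stylesheet type="text/css" href="menu.css"?>
<!-- <?xml-stylesheet type="text/xsl" href="menu.xsl"?> -->
<menu>
  <item>
    <name>Pancakes</name>
    <description>a stack of three with butter</description>
  </item>
</menu>
```

With the css variant the browser styles the xml elements directly; with the xsl variant it runs the transformation to html itself.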
-- 
luigi

* Re: xml content and tweaked pdf output
  2010-03-10 10:38 ` xml content and tweaked pdf output Thomas A. Schmitz
  2010-03-10 10:48   ` luigi scarso
@ 2010-03-10 11:49   ` Steffen Wolfrum
  2010-03-10 12:35     ` Thomas A. Schmitz
  1 sibling, 1 reply; 9+ messages in thread
From: Steffen Wolfrum @ 2010-03-10 11:49 UTC
  To: mailing list for ConTeXt users


On 10.03.2010 at 11:38, Thomas A. Schmitz wrote:

> On Mar 10, 2010, at 10:58 AM, Steffen Wolfrum wrote:
> 
>> Hi,
>> 
>> I am very carefully taking my first steps with XML and ConTeXt (MkIV).
>> 
>> So I enjoyed reading Thomas' MyWay "Getting Web Content and pdf-Output from One Source".
>> 
>> What I kept wondering about is how to keep control over the PDF output when it comes to fine-tuning the actual typesetting.
>> A quick search in the archive gave me the answer attached below: use XML entities.
>> 
>> But coming back to Thomas' topic, "Getting Web Content and pdf-Output from One Source":
>> What about the other branch, getting web content?
>> Doesn't the XML source get "spoiled" by these inserted XML entities, which only make sense for the PDF branch?
>> Or will these entities be silently ignored when the XML source is fed into a CMS or processed further into web content?
>> 
>> Apologies for asking such basic questions...
>> 
> 
> I'm not really that advanced in this area myself, but from what I think I understood, you have to distinguish several aspects:
> 
> 1. The MyWay addressed xhtml and how to map it to ConTeXt output. In html, you have a list of predefined entities (http://www.w3schools.com/tags/ref_entities.asp), and I don't think you can simply define your own entities in html; that is just not how it is meant to work. So in this case, the answer to your question would be: you're using the wrong tool.

Sorry for the confusion: in your MyWay you talk about xml and show an xhtml example. It seems I mixed the two up.



> 2. In xml, on the other hand, there are almost no predefined entities; you can and must define entities yourself. But xml by itself cannot be shown as web content; you will need an xsl file that translates your xml into some sort of html. This lets you define almost anything you want, so you can indeed add all these typographical niceties. You can then either use a tool such as xsltproc or saxon to produce a "clean" html version yourself, or leave the transformation to the browser.

Exactly, this is what I meant:
Wouldn't those typesetting-oriented entities cause problems here?

If I follow Luigi's link to ...
http://www.w3schools.com/Xml/tryxslt.asp?xmlfile=simple&xsltfile=simple

... and naively insert the "addhyphen" entity mentioned below ...
"two of our famous Belgian&addhyphen;Waffles with plenty of real maple syrup"

... the xslt process gets disturbed:
"XML Parsing Error: undefined entity Location: http://www.w3schools.com/xsl/tryxslt_result.asp Line Number 7, Column 41:"



> So: if you're primarily thinking of web content that should also be typeset, use html and be aware that you probably won't be able to use all the power of ConTeXt. If you're thinking of content that will be typeset but which you also want to use in other forms (web content being just one of them), use xml. In that case, you will have to learn at least some xslt as well...
> 
> Btw, the thread you quoted refers to mkii entities; you know that the definitions in mkiv are somewhat different, right?


When reading Taco's reply to that thread ...

>>>> Needs an example file, because
>>>> 
>>>>  \defineXMLentity[addhyphen]{\-}
>>>>  \starttext
>>>>  \hsize 1in
>>>>  \startXMLdata
>>>>  I tried super&addhyphen;duper
>>>>  \stopXMLdata
>>>>  \stoptext
>>>> 
>>>> works in both mkii and mkiv.


... I assumed it's the same in mkii and mkiv?

Steffen

* Re: xml content and tweaked pdf output
  2010-03-10 11:49   ` Steffen Wolfrum
@ 2010-03-10 12:35     ` Thomas A. Schmitz
  2010-03-10 13:51       ` Steffen Wolfrum
  2010-03-10 14:40       ` Hans Hagen
  0 siblings, 2 replies; 9+ messages in thread
From: Thomas A. Schmitz @ 2010-03-10 12:35 UTC
  To: mailing list for ConTeXt users


On Mar 10, 2010, at 12:49 PM, Steffen Wolfrum wrote:

> 
> Sorry for the confusion: in your MyWay you talk about xml and show an xhtml example. It seems I mixed the two up.
> 
xhtml is html reformulated as an xml application, AFAIK, so an xhtml file is also an xml file. But maybe I should add a paragraph explaining this.

> 
> Exactly, this is what I meant:
> Wouldn't those typesetting-oriented entities cause problems here?
> 
> If I follow Luigi's link to ...
> http://www.w3schools.com/Xml/tryxslt.asp?xmlfile=simple&xsltfile=simple
> 
> ... and naively insert the "addhyphen" entity mentioned below ...
> "two of our famous Belgian&addhyphen;Waffles with plenty of real maple syrup"
> 
> ... the xslt process gets disturbed:
> "XML Parsing Error: undefined entity Location: http://www.w3schools.com/xsl/tryxslt_result.asp Line Number 7, Column 41:"
> 
> 
Yes, as I said: you have to define your entities, e.g. in the DOCTYPE declaration. That's something I discussed with Hans a few weeks ago: in the case you mention, you would have two different definitions of the entity &addhyphen;. One goes in the DOCTYPE and is the one the xslt processor will use:

<!ENTITY addhyphen "">

(i.e. do nothing about it)

and one in the ConTeXt environment file:

\xmlsetentity{addhyphen}{\-}

which will add the discretionary hyphen. And that's exactly what you wanted: typographical niceties for pdf output which will not disturb viewing the file on the web.
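
Put together, the web side could look like the following sketch. The root element "note" and its content are invented for illustration; only the entity name comes from this thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The DOCTYPE name matches the root element; "note" is an invented name -->
<!DOCTYPE note [
  <!-- Web branch: the entity expands to nothing.
       Possible alternative (an assumption, not from this thread):
       "&#xAD;" (U+00AD, soft hyphen) as the replacement text would
       give browsers an optional break point instead. -->
  <!ENTITY addhyphen "">
]>
<note>
  <p>I tried super&addhyphen;duper</p>
</note>
```

An xml parser that reads the internal subset sees plain "superduper" here, while the \xmlsetentity line above stays in the ConTeXt environment file, so the xml source is the same for both branches.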
> 
> When reading Taco's reply to that thread ...
> 
> ..........

> ... I assumed it's the same in mkii and mkiv?
> 
Rule of thumb: mkii commands use uppercase XML in their names (\defineXMLentity), mkiv commands use lowercase xml (\xmlsetentity); see the "Introduction" of xml-mkiv.pdf.

The main difference between the two is [Hans, is this right? correct me if I'm wrong]: mkii basically uses a streaming model, i.e., it translates one part of the xml file after the other. Reusing nodes and elements that have already been processed is possible, but difficult. mkiv loads the entire xml tree into memory; you can access any element at any time. 

Thomas

* Re: xml content and tweaked pdf output
  2010-03-10 12:35     ` Thomas A. Schmitz
@ 2010-03-10 13:51       ` Steffen Wolfrum
  2010-03-10 14:40       ` Hans Hagen
  1 sibling, 0 replies; 9+ messages in thread
From: Steffen Wolfrum @ 2010-03-10 13:51 UTC
  To: mailing list for ConTeXt users


On 10.03.2010 at 13:35, Thomas A. Schmitz wrote:

> 
> On Mar 10, 2010, at 12:49 PM, Steffen Wolfrum wrote:
> 
>> 
>> Sorry for the confusion: in your MyWay you talk about xml and show an xhtml example. It seems I mixed the two up.
>> 
> xhtml is html reformulated as an xml application, AFAIK, so an xhtml file is also an xml file. But maybe I should add a paragraph explaining this.
> 
>> 
>> Exactly, this is what I meant:
>> Wouldn't those typesetting-oriented entities cause problems here?
>> 
>> If I follow Luigi's link to ...
>> http://www.w3schools.com/Xml/tryxslt.asp?xmlfile=simple&xsltfile=simple
>> 
>> ... and naively insert the "addhyphen" entity mentioned below ...
>> "two of our famous Belgian&addhyphen;Waffles with plenty of real maple syrup"
>> 
>> ... the xslt process gets disturbed:
>> "XML Parsing Error: undefined entity Location: http://www.w3schools.com/xsl/tryxslt_result.asp Line Number 7, Column 41:"
>> 
>> 
> Yes, as I said: you have to define your entities, e.g. in the DOCTYPE declaration. That's something I discussed with Hans a few weeks ago: in the case you mention, you would have two different definitions of the entity &addhyphen;. One goes in the DOCTYPE and is the one the xslt processor will use:
> 
> <!ENTITY addhyphen "">
> 
> (i.e. do nothing about it)
> 
...



So again following Luigi's link ...
http://www.w3schools.com/Xml/tryxslt.asp?xmlfile=simple&xsltfile=simple

... I add a "&addhyphen;" down in the text and add the corresponding definition up in the DOCTYPE line:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE MyWay [<!ENTITY addhyphen " ">]>
<!-- Edited by XMLSpy® -->
<breakfast_menu>
	<food>
		<name>Belgian Waffles</name>
		<price>$5.95</price>
		<description>two of our famous Belgian&addhyphen;Waffles with plenty of real maple syrup</description>
...



Yes, this works!

Many thanks to Thomas for guiding my uncertain first xml steps ;o)

Steffen

* Re: xml content and tweaked pdf output
  2010-03-10 12:35     ` Thomas A. Schmitz
  2010-03-10 13:51       ` Steffen Wolfrum
@ 2010-03-10 14:40       ` Hans Hagen
  1 sibling, 0 replies; 9+ messages in thread
From: Hans Hagen @ 2010-03-10 14:40 UTC
  To: mailing list for ConTeXt users; +Cc: Thomas A. Schmitz

On 10-3-2010 13:35, Thomas A. Schmitz wrote:

> The main difference between the two is [Hans, is this right? correct me if I'm wrong]: mkii basically uses a streaming model, i.e., it translates one part of the xml file after the other. Reusing nodes and elements that have already been processed is possible, but difficult. mkiv loads the entire xml tree into memory; you can access any element at any time.

indeed

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

Thread overview: 9+ messages
2010-03-10  9:58 xml content and tweaked pdf output Steffen Wolfrum
2010-03-10 10:25 ` Italic correction is missing Mehdi Omidali
2010-03-10 10:42   ` Taco Hoekwater
2010-03-10 10:38 ` xml content and tweaked pdf output Thomas A. Schmitz
2010-03-10 10:48   ` luigi scarso
2010-03-10 11:49   ` Steffen Wolfrum
2010-03-10 12:35     ` Thomas A. Schmitz
2010-03-10 13:51       ` Steffen Wolfrum
2010-03-10 14:40       ` Hans Hagen
