ntg-context - mailing list for ConTeXt users
From: luigi scarso <luigi.scarso@gmail.com>
To: mailing list for ConTeXt users <ntg-context@ntg.nl>
Subject: Re: question for the xml-experts
Date: Thu, 19 Feb 2009 11:39:57 +0100
Message-ID: <fe8d59da0902190239h4596a64fq686e0832560e3691@mail.gmail.com>
In-Reply-To: <E4D75231-7998-4A77-9716-CB691CB7F5B9@uni-bonn.de>

On Thu, Feb 19, 2009 at 9:54 AM, Thomas A. Schmitz
<thomas.schmitz@uni-bonn.de> wrote:
>
> On Feb 17, 2009, at 11:07 PM, luigi scarso wrote:
>
>> (sorry for my laziness)
>> If I have good xml, then mkiv is a good choice. As far as I know, mkiv
>> ~ xslt by lpeg, so the
>> "traditional"
>> xml--( xslt )-->tex--( mkiv )-->pdf
>> is like
>> xml--( mkiv )-->pdf
>> Note that in the last chain one mixes xml+tex: if the xml becomes complex,
>> this can end in a messy situation.
>>
>>
> Yes, you're right of course. I have a similar situation here: the xml
> produced by ooo is too messy, so I want to preprocess it to something that
> is easier to maintain and modify (e.g., I will, at some point, add index
> entries and a TOC); that's why I use xslt here. But I still produce xml
> which I process with mkiv.
>
>> But some documents need heavy preprocessing:
>> for example, I have one that comes from Java class serialization,
>> and I need the power of python (lxml) to do a clean job.
>> Also, if the xml changes, I've found that lxml is more flexible than xslt.
>> In this case I have
>> xml--( lxml )-->tex--( mkiv )-->pdf
>>
>> The fact is that python and lua are not so different,
>> so I have to manage two languages
>> (python+lua) and tex;
>> with the 'traditional' workflow you have to manage 3 languages
>> xslt, lua and tex
>> and dividing responsibilities is not as easy as in the former case.
>
> Interesting. I have tried to play around with python-lxml, but am having
> some trouble understanding it. Just to give me an idea: how would you
> transform this:
>
> <text:span text:style-name="T3">foo</text:span>
>
> to this
>
> <emph>foo</emph>
>
> with lxml? lxml seems to object to the ":" in the tag, even though it's
> declared in the document.
>
> Thomas

t.xml:
<foo xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
<text:span  text:style-name="T3">foo</text:span>
</foo>


# python
Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> tree = etree.parse(file('t.xml'))
>>> foo = tree.getroot()
>>> foo.tag
'foo'
>>>
>>> [child.tag for child in foo.iterdescendants() ]
['{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span']
>>> print foo.iterdescendants.__doc__
iterdescendants(self, tag=None)

        Iterate over the descendants of this element in document order.

        As opposed to ``el.iter()``, this iterator does not yield the element
        itself.  The generated elements can be restricted to a specific tag
        name with the 'tag' keyword.

>>>
>>> FOO = etree.Element("FOO")
>>> emph =  etree.Element("emph")
>>> [child.tag for child in foo.iterdescendants(tag = '{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span' ) ]
['{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span']
>>> span = [child for child in foo.iterdescendants(tag = '{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span' ) ][0]
>>> emph.text = span.text
>>> FOO.append(emph)
>>> etree.tostring(FOO)
'<FOO><emph>foo</emph></FOO>'
>>>
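
The session above builds a new tree by hand. As a rough sketch only, the same
idea can be written as an in-place rename for current Python 3. The key point
is the one the session shows: the prefix "text:" is only shorthand for the
namespace URI, so the element is addressed as '{uri}span' and the attribute as
'{uri}style-name', never as 'text:span'. The names TEXT_NS, SPAN, STYLE_NAME,
SAMPLE and spans_to_emph are made up for this illustration, and the fallback
to the standard library's xml.etree.ElementTree is an assumption for machines
without lxml; only calls that both libraries share are used.

```python
# Prefer lxml; fall back to the standard library, which accepts the same calls used here.
try:
    from lxml import etree
except ImportError:  # assumption: lxml may not be installed
    import xml.etree.ElementTree as etree

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
SPAN = "{%s}span" % TEXT_NS              # '{uri}span' form of text:span
STYLE_NAME = "{%s}style-name" % TEXT_NS  # '{uri}style-name' form of text:style-name


def spans_to_emph(root, style="T3"):
    """Rename every text:span with the given style to <emph>, in place.

    Returns the number of elements changed.
    """
    # Collect the matches first so nothing is renamed while the iterator is live.
    matches = [el for el in root.iter(SPAN) if el.get(STYLE_NAME) == style]
    for el in matches:
        el.tag = "emph"      # plain name, no namespace
        el.attrib.clear()    # drop text:style-name; text and tail are kept
    return len(matches)


# Same content as t.xml above, inlined so the sketch is self-contained.
SAMPLE = (
    '<foo xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
    '<text:span text:style-name="T3">foo</text:span>'
    '</foo>'
)

root = etree.fromstring(SAMPLE)
changed = spans_to_emph(root)
print(changed, root[0].tag, root[0].text)  # prints: 1 emph foo
```

With lxml the serialized result still carries the now unused xmlns:text
declaration on the root element; etree.cleanup_namespaces(root) should remove
it if that matters for the next step of the chain.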


http://codespeak.net/lxml/tutorial.html
http://codespeak.net/lxml/api.html


-- 
luigi
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________


