From: "Thomas A. Schmitz" <thomas.schmitz@uni-bonn.de>
To: mailing list for ConTeXt users <ntg-context@ntg.nl>
Subject: question for the xml-experts
Date: Sat, 14 Feb 2009 18:40:51 +0100
Message-ID: <4C416126-1F10-4206-BD3F-9377AC7C81CC@uni-bonn.de>
Hi all,
this is not a question about direct technical details, but more of a
conceptual problem, and I would love to have your input and ideas on
this. I will be editing several edited volumes in my field
(humanities, classics). From experience, I know that it's impossible
to make scholars in the humanities adhere to standards. Each and every
one of them will turn in a paper (most of them written in half a dozen
different versions of Word) with its own idiosyncrasies. At my last
conference, I asked them to please use Unicode for their Greek
passages, and I got blank looks and the question "What the hell is
Unicode?"
So: I want to extract the content of these papers and process it with
ConTeXt. I thought the easiest route might be to convert them to
OpenOffice odt and then use the content.xml as a starting point. Since
the formatting will be unusable anyway, it doesn't make sense to
process the odt directly; instead, I want to transform the xml via
xslt to a simplified format and then process that with ConTeXt. I have
just discovered the tool xalan ( http://xml.apache.org/xalan-c/index.html
) which allows me to use an xslt style sheet and direct the output
to a new file. I will then need to clean up these xml files and write
a mkiv xml setup for them.
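
To make the idea concrete, here is a minimal sketch of the extraction
step in Python rather than xslt. It relies on the fact that an odt file
is a zip archive containing content.xml, and it keeps only headings and
paragraphs. The output element names (document, heading, para) and the
function names are made up for illustration, not an established format.
It also throws away all inline markup such as italics, and nested
paragraphs (tables, footnotes) would need extra handling.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the ODF text elements (text:p, text:h).
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def simplify_content(content_xml: bytes) -> ET.Element:
    """Reduce ODF content.xml to a flat <document> of <heading>/<para>.

    The target element names are hypothetical; pick whatever your
    ConTeXt xml setup expects.
    """
    source = ET.fromstring(content_xml)
    doc = ET.Element("document")
    for node in source.iter():
        if node.tag == f"{{{TEXT_NS}}}h":
            out = ET.SubElement(doc, "heading")
            # Heading depth is stored in text:outline-level.
            out.set("level", node.get(f"{{{TEXT_NS}}}outline-level", "1"))
        elif node.tag == f"{{{TEXT_NS}}}p":
            out = ET.SubElement(doc, "para")
        else:
            continue
        # Flatten inline spans: keep the characters, drop the formatting.
        out.text = "".join(node.itertext()).strip()
    return doc


def simplify_odt(path) -> ET.Element:
    """Open an .odt (a zip archive) and simplify its content.xml."""
    with zipfile.ZipFile(path) as odt:
        return simplify_content(odt.read("content.xml"))
```

Calling ET.ElementTree(simplify_odt("paper.odt")).write("paper.xml",
encoding="utf-8") would then write the simplified file (paper.odt is a
placeholder name). An xslt style sheet does the same job declaratively;
the sketch only shows how little of content.xml is actually needed.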
So for those who know much more about this sort of workflow: does that
make sense? Is there any better way to achieve these results, i.e.,
have the content of a couple of papers in Word and/or rtf format and
typeset it in a consistent ConTeXt environment? Is there any tool
better than xslt to convert the OpenOffice xml (anything in lua that
can parse xml)? Anything better than xalan to convert xml -> xml? I'm
just beginning to plan this, so I'd be most grateful for any
pointers.
Thanks for reading this long message, all best
Thomas