question for the xml-experts

From: Thomas A. Schmitz @ 2009-02-14 17:40 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Hi all,

this is not a question about direct technical details, but more of a  
conceptual problem, and I would love to have your input and ideas on  
this. I will be editing several edited volumes in my field  
(humanities, classics). From experience, I know that it's impossible  
to make scholars in the humanities adhere to standards. Each and every  
one of them will turn in a paper (most of them written in half a dozen  
different versions of Word) with its own idiosyncrasies. At my last
conference, I asked them to please use Unicode for their Greek  
passages, and I got blank looks and the question "What the hell is  
Unicode?"

So: I want to extract the content of these papers and process it with  
ConTeXt. I thought the easiest route might be to convert them to
OpenOffice odt and then use the content.xml as a starting point. Since
the formatting will be unusable anyway, it doesn't make sense to
process the odt directly; instead, I want to transform the xml via
xslt to a simplified format and then process that with ConTeXt. I have
just discovered the tool xalan ( http://xml.apache.org/xalan-c/index.html )
which allows me to use an xslt style sheet and direct the output
to a new file. I will then need to clean up these xml files and write
a mkiv xml setup for them.
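
To make the extraction step concrete, here is a minimal sketch of the
same idea in Python rather than xslt. It relies on the fact that an odt
file is a zip archive containing content.xml, and it keeps only headings
and paragraphs. The output element names (article, heading, para) and
the function names are hypothetical, chosen for illustration; a real
conversion would also have to handle footnotes, lists, and emphasis.

```python
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespaces used in content.xml
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
TEXT = "{" + NS["text"] + "}"


def simplify_content(content_xml: bytes) -> ET.Element:
    """Reduce content.xml to a flat list of headings and paragraphs.

    The output vocabulary (article/heading/para) is an assumption made
    for this sketch, not an existing format.
    """
    root = ET.fromstring(content_xml)
    article = ET.Element("article")
    body = root.find("office:body/office:text", NS)
    if body is None:
        return article
    for child in body:
        if child.tag == TEXT + "h":
            out = ET.SubElement(article, "heading")
            out.set("level", child.get(TEXT + "outline-level", "1"))
        elif child.tag == TEXT + "p":
            out = ET.SubElement(article, "para")
        else:
            continue  # top-level lists, tables, etc. are skipped here
        # Flatten all inline markup (spans, and also footnote text)
        # into plain text; formatting is deliberately thrown away.
        out.text = "".join(child.itertext())
    return article


def simplify_odt(path) -> str:
    """Read content.xml from an odt file and return simplified xml."""
    with zipfile.ZipFile(path) as odt:
        content = odt.read("content.xml")
    return ET.tostring(simplify_content(content), encoding="unicode")
```

Calling simplify_odt("paper.odt") (a hypothetical file name) returns the
simplified xml as a string, which could then be written to a file and
cleaned up by hand before it is processed with ConTeXt.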

So for those who know much more about this sort of workflow: does that  
make sense? Is there any better way to achieve these results, i.e.,  
have the content of a couple of papers in Word and/or rtf format and  
typeset it in a consistent ConTeXt environment? Is there any tool
better than xslt for converting the OpenOffice xml (anything in
lua that can parse xml)? Anything better than xalan to convert xml ->
xml? I'm just beginning to plan this, so I'd be most grateful for any
pointers.

Thanks for reading this long message, all best

Thomas
