From: Wolfgang Schuster
Newsgroups: gmane.comp.tex.context
Subject: Re: question for the xml-experts
Date: Sat, 14 Feb 2009 19:25:22 +0100
To: mailing list for ConTeXt users

Hi Thomas,

why don't you take a look at OpenOffice's export function? I saw that it
can convert a document to XHTML, which could be a starting point for you.

Wolfgang

On 14 Feb 2009, at 18:40, Thomas A. Schmitz wrote:

> Hi all,
>
> this is not a question about direct technical details, but more of a
> conceptual problem, and I would love to have your input and ideas on
> this. I will be editing several edited volumes in my field
> (humanities, classics). From experience, I know that it's impossible
> to make scholars in the humanities adhere to standards. Each and
> every one of them will turn in a paper (most of them written in half
> a dozen different versions of Word) with its own idiosyncrasies.
> At my last conference, I asked them to please use Unicode for their
> Greek passages, and I got blank looks and the question "What the
> hell is Unicode?"
>
> So: I want to extract the content of these papers and process it
> with ConTeXt. I thought the easiest route might be to convert them to
> OpenOffice odt and then use the content.xml as a starting point.
> Since the formatting will be unusable anyway, it doesn't make sense
> to process the odt directly; instead, I want to transform the xml
> via xslt to a simplified format and then process that with ConTeXt.
> I have just discovered the tool xalan
> (http://xml.apache.org/xalan-c/index.html), which allows me to use
> an xslt style sheet and direct the output to a new file. I will then
> need to clean up these xml files and write a mkiv xml setup for them.
>
> So for those who know much more about this sort of workflow: does
> that make sense? Is there any better way to achieve these results,
> i.e., have the content of a couple of papers in Word and/or rtf
> format and typeset it in a consistent ConTeXt environment? Is there
> any tool better than xslt to convert the OpenOffice xml (anything
> in lua that can parse xml)? Anything better than xalan to convert
> xml -> xml? I'm just beginning to plan this, so I'd be most
> grateful for any pointers.
>
> Thanks for reading this long message, all best
>
> Thomas
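
As a rough illustration of the workflow Thomas describes (take content.xml out of the .odt, discard the formatting, keep a simplified XML), here is a minimal sketch that uses Python's standard library in place of xslt/xalan. The function name `simplify_odt` and the target elements `document`, `h` and `p` are hypothetical and chosen only for this example. The sketch assumes that headings and paragraphs are stored as `text:h` and `text:p` in the OpenDocument text namespace. It is not a complete converter: footnotes, tables, tabs and repeated spaces would need extra handling.

```python
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespace used for text:h and text:p elements.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def simplify_odt(odt_file):
    """Return simplified XML (as a string) built from content.xml in an .odt.

    `odt_file` may be a path or a binary file-like object.
    The output format (<document>, <h level="..">, <p>) is hypothetical.
    """
    # An .odt file is a zip archive; the body text lives in content.xml.
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    out = ET.Element("document")
    for node in root.iter():
        if node.tag == f"{{{TEXT_NS}}}h":
            # Keep only the heading level, drop all style information.
            level = node.get(f"{{{TEXT_NS}}}outline-level", "1")
            child = ET.SubElement(out, "h", level=level)
        elif node.tag == f"{{{TEXT_NS}}}p":
            child = ET.SubElement(out, "p")
        else:
            continue
        # Flatten inline markup (spans etc.) to plain text.
        child.text = "".join(node.itertext())
    return ET.tostring(out, encoding="unicode")
```

Calling `simplify_odt("paper.odt")` (with `paper.odt` standing in for one of the converted papers) would give a string that can be written to a file and then handled by a mkiv xml setup. Inline markup such as italics is flattened here; if it matters for the volume, the loop would have to map `text:span` styles to elements of the simplified format as well.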