From: Duncan Hothersall
Newsgroups: gmane.comp.tex.context
Subject: Re: DOC/RTF to ConTeXt via XML
Date: Wed, 28 Sep 2005 09:54:39 +0100
Message-ID: <433A5A4F.4050407@capdm.com>
References: <20050928080211.5A0EB127F8@ronja.ntg.nl>
Reply-To: mailing list for ConTeXt users

> No need for rtf. That would lose lots of information anyway, wouldn't it?

RTF can capture everything that .doc can (MS update it every time they rev the .doc format), and it has the advantage of being defined in a spec with a grammar, which means that import routines (like the one in OO.o) tend to be better than those for the binary .doc format. So I would usually use Save As... .rtf from Word rather than rely on OO.o's reverse engineering of the .doc format.

Others' experiences may vary, of course, and perhaps I do an injustice to OO.o's Word imports, which have certainly improved. But RTF is a fairly safe bet, and it is also 'human readable', which helps with debugging.

>>\startHans
>>converting open office xml is not always easy; stay away from tabs and use
>>high level constructs as much as possible
>>\stopHans

I would add to this - make sure you use either OO.o 1.1.5 or a 2.0 Beta, since earlier versions used a file format that was a lot trickier to post-process (problems with conflating styles into paragraph formats).

>>Once I get a sane xml file (this seems to be the biggest problem) what is the
>>best tool to convert this to ConTeXt?

Well, you might not need to - remember that ConTeXt can process XML natively now, which is why I suggested you look at the DocBook-in-ConTeXt project, which uses this feature. You wouldn't necessarily have to use the DocBook standard, but you could use the principles of that project to define a nice output from your own (simple) brand of XML.

Duncan
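
To make the idea of mapping your own simple XML onto ConTeXt output a little more concrete, here is a minimal sketch. It is an illustration only, not taken from the DocBook-in-ConTeXt project: the element names (document, title, para) and the xml:demo namespace are made up, and the commands are those of the XML interface in later ConTeXt releases (MkIV/LMTX), which postdates this message, so an older installation would need different macros.

```tex
% Minimal sketch: typeset a tiny home-grown XML vocabulary natively.
% Assumption: a ConTeXt MkIV/LMTX installation (this interface is newer
% than the message above). Element names are hypothetical.

% Associate each element with a setup of the same name.
\startxmlsetups xml:demo:base
  \xmlsetsetup{#1}{document|title|para}{xml:demo:*}
\stopxmlsetups

\xmlregistersetup{xml:demo:base}

% The root element just passes its content through.
\startxmlsetups xml:demo:document
  \xmlflush{#1}
\stopxmlsetups

% A title becomes a section heading.
\startxmlsetups xml:demo:title
  \section{\xmlflush{#1}}
\stopxmlsetups

% A para becomes an ordinary paragraph.
\startxmlsetups xml:demo:para
  \xmlflush{#1}\par
\stopxmlsetups

% Sample input, kept in a buffer so the file is self-contained.
\startbuffer[demo]
<document>
  <title>Converted from Word</title>
  <para>First paragraph of the exported text.</para>
</document>
\stopbuffer

\starttext
  \xmlprocessbuffer{main}{demo}{}
\stoptext
```

With a real export you would presumably replace the buffer with a file and call \xmlprocessfile instead, and add one setup per element that your cleaned-up XML actually uses.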