From: Christopher Creutzig
Newsgroups: gmane.comp.tex.context
Subject: Re: DOC/RTF to ConTeXt via XML
Date: Wed, 28 Sep 2005 13:45:03 +0200
Message-ID: <433A823F.5000009@creutzig.de>
Reply-To: mailing list for ConTeXt users

Duncan Hothersall wrote:

> RTF can capture everything that .doc can (MS update it every time they
> rev the .doc format), and it has the advantage that it is defined in a
> spec with a grammar, which means that importing routines (like the one

Oh, yes, the RTF spec. It really makes you wonder what Microsoft employees understand by the word “spec.” Word breaks almost every single rule in that spec and has done so for ages:

“The LetterSequence is made up of lowercase alphabetic characters (a-z). RTF is case sensitive.
The following Word 97-2000 keywords do not currently follow the requirement that keywords may not contain any uppercase alphabetic characters. ...”

But I should be happy that these violations are actually documented.

> in OO.o) tend to be better than for the binary .doc format. So I would

Okay; I did not know that whatever Microsoft currently calls RTF is actually able to save all Word files losslessly. (I am in the lucky position not to have any Word files to convert.) Makes me wonder if there really is any need for an XML step in between. Can OOo convert RTF to XML without user intervention, such as clicking somewhere with a mouse? Maybe rtf2fo.com, http://www.infinity-loop.de/products/upcast/, or http://sourceforge.net/projects/majix/ are good alternatives for this step? (I have never used any of them.)

> which have certainly improved. But RTF is a fairly safe bet, and
> additionally it is 'human readable' so that helps debugging.

Asking a human to read RTF is certainly inhuman. :-) But there is another advantage of using RTF: authors can use almost any word processor they want. :-)

> Well you might not need to - remember that ConTeXt can process XML
> natively now, which is why I suggested you look at the

But unless I'm mistaken, this is based on a streaming model, which has its advantages, but also disadvantages. So the question is whether the XML format is close enough to the order in which ConTeXt would like to get the bits and pieces. Since the format has not been defined yet, this question should be kept in mind.

Christopher
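
To illustrate the ordering concern with a streaming model, here is a minimal Python sketch. It is not how ConTeXt processes XML; the element names (doc, body, title), the sample input, and both function names are made up for illustration. A streaming converter can only emit things in the order they arrive, while a tree-based converter can reorder at the cost of holding the whole document in memory.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical input: the title is stored *after* the body text.
SAMPLE = b"<doc><body>Some text.</body><title>A title</title></doc>"


def stream_convert(xml_bytes):
    """Streaming model: emit output as soon as each element is complete."""
    out = []
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == "title":
            out.append("TITLE: " + (elem.text or ""))
        elif elem.tag == "body":
            out.append("BODY: " + (elem.text or ""))
        elem.clear()  # discard the element's content; nothing is kept for later
    return out


def tree_convert(xml_bytes):
    """Tree model: load everything first, then pick pieces in any order."""
    root = ET.fromstring(xml_bytes)
    return [
        "TITLE: " + root.findtext("title", default=""),
        "BODY: " + root.findtext("body", default=""),
    ]


print(stream_convert(SAMPLE))  # body comes out before the title
print(tree_convert(SAMPLE))    # title can be placed first
```

If the XML format puts elements in the order the typesetter wants them, the streaming version is enough; otherwise some buffering or a reordering pass is needed first.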