From: Christopher Creutzig
Newsgroups: gmane.comp.tex.context
Subject: Re: DOC/RTF to ConTeXt via XML
Date: Tue, 27 Sep 2005 15:42:43 +0200
Message-ID: <43394C53.3040309@creutzig.de>
References: <20050927100004.7F435127E5@ronja.ntg.nl> <43391DCF.1010805@capdm.com>
In-Reply-To: <43391DCF.1010805@capdm.com>

Duncan Hothersall wrote:

> Well, XSLT seems to have been designed, and certainly tends to be
> implemented, as a tool for simple transformations of small XML chunks.

No, xslt is a tool for arbitrary xml -> xml conversions (and a little
more than that). With a good implementation (say, saxon), working with
moderately large trees is pretty fast. The stylesheet is actually
compiled before running.

> Obviously complex transformations can be constructed from a bunch of
> simple transformations, but there comes a point when you should really

Just about any programming language gives you simple operations to
build whatever you want from.

> just use a better tool - though these tend to cost serious money (e.g.

„Better“ depends on your task at hand.

> OmniMark). Also, most XSLT implementations use the DOM model, which is

XSLT uses its own tree model (the XPath data model), which is different
from the W3C DOM.

> fine for a 50Kb file but will be incredibly resource-hungry if you're
> processing files of 5Mb. At that point you want a streaming model, and

That depends on what you want to do with your data. For many of my
needs, a streaming model simply wouldn't work without keeping lots of
information (to be processed later) in memory, defeating the model.

I have found splitting my data into files that form conceptual units to
be a good way, both for editing the files and for turnaround times. (I
am using Makefiles, so the granularity of finding unchanged items for me
is the file.) We are talking about almost 15MB here, which I regard as
quite a lot, considering it is almost pure text.

Again, I don't mind using something else on XML data. I'm doing it
myself. It all depends on what you want to do. In the case of
transforming xml to ConTeXt, I would go for an xslt implementation, but
ymmv. After all, the choice of tools always depends on many factors,
including familiarity. (I've continued using perl instead of ruby for
ages, until recently, for that reason.)

> for a streaming model you want a better suited language than XSLT. As I
> say, horses for courses. For article-length pieces and simple
> transforms, XSLT might suffice.

For number crunching, xslt is certainly inadequate.
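
The point above about streaming needing to hold information for later can be sketched in a few lines. This is only an illustration in Python (not XSLT, and not any tool named in this thread), using the standard library's `xml.etree.ElementTree.iterparse`. The element names `ref`, `section` and `title`, the sample document, and the function `resolve_refs_streaming` are all made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical input: a <ref> may point at a <section> that appears later.
DOC = b"""<doc>
  <p>See <ref target="s2"/> for details.</p>
  <section id="s1"><title>Intro</title></section>
  <section id="s2"><title>Streaming</title></section>
</doc>"""


def resolve_refs_streaming(source):
    """Single pass over parse events; forward references must be buffered."""
    titles = {}    # section id -> title, for sections already seen
    pending = []   # targets referenced before their section was seen
    resolved = []
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "section":
            titles[elem.get("id")] = elem.findtext("title")
            elem.clear()  # drop the subtree so the tree does not accumulate
        elif elem.tag == "ref":
            target = elem.get("target")
            if target in titles:
                resolved.append((target, titles[target]))
            else:
                pending.append(target)  # state kept in memory for later
            elem.clear()
    # Only at the end of the input can the buffered references be resolved.
    resolved.extend((t, titles.get(t)) for t in pending)
    return resolved, len(pending)


refs, buffered = resolve_refs_streaming(io.BytesIO(DOC))
print(refs, buffered)
```

The more forward references a document has, the more the `pending` and `titles` structures grow, which is the sense in which buffering can defeat the streaming model for this kind of task.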

Transforming books of average length (say, 300-500 pages) is certainly
doable, although I would go for a chapter-by-chapter transformation,
especially considering that we are talking about a process where
cross-references etc. are going to be handled later in the chain. But I
thought we were talking about article-length pieces anyway?

Christopher
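
A minimal sketch of the chapter-by-chapter idea with file-level granularity, as described above. It is written in Python rather than XSLT or make, purely for illustration. The `<chapter>`, `<title>` and `<para>` element names, the directory layout, and the functions `chapter_to_context`, `is_stale` and `build` are assumptions, not anything from the thread. Cross-references are deliberately left alone, since they are handled later in the chain.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def chapter_to_context(xml_text):
    """Convert one hypothetical <chapter> document to ConTeXt source.

    Special TeX characters in the text are not escaped in this sketch.
    """
    chapter = ET.fromstring(xml_text)
    lines = [r"\chapter{%s}" % chapter.findtext("title", default="")]
    for para in chapter.iter("para"):
        lines.append("")  # blank line separates paragraphs
        lines.append("".join(para.itertext()).strip())
    return "\n".join(lines) + "\n"


def is_stale(src, out):
    """Makefile-style check: rebuild if the output is missing or older."""
    return not out.exists() or out.stat().st_mtime < src.stat().st_mtime


def build(src_dir, out_dir):
    """Convert every stale chapter file; return the names that were rebuilt."""
    out_dir.mkdir(parents=True, exist_ok=True)
    rebuilt = []
    for src in sorted(src_dir.glob("*.xml")):
        out = out_dir / (src.stem + ".tex")
        if is_stale(src, out):
            text = chapter_to_context(src.read_text(encoding="utf-8"))
            out.write_text(text, encoding="utf-8")
            rebuilt.append(src.name)
    return rebuilt
```

Calling `build(Path("chapters"), Path("out"))` twice in a row should convert each chapter on the first call and nothing on the second, so editing one chapter only costs the turnaround time of that one file.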