RE: DOC/RTF to ConTeXt via XML

ntg-context - mailing list for ConTeXt users

* RE: DOC/RTF to ConTeXt via XML
@ 2005-09-27 14:50 Idris Samawi Hamid
  2005-09-28  8:02 ` Christopher Creutzig
  0 siblings, 1 reply; 11+ messages in thread
From: Idris Samawi Hamid @ 2005-09-27 14:50 UTC (permalink / raw)
  Cc: Adam Lindsay

Hi Christopher, Duncan, Hans, and Adam,

Thank you so much for your detailed comments and suggestions. Again, I'm 
completely new to xml and feel like a fish out of water. OTOH I spend sooo much
time just manually extracting text (with innumerable transliteration 
diacritics) and then copying-pasting to WinEDT that I am willing to explore 
the xml approach if it can be made sane enough...

>===== Original Message From Christopher Creutzig <christopher@creutzig.de> 
=====
>Duncan Hothersall wrote:
>> Well, XSLT seems to have been designed, and certainly tends to be
>> implemented, as a tool for simple transformations of small XML chunks.
>
> No, xslt is a tool for arbitrary xml -> xml conversions (and a little
>more than that).

Ok, you guys have lost me now-) Maybe the best thing to do is try something 
practical: take an average word article and see what's involved in converting 
it to ConTeXt. From what I gather so far the process goes something like

doc  => rtf 
rtf  => OO.o
OO.o => xml

But here things get dicey because

\startHans
converting open office xml is not always easy; stay away from tab's and use 
high level constructs as much as possible
\stopHans

Question: Will a proper doc (or OO.o) template solve this problem or is this a 
post-OO.o-processing problem no matter what I do beforehand?

From this discussion it seems that I (as an xml ignoramus) would be better
off converting to ConTeXt code rather than processing pure xml blocks (but 
maybe I'm wrong).

Once I get a sane xml file (this seems to be the biggest problem) what is the 
best tool to convert this to ConTeXt?

We are all extremely busy, of course, but if anyone finds this interesting I 
can send a sample doc article from my journal. Maybe we can do a MyWay or 
something to document this process for ourselves and others, as well as find 
the most practical approach to creating a sane workflow. Besides, this kind of 
project seems to be exactly the kind of thing to illustrate the full power of 
ConTeXt.

This is a mid-term project so no urgency (I'll keep copying and pasting for 
now->)

Thanks again you all for your advice.

Best
Idris

============================
Professor Idris Samawi Hamid
Department of Philosophy
Colorado State University
Fort Collins, CO 80523


* Re: DOC/RTF to ConTeXt via XML
  2005-09-27 14:50 DOC/RTF to ConTeXt via XML Idris Samawi Hamid
@ 2005-09-28  8:02 ` Christopher Creutzig
  0 siblings, 0 replies; 11+ messages in thread
From: Christopher Creutzig @ 2005-09-28  8:02 UTC (permalink / raw)

Idris Samawi Hamid wrote:
> Ok, you guys have lost me now-) Maybe the best thing to do is try something 

 Just ignore the detail of what xslt can and can't do for the moment.
That just influences the choice of tools for one particular step and we
all agree that there are tools for this step.

> it to ConTeXt. From what I gather so far the process goes something like
> 
> doc  => rtf 
> rtf  => OO.o
> OO.o => xml

 No need for rtf.  That would lose lots of information anyway, wouldn't it?

> \startHans
> converting open office xml is not always easy; stay away from tab's and use 
> high level constructs as much as possible
> \stopHans

 I'm not really sure what Hans meant by this.  I assume he does have a
valid point, since so far I only had a short and theoretical look at the
format, but I can only guess what it is.  Hans, could you give an
example or two?

> From this discussion it seems that I (as an xml ignoramus) would be better
> off converting to ConTeXt code rather than processing pure xml blocks (but 
> maybe I'm wrong).

 XML is much, much easier to parse than just about anything else.  That
means that whatever your conversion process uses, you can simply reuse
an XML parser in whatever language you want to use.  (Interpreting the
file may be easy or hard, depending on the xml structure at hand.)  The
only exception I can see right now would be a rather large and
error-prone “Visual” Basic program to create a sort of export filter for
Word to write ConTeXt.  I certainly don't think that's easier.
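
 A minimal sketch of the "reuse an XML parser" point, assuming Python and
its standard xml.etree.ElementTree module.  The <article>, <title> and
<para> element names are invented for illustration; nobody in this thread
has defined a format yet.  The sketch does no escaping of TeX special
characters, which a real converter would need.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal article format: <article> with <title> and <para>.
SAMPLE = """<article>
  <title>On Being</title>
  <para>First paragraph.</para>
  <para>Second paragraph.</para>
</article>"""


def article_to_context(xml_text):
    """Turn the hypothetical article XML into a ConTeXt source string."""
    root = ET.fromstring(xml_text)  # the parser does all the hard work
    lines = ["\\starttext"]
    title = root.findtext("title", default="")
    lines.append("\\title{%s}" % title.strip())
    for para in root.findall("para"):
        lines.append("")  # blank line = paragraph break in TeX
        lines.append((para.text or "").strip())
    lines.append("")
    lines.append("\\stoptext")
    return "\n".join(lines)


if __name__ == "__main__":
    print(article_to_context(SAMPLE))
```

 The parsing is one call; everything else is interpretation of the
structure, which is where the real effort goes.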

> Once I get a sane xml file (this seems to be the biggest problem) what is the 
> best tool to convert this to ConTeXt?

 It depends on who is going to write the conversion.  From the languages
I've used so far, it's probably easiest to do in xslt, but if you
are/have at hand a programmer who's good at ruby but would have to learn
xslt first, the whole thing may not be big enough to warrant learning
another language first.  Unless that programmer wants to, which would be
a very good sign.  Learning a new language per year is not really a bad
idea.

> We are all extremely busy, of course, but if anyone finds this interesting I 
> can send a sample doc article from my journal. Maybe we can do a MyWay or 
> something to document this process for ourselves and others, as well as find 

 It might be a pretty specific thing, though.  My guess is that you
could make more progress by thinking about what sort of structural elements you
would like to have, rather than looking at what you have right now.

Christopher


* Re: DOC/RTF to ConTeXt via XML
  2005-09-28  8:54 ` Duncan Hothersall
@ 2005-09-28 11:45   ` Christopher Creutzig
  0 siblings, 0 replies; 11+ messages in thread
From: Christopher Creutzig @ 2005-09-28 11:45 UTC (permalink / raw)

Duncan Hothersall wrote:

> RTF can capture everything that .doc can (MS update it every time they
> rev the .doc format), and it has the advantage that it is defined in a
> spec with a grammar, which means that importing routines (like the one

 Oh, yes, the RTF spec.  It really makes you wonder what Microsoft
employees understand by the word “spec.”  Word breaks almost every
single rule in that spec and has done so for ages:  “The LetterSequence
is made up of lowercase alphabetic characters (a-z). RTF is case
sensitive.  The following Word 97-2000 keywords do not currently follow
the requirement that keywords may not contain any uppercase alphabetic
characters.  ...”  But I should be happy that these violations are
actually documented.

> in OO.o) tend to be better than for the binary .doc format. So I would

 Okay; I did not know that whatever Microsoft currently calls RTF is
actually able to save all Word files losslessly.  (I am in the lucky
position not to have any Word files to convert.)  Makes me wonder if
there really is any need for an XML step in between.  Can OOo convert
RTF to XML without user intervention, such as clicking somewhere with a
mouse?  Maybe rtf2fo.com, http://www.infinity-loop.de/products/upcast/,
or http://sourceforge.net/projects/majix/ are good alternatives for this
step?  (I never used any one of them.)

> which have certainly improved. But RTF is a fairly safe bet, and
> additionally it is 'human readable' so that helps debugging.

 Asking a human to read RTF is certainly inhuman.  :-)

 But there is another advantage of using RTF: Authors can use almost any
word processor they want. :-)

> Well you might not need to - remember that ConTeXt can process XML
> natively now, which is why I suggested you look at the

 But unless I'm mistaken, this is based on a streaming model, which has
its advantages, but also disadvantages.  So, the question is whether the
xml format is close enough to the order in which ConTeXt would like to
get the bits and pieces.  Since the format has not been defined yet,
this question should be kept in mind.
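
 To illustrate the ordering concern, here is a rough sketch using Python's
xml.etree.ElementTree.iterparse as a stand-in for a streaming reader.  It
is not ConTeXt's XML handler, and the element names are hypothetical.  The
input deliberately puts the title after the body text, so a pure streaming
pass emits things in the wrong order unless it buffers.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical input in which the title follows the paragraph.
SAMPLE = b"<article><para>Body text.</para><title>Late Title</title></article>"


def stream_events(xml_bytes):
    """Yield (tag, text) pairs in the order a streaming reader completes them."""
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag in ("title", "para"):
            yield elem.tag, (elem.text or "").strip()
            elem.clear()  # drop the element once handled, as a stream would


def convert_streaming(xml_bytes):
    """Emit output in arrival order; the title ends up after the body."""
    out = []
    for tag, text in stream_events(xml_bytes):
        if tag == "title":
            out.append("\\title{%s}" % text)
        else:
            out.append(text)
    return out


def convert_buffered(xml_bytes):
    """Hold paragraphs back until the whole input is read, then emit."""
    title, paras = None, []
    for tag, text in stream_events(xml_bytes):
        if tag == "title":
            title = text
        else:
            paras.append(text)
    return ["\\title{%s}" % (title or "")] + paras


if __name__ == "__main__":
    print(convert_streaming(SAMPLE))
    print(convert_buffered(SAMPLE))
```

 The buffered variant gets the order right, but only by keeping the
paragraphs in memory, which is exactly the cost a streaming model is
meant to avoid.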

Christopher


* Re: DOC/RTF to ConTeXt via XML
       [not found] <20050928080211.5A0EB127F8@ronja.ntg.nl>
@ 2005-09-28  8:54 ` Duncan Hothersall
  2005-09-28 11:45   ` Christopher Creutzig
  0 siblings, 1 reply; 11+ messages in thread
From: Duncan Hothersall @ 2005-09-28  8:54 UTC (permalink / raw)


>  No need for rtf.  That would lose lots of information anyway, wouldn't it?

RTF can capture everything that .doc can (MS update it every time they
rev the .doc format), and it has the advantage that it is defined in a
spec with a grammar, which means that importing routines (like the one
in OO.o) tend to be better than for the binary .doc format. So I would
usually use .rtf as the Save As... from Word, rather than relying on
OO.o's reverse engineering of the .doc format. Others' experiences may
vary, of course, and perhaps I do an injustice to OO.o's Word imports,
which have certainly improved. But RTF is a fairly safe bet, and
additionally it is 'human readable' so that helps debugging.

>>\startHans
>>converting open office xml is not always easy; stay away from tab's and use 
>>high level constructs as much as possible
>>\stopHans

I would add to this - make sure you use either OO.o 1.1.5 or a 2.0 Beta,
since earlier versions used a file format which was a lot trickier to
post-process (problems with conflating styles into paragraph formats).
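
For anyone who wants to look at what there is to post-process: an OO.o
document is a zip archive, and the document body is stored in a member
named content.xml. A small sketch, assuming Python and its standard
zipfile module (the function name is made up for this example):

```python
import io
import zipfile


def read_content_xml(path_or_file):
    """Return the raw bytes of content.xml from an OO.o document.

    Accepts a file name or a binary file-like object.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        return archive.read("content.xml")


if __name__ == "__main__":
    # Build a tiny stand-in archive in memory instead of a real document.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("content.xml", "<doc/>")
    buf.seek(0)
    print(read_content_xml(buf))
```

The returned bytes can be handed to any XML parser or XSLT processor for
the next step.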

>>Once I get a sane xml file (this seems to be the biggest problem) what is the 
>>best tool to convert this to ConTeXt?

Well you might not need to - remember that ConTeXt can process XML
natively now, which is why I suggested you look at the
DocBook-in-ConTeXt project, which uses this feature. You wouldn't
necessarily have to use the DocBook standard, but you could use the
principles of that project to define a nice output from your own
(simple) brand of XML.

Duncan


* Re: DOC/RTF to ConTeXt via XML
  2005-09-27 15:10 Idris Samawi Hamid
  2005-09-27 15:19 ` Adam Lindsay
@ 2005-09-28  7:08 ` Christopher Creutzig
  1 sibling, 0 replies; 11+ messages in thread
From: Christopher Creutzig @ 2005-09-28  7:08 UTC (permalink / raw)

Idris Samawi Hamid wrote:

>>But you should also explore DocBook-in-ConTeXt, which
>>uses ConTeXt's native XML processing capabilities.
> 
> 
> Is it possible to create a Word template that is isomorphic with a DocBook 
> format?

 You can write a Word template isomorphic to a (pretty large) subset of
DocBook, although I believe Word does not allow you to introduce new
types of crossreferences, so you can't reach everything DocBook has.
Whether you can make your authors use it consistently is a different
matter – DocBook uses, for example, different elements for different
types of what ConTeXt calls typing: <code> for inline code fragments,
<command> for something you invoke (use <option> for its options and
<symbol> for the placeholders to be replaced by actual values),
<computeroutput> should be obvious but there are also <screen> and
<screenshot> – the difference is a bit subtle; programmers might also
use <constant>, <errorcode>, <errorname>, <errortext>, <exceptionname>,
<funcdef>, <funcprototype>, <funcsynopsis>, <cmdsynopsis>,
<constructorsynopsis>, <arg>, <function>, <methodname>, <methodparam>,
<methodsynopsis>, <ooclass>, <ooexception>, <oointerface>,
<programlisting> and its annotated cousin <programlistingco>, <sgmltag>,
<structfield>, <structname>, <varargs>, <varname>; <envar> denotes
environment variables, <filename> is almost superfluous since it is a
special case of a <systemitem> (yes, many of these elements carry
further meta-information in their attributes), then there are the
unspecific <literal> and <literallayout> elements and also <markup>,
<userinput>, and finally there is also <uri> to format URLs and other URIs.

 Somewhat related elements also abound: <keycap> is used to denote keys
on the keyboard, <keycombo> for combinations of those keys.  <guibutton>
is used for the text on a button in a GUI, <guilabel>, <guimenu>,
<guimenuitem> and <guisubmenu> and many others.

 Note that I do not question this abundance of possibilities.  After
all, it is logical markup taken to an extreme and probably no one really
uses all of it, yet all the parts are already there if you want them.
I do question the likelihood of the average Word user (who, let's face
it, probably never used formats since the introductory course) making
good use of this.  Sure it is nice to have authors' first and last names
explicitly marked in your text, but someone has to go there and do that,
and if they don't see any difference on their screens after doing it,
they will get lazy and not do it for the fifteenth person they name.
Additionally, most of the DocBook elements may only appear nested in the
correct places in other elements, which makes using an isomorphic Word
template rather challenging even for the advanced user.
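
 The nesting constraint can be made concrete with a small checker.  This
is only a sketch in Python: the ALLOWED_PARENTS table is an invented
simplification and is NOT the real DocBook content model, which is far
larger and is normally enforced by validating against the DTD or schema.

```python
import xml.etree.ElementTree as ET

# Illustrative rules only: NOT the real DocBook content model.
ALLOWED_PARENTS = {
    "para": {"article", "section"},
    "keycombo": {"para"},
    "keycap": {"para", "keycombo"},
}


def nesting_errors(root):
    """List (child, parent) tag pairs that violate ALLOWED_PARENTS."""
    errors = []
    for parent in root.iter():
        for child in parent:
            allowed = ALLOWED_PARENTS.get(child.tag)
            if allowed is not None and parent.tag not in allowed:
                errors.append((child.tag, parent.tag))
    return errors


if __name__ == "__main__":
    doc = ET.fromstring(
        "<article><para><keycombo><keycap>Ctrl</keycap></keycombo></para>"
        "<keycap>X</keycap></article>"
    )
    print(nesting_errors(doc))
```

 A word processor template has no way to express rules like these, so
such mistakes would only surface after export.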

 If you would like to browse through the long list of markup items in
DocBook, please see http://docbook.org/tdg/en/html/docbook.html – and do
not be afraid; as I said above, very much of what is there is absolutely
special-purpose stuff.

> Adam (privately) suggested hiring someone to write a structured format for 
> authors. Is that where docbook comes in?

 Actually, I would not orient the thing to fit to DocBook.  DocBook is
an extremely flexible beast, so if, after designing the structured
format best suited for your needs (this does not need to involve any
xml), you want to map it to DocBook, that should not be any problem.

> Basically, authors in the humanities use Word and it's virtually a lost cause 
> getting them to switch to anything else, even free tools like OO.o (let alone 
> ConTeXt). It would have to be something where I could do
> word=>docbook=>ConTeXt.

 As I said: Offering Word is obviously a must, but if that were the only
option you offered, you'd be actively adding your part to making sure
the situation does not change.  And getting Word to export DocBook will
certainly be much harder than using OOo for that part.

Christopher


* Re: DOC/RTF to ConTeXt via XML
  2005-09-27 15:10 Idris Samawi Hamid
@ 2005-09-27 15:19 ` Adam Lindsay
  2005-09-28  7:08 ` Christopher Creutzig
  1 sibling, 0 replies; 11+ messages in thread
From: Adam Lindsay @ 2005-09-27 15:19 UTC (permalink / raw)


Idris Samawi Hamid said this at Tue, 27 Sep 2005 09:10:27 -0600:

>Adam (privately) suggested hiring someone to write a structured format for 
>authors. Is that where docbook comes in?

Ah, sorry about that. I meant you *could* hire someone to design a
format, but the bigger point was that it would be rather futile without
a user-level authoring tool backing it up!
-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Adam T. Lindsay, Computing Dept.     atl@comp.lancs.ac.uk
 Lancaster University, InfoLab21        +44(0)1524/510.514
 Lancaster, LA1 4WA, UK             Fax:+44(0)1524/510.492
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


* RE: DOC/RTF to ConTeXt via XML
@ 2005-09-27 15:10 Idris Samawi Hamid
  2005-09-27 15:19 ` Adam Lindsay
  2005-09-28  7:08 ` Christopher Creutzig
  0 siblings, 2 replies; 11+ messages in thread
From: Idris Samawi Hamid @ 2005-09-27 15:10 UTC (permalink / raw)


Hi Duncan,

I know little about xml and virtually nothing about Word (except that it's 
crap) so please forgive me if this is a stupid or clueless question-)

>But you should also explore DocBook-in-ConTeXt, which
>uses ConTeXt's native XML processing capabilities.

Is it possible to create a Word template that is isomorphic with a DocBook 
format?

Adam (privately) suggested hiring someone to write a structured format for 
authors. Is that where docbook comes in?

Basically, authors in the humanities use Word and it's virtually a lost cause 
getting them to switch to anything else, even free tools like OO.o (let alone 
ConTeXt). It would have to be something where I could do
word=>docbook=>ConTeXt.

<Sigh>

Best
Idris

============================
Professor Idris Samawi Hamid
Department of Philosophy
Colorado State University
Fort Collins, CO 80523


* Re: DOC/RTF to ConTeXt via XML
  2005-09-27 10:24 ` Duncan Hothersall
@ 2005-09-27 13:42   ` Christopher Creutzig
  0 siblings, 0 replies; 11+ messages in thread
From: Christopher Creutzig @ 2005-09-27 13:42 UTC (permalink / raw)

Duncan Hothersall wrote:
> Well, XSLT seems to have been designed, and certainly tends to be
> implemented, as a tool for simple transformations of small XML chunks.

 No, xslt is a tool for arbitrary xml -> xml conversions (and a little
more than that).  With a good implementation (say, saxon), working with
moderately large trees is pretty fast.  The stylesheet is actually
compiled before running.

> Obviously complex transformations can be constructed from a bunch of
> simple transformations, but there comes a point when you should really

 Just about any programming language gives you simple operations to
build whatever you want from.

> just use a better tool - though these tend to cost serious money (e.g.

 „Better“ depends on your task at hand.

> OmniMark). Also, most XSLT implementations use the DOM model, which is

 XSLT uses its own tree model (the XPath data model), which is different from the W3C DOM.

> fine for a 50Kb file but will be incredibly resource-hungry if you're
> processing files of 5Mb. At that point you want a streaming model, and

 That depends on what you want to do with your data.  For many of my
needs, a streaming model simply wouldn't work without keeping lots of
information (to be processed later) in memory, defeating the model.

 I have found splitting my data into files that form conceptual units
to be a good way, both for editing the files and for turnaround times.
(I am using Makefiles, so the granularity of finding unchanged items for
me is the file.)  We are talking about almost 15MB here, which I regard
as pretty much, considering it is almost pure text.
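
 The file-level granularity mentioned above boils down to a timestamp
comparison.  A rough Python sketch of that rule follows; it is not the
actual Makefile setup, and the function names and the ".tex" output
extension are assumptions made for the example.

```python
import os


def needs_rebuild(source, target):
    """True if target is missing or older than source (Makefile-style rule)."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)


def stale_units(sources, out_dir, ext=".tex"):
    """Return the source files whose converted output must be regenerated."""
    stale = []
    for src in sources:
        base = os.path.splitext(os.path.basename(src))[0]
        if needs_rebuild(src, os.path.join(out_dir, base + ext)):
            stale.append(src)
    return stale
```

 With one file per conceptual unit, editing a single unit means only that
unit is converted again, which is what keeps turnaround times short.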

 Again, I don't mind using something else on XML data.  I'm doing it
myself.  It all depends on what you want to do.  In the case of
transforming xml to ConTeXt, I would go for an xslt implementation, but
ymmv.  After all, the choice of tools always depends on many factors,
including familiarity.  (I've continued using perl instead of ruby for
ages, until recently, for that reason.)

> for a streaming model you want a better suited language than XSLT. As I
> say, horses for courses. For article-length pieces and simple
> transforms, XSLT might suffice.

 For number crunching, xslt is certainly inadequate.  Transforming books
of average length (say, 300-500 pages) is certainly doable, although I
would go for a transformation chapter-by-chapter, especially considering
that we are talking about a process where cross-references etc. are going
to be handled later in the chain.  But I thought we were talking about
article-length pieces anyway?

Christopher


* Re: DOC/RTF to ConTeXt via XML
       [not found] <20050927100004.7F435127E5@ronja.ntg.nl>
@ 2005-09-27 10:24 ` Duncan Hothersall
  2005-09-27 13:42   ` Christopher Creutzig
  0 siblings, 1 reply; 11+ messages in thread
From: Duncan Hothersall @ 2005-09-27 10:24 UTC (permalink / raw)


Slightly OT, sorry:

>>OpenOffice.org does allow you to attach an XSLT stylesheet to an export
>>process which therefore allows you to do a (limited) transformation from
>>the visual markup which is its native format to a more structured one
> 
>  Why „limited“?  

Well, XSLT seems to have been designed, and certainly tends to be
implemented, as a tool for simple transformations of small XML chunks.
Obviously complex transformations can be constructed from a bunch of
simple transformations, but there comes a point when you should really
just use a better tool - though these tend to cost serious money (e.g.
OmniMark). Also, most XSLT implementations use the DOM model, which is
fine for a 50Kb file but will be incredibly resource-hungry if you're
processing files of 5Mb. At that point you want a streaming model, and
for a streaming model you want a better suited language than XSLT. As I
say, horses for courses. For article-length pieces and simple
transforms, XSLT might suffice.

>  Also, don't limit your authors to Word.  Offering Word is obviously a
> requirement, but if you go the way through OOo, there would be no point
> in not offering an OOo template file.  If you are using a standard xml
> format, such as (a subset of) DocBook or TEI, you probably should accept
> articles in that format, too.  And, of course, ConTeXt.

Absolutely; particularly if you can offer authors an incentive or direct
benefit from adopting OO.o, such as speed of turnaround of proofs, etc.


* Re: DOC/RTF to ConTeXt via XML
  2005-09-27  8:05 ` Duncan Hothersall
@ 2005-09-27  9:03   ` Christopher Creutzig
  0 siblings, 0 replies; 11+ messages in thread
From: Christopher Creutzig @ 2005-09-27  9:03 UTC (permalink / raw)

Duncan Hothersall wrote:
>>Question: Is it possible to design a doc or rtf template that Open Office can 
>>convert to a sane, consistent xml format? 
> 
> 
> OpenOffice.org does allow you to attach an XSLT stylesheet to an export
> process which therefore allows you to do a (limited) transformation from
> the visual markup which is its native format to a more structured one

 Why „limited“?  Complicated things are just, well, a bit complicated to
achieve.  It is certainly possible to get a structured document from,
say, an average xhtml file.  I would prefer not to write that code,
though.  It would be rather boring and full of hard-to-read special cases.

> which you would need. But the biggest challenge is that all
> wordprocessors are designed for visual editing, meaning that there are,
> for example, 15 or so different ways to get a bulleted list in Word,
> creating 15 or so different RTF constructs, and coping with this can be
> a nightmare.

 Yes, it can.  (Although RTF is completely unrelated to this problem,
since OOo would read the Word file.  And the OOo step greatly simplifies
the problem, since iirc the OOo format has just one or maybe two ways of
saving bulleted lists.  Or were you referring to different bullets?)  The
stricter your rules for the authors are, the easier it is to write the
required xslt program.  If your authors expect to be able to write
chapter headers by manually switching to a font in the range of 20 to 24
pt and adding a number in front, you've got a hell of a coding session
in front of you.  If, otoh, you take the dictatorial approach of
telling them in advance that manual font changes (maybe apart from
pseudo-italics and pseudo-bold which will be mapped to \em in the end)
will simply be ignored, your code will be much easier but you may have a
problem with the authors.
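
 A sketch of the strict approach, in Python rather than xslt and under
loud assumptions: the <p>, <i>, <b> and <font> element names are invented
for illustration and are not the OOo element names.  Italic and bold are
both mapped to \em; any other inline formatting is dropped and only its
text is kept.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline element names, chosen for this example only.
EMPHASIS = {"i", "b"}


def inline_to_context(elem):
    """Render an element's mixed content, keeping only emphasis markup."""
    parts = [elem.text or ""]
    for child in elem:
        inner = inline_to_context(child)
        if child.tag in EMPHASIS:
            parts.append("{\\em %s}" % inner)
        else:
            parts.append(inner)  # e.g. <font>: ignore formatting, keep text
        parts.append(child.tail or "")  # text following the child element
    return "".join(parts)


para = ET.fromstring(
    '<p>A <i>very</i> <font size="22">big</font> <b>deal</b>.</p>'
)
print(inline_to_context(para))
```

 The whole policy fits in one small set; every additional kind of manual
formatting that has to be honoured adds another special case.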

> The FO approach (Paul Tremblay's focus) is one way to process XML to
> paginated output, but there are many others. Personally I don't like the
> FO approach, for a variety of reasons, but I'm sure others have had
> success with it. But you should also explore DocBook-in-ConTeXt, which
> uses ConTeXt's native XML processing capabilities. And don't rule out

 The advantage of using DocBook is that you get a very rich set of
capabilities.  The disadvantage can be described in almost the same
words, plus, as I said before, DocBook is one of the most verbose
formats in common use.  If you only use the format as an intermediate
step, that is irrelevant, but if your authors will send in files that
way, it is not.

> using a separate scripting language to convert XML into ConTeXt as a
> batch process, since that will give you the ultimate flexibility in
> accessing all of ConTeXt's abilities.

 Personally, I'd use xslt for that.  Navigating the xml tree is
extremely easy and writing out text instead of xml is not really a problem.

>>Question: Does the entire journal have to be programmed in xml or can
>>ConTeXt process xml locally? For example, I may have my own article done in
>>ConTeXt mixed with other articles done in rtf=>xml.
> 
> 
> You can just put XML into \startXMLdata ... \stopXMLdata blocks. I do
> this for MathML processing within a larger ConTeXt document.

 I'd approach Idris' problem the other way round: Transform the xml
files to ConTeXt and leave the ConTeXt files as is.  Then, texexec the
whole thing.
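
 One way this could look, sketched in Python: once every article exists
as a ConTeXt file, whether converted or hand-written, a generated master
file simply inputs them in order.  The function name and the file names
are hypothetical, and file names containing spaces are not handled.

```python
def build_driver(article_files):
    """Return the text of a ConTeXt master file that inputs each article."""
    lines = ["\\starttext"]
    for name in article_files:
        lines.append("\\input %s" % name)
    lines.append("\\stoptext")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical issue: one hand-written article, two converted ones.
    print(build_driver(["hamid.tex", "article2.tex", "article3.tex"]))
```

 The result would be written to a file and passed to texexec; the
converted and the native articles are then indistinguishable to ConTeXt.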

>>Any other advice (and/or pitfalls to watch for) would be appreciated. This 
>>sounds very promising!
> 
> 
> Horses for courses. It's possible to get sucked into things like an FO
> implementation or an XML conversion and find that you have spent months
> perfecting it and it only shaves half an hour off your production time!

 Amen.

 Also, don't limit your authors to Word.  Offering Word is obviously a
requirement, but if you go the way through OOo, there would be no point
in not offering an OOo template file.  If you are using a standard xml
format, such as (a subset of) DocBook or TEI, you probably should accept
articles in that format, too.  And, of course, ConTeXt.

Christopher


* Re: DOC/RTF to ConTeXt via XML
       [not found] <20050927074229.9EF85127E2@ronja.ntg.nl>
@ 2005-09-27  8:05 ` Duncan Hothersall
  2005-09-27  9:03   ` Christopher Creutzig
  0 siblings, 1 reply; 11+ messages in thread
From: Duncan Hothersall @ 2005-09-27  8:05 UTC (permalink / raw)


> Question: Is it possible to design a doc or rtf template that Open Office can 
> convert to a sane, consistent xml format? 

OpenOffice.org does allow you to attach an XSLT stylesheet to an export
process which therefore allows you to do a (limited) transformation from
the visual markup which is its native format to a more structured one
which you would need. But the biggest challenge is that all
wordprocessors are designed for visual editing, meaning that there are,
for example, 15 or so different ways to get a bulleted list in Word,
creating 15 or so different RTF constructs, and coping with this can be
a nightmare.

> If the Tremblay approach is rich 
> enough, that would solve a lot of problems! Here is my idea:
> 
> 1. Give each author a doc/rtf template for formatting their article;
> 2. Use OpenOffice to convert to xml;
> 3. Use the Tremblay method (have not tried it yet) to process this in Context.

The FO approach (Paul Tremblay's focus) is one way to process XML to
paginated output, but there are many others. Personally I don't like the
FO approach, for a variety of reasons, but I'm sure others have had
success with it. But you should also explore DocBook-in-ConTeXt, which
uses ConTeXt's native XML processing capabilities. And don't rule out
using a separate scripting language to convert XML into ConTeXt as a
batch process, since that will give you the ultimate flexibility in
accessing all of ConTeXt's abilities.

> Question: Does the entire journal have to be programmed in xml or can
> ConTeXt process xml locally? For example, I may have my own article done in
> ConTeXt mixed with other articles done in rtf=>xml.

You can just put XML into \startXMLdata ... \stopXMLdata blocks. I do
this for MathML processing within a larger ConTeXt document.

> Any other advice (and/or pitfalls to watch for) would be appreciated. This 
> sounds very promising!

Horses for courses. It's possible to get sucked into things like an FO
implementation or an XML conversion and find that you have spent months
perfecting it and it only shaves half an hour off your production time!
Also, you do tend to have to make compromises in design if you want to
be able to process directly from XML. But if you have sufficient
throughput and an appropriate design, it can be a real boon.

Hope that helps.

Duncan
