ntg-context - mailing list for ConTeXt users
From: "Karsten Heymann" <karsten.heymann@googlemail.com>
To: Yatskovsky <yatskovsky@gmail.com>,
	 "mailing list for ConTeXt users" <ntg-context@ntg.nl>
Subject: Re: Microsoft Word -> Context
Date: Mon, 2 Apr 2007 21:57:56 +0200
Message-ID: <d0f4811b0704021257k6ab843e1yd138913c44b350b@mail.gmail.com>
In-Reply-To: <627014675.20070402204746@gmail.com>

Hello Vyatcheslav,

2007/4/2, Vyatcheslav Yatskovsky <yatskovsky@gmail.com>:
> Then we need something like Word2ConText (or a macro written in VBA) to convert
> incoming papers to ConTeXt code and then easily assemble them. Something that
> resembles the famous Word2TeX application.

I recently built such a solution for a journal, hand-crafted for one
very specific document template. They now have to pre-format every
article with this template and export it to HTML, and my converter
turns that into ConTeXt. Be aware that this took a significant amount
of time (and money, as it was contract work). But the basic idea is
quite simple:

* Preformat the document in Word by applying special paragraph styles
  to all paragraphs (these map nicely to CSS classes).
* Export the Word document to HTML.
* Make XML from it with HTML Tidy.
* Filter out the huge amount of unneeded stuff (CSS, DIVs and the like).
* Go through the list of paragraphs and, for each paragraph type, do
  whatever that type requires.

I implemented it in Python using DOM and SAX; knowing what I know now,
I would start with ElementTree from the beginning. Unfortunately, as it
was contract work, I cannot give out the code, but if specific
questions arise, I will gladly share my experiences.
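
To make the last two steps of the list a bit more concrete, here is a
minimal sketch with ElementTree. It is not the code from the contract
project, only an illustration. It assumes the input is already
well-formed XHTML (for example the output of tidy -asxml) and that the
Word paragraph styles show up as the class attribute of <p> elements.
The style names in STYLE_MAP, the ConTeXt commands chosen for them and
the helper names are all made up for this example, and the escaping
covers only a small, incomplete set of TeX special characters.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: Word paragraph style (exported as a CSS class)
# -> ConTeXt template. Real style names depend on the document template.
STYLE_MAP = {
    "ArticleTitle": "\\title{{{text}}}",
    "SectionHead": "\\section{{{text}}}",
    "BodyText": "{text}",
}


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on XHTML tags."""
    return tag.rsplit("}", 1)[-1]


def escape_context(text):
    """Escape a few TeX special characters (deliberately incomplete)."""
    for ch in "%#&$_{}":
        text = text.replace(ch, "\\" + ch)
    return text


def convert(xhtml):
    """Turn styled <p> elements into ConTeXt source, ignoring the rest."""
    root = ET.fromstring(xhtml)
    blocks = []
    for elem in root.iter():
        if local_name(elem.tag) != "p":
            continue  # DIVs, style sheets etc. are simply not visited as output
        template = STYLE_MAP.get(elem.get("class"))
        if template is None:
            continue  # paragraph style we have no rule for
        # Flatten inline markup (spans, fonts) and normalise whitespace.
        text = " ".join("".join(elem.itertext()).split())
        blocks.append(template.format(text=escape_context(text)))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    sample = (
        '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
        '<p class="ArticleTitle">A Title</p>'
        '<p class="BodyText">Some <span>body</span> text.</p>'
        "</body></html>"
    )
    print(convert(sample))
```

Keeping the rules in one dictionary means a new paragraph style needs
only one new entry; anything more involved (lists, tables, footnotes)
would need real handler functions instead of string templates.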

Yours
Karsten

