ntg-context - mailing list for ConTeXt users
From: "Idris Samawi Hamid" <ishamid@colostate.edu>
To: "mailing list for ConTeXt users" <ntg-context@ntg.nl>
Subject: Re: Doc to ConTeXt [was Re:  HTML to ConTeXt]
Date: Fri, 09 Nov 2007 20:33:58 -0700
Message-ID: <op.t1j326pmnx1yh1@your-b27fb1c401>
In-Reply-To: <9AE14A44-6B23-450B-B3E5-DEE82341AFBC@di.unito.it>

On Fri, 09 Nov 2007 18:30:36 -0700, Andrea Valle <valle@di.unito.it> wrote:

> After wasting my time with an awful PDF-to-HTML converter by
> Acrobat, I discovered this, which you may all know:
> http://pdftohtml.sourceforge.net/

Looks impressive...

> The HTML conversion is very good, both in the resulting rendering and
> in the sources, but after some tweaking I got interested in the XML
> conversion it allows.
> The XML format essentially encodes the page layout information;
> typically each line is an element. Plus, bold and italics are
> marked simply as <b> and <i>.
> I'm still struggling to understand how to do anything practical with XML
> processing in ConTeXt, so I switched back to Python.
> I used an incremental sax parser with some replacement.
> This is today's draft.
> Original:
> http://www.semiotiche.it/andrea/membrana/02%20imp.pdf
>
> Recomposed (no setup at all, only \enableregime[utf]):
> http://www.semiotiche.it/andrea/membrana/02imp.pdf

Looks VERY impressive... Tell me, how did you set up the cropmarks etc.?

> pdf --> pdftoxml --> xml --> python script --> tex --> pdf
>
> I recovered paragraphs, bold, emphasis, and footnotes, stripping hyphens and
> reassembling the text with footnote references. Not bad as a first step.
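
For readers who want to try the same route, here is a minimal sketch of the "xml --> python script --> tex" step. It is not Andrea's script, which was not posted in this message. It assumes the XML has one <text> element per line, with <b> and <i> inline, as described above. The names LineHandler, join_lines and xml_to_context are made up for illustration, the de-hyphenation is deliberately naive, and TeX special characters are not escaped.

```python
import xml.sax


class LineHandler(xml.sax.ContentHandler):
    """Collect one string per <text> element, mapping <b>/<i> to ConTeXt groups."""

    # Assumed mapping; adjust to whatever markup your ConTeXt setup prefers.
    OPEN = {"b": "{\\bf ", "i": "{\\em "}

    def __init__(self):
        super().__init__()
        self.lines = []      # finished lines, in document order
        self.current = None  # pieces of the line being read, or None outside <text>

    def startElement(self, name, attrs):
        if name == "text":
            self.current = []
        elif name in self.OPEN and self.current is not None:
            self.current.append(self.OPEN[name])

    def characters(self, content):
        if self.current is not None:
            self.current.append(content)

    def endElement(self, name):
        if name == "text" and self.current is not None:
            line = "".join(self.current).strip()
            if line:
                self.lines.append(line)
            self.current = None
        elif name in self.OPEN and self.current is not None:
            self.current.append("}")


def join_lines(lines):
    """Join lines into running text, gluing words split by a trailing hyphen.

    Naive: a genuine hyphen at the end of a line is removed as well.
    """
    out = ""
    for line in lines:
        if out.endswith("-"):
            out = out[:-1] + line
        elif out:
            out += " " + line
        else:
            out = line
    return out


def xml_to_context(chunks):
    """Feed the XML to the SAX parser piece by piece and return ConTeXt text."""
    handler = LineHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)  # incremental parsing: chunks may end mid-tag
    parser.close()
    return join_lines(handler.lines)
```

To use it, read the XML file in blocks and pass them to xml_to_context, then write the result into a ConTeXt source file. Paragraph breaks and footnotes would need extra logic, for example based on the position attributes of each line.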

Did you also try pdftohtml --> html --> context?

> I guess that you XML gurus could probably do this much more easily and cleanly.
> So, just for my very specific needs, I can probably take
> Word sources, convert them to PDF, and then finally reach ConTeXt as
> discussed.

Again, very nice stuff!

Best wishes
Idris

-- 
Professor Idris Samawi Hamid, Editor-in-Chief
International Journal of Shi`i Studies
Department of Philosophy
Colorado State University
Fort Collins, CO 80523
