From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.comp.tex.context/84125
Path: news.gmane.org!not-for-mail
From: Thangalin <thangalin@gmail.com>
Newsgroups: gmane.comp.tex.context
Subject: Re: EPUB XHTML Format
Date: Thu, 5 Sep 2013 15:00:18 -0700
Message-ID: <CAANrE7pjNTxXP95c=R+KWACkx_SY8DU0rrwC_L6deNW5m8WhOg@mail.gmail.com>
References: <CAANrE7r9OaMc65SaiJYhZ8x3=GtrQzYRrmZrfiLvcHiuot=xJA@mail.gmail.com>
 <alpine.LNX.2.02.1309051414280.29678@ybpnyubfg.ybpnyqbznva>
Reply-To: mailing list for ConTeXt users <ntg-context@ntg.nl>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1232536436=="
X-Trace: ger.gmane.org 1378418424 15455 80.91.229.3 (5 Sep 2013 22:00:24 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Thu, 5 Sep 2013 22:00:24 +0000 (UTC)
To: mailing list for ConTeXt users <ntg-context@ntg.nl>
In-Reply-To: <alpine.LNX.2.02.1309051414280.29678@ybpnyubfg.ybpnyqbznva>
X-BeenThere: ntg-context@ntg.nl
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: mailing list for ConTeXt users <ntg-context.ntg.nl>
List-Unsubscribe: <http://www.ntg.nl/mailman/options/ntg-context>,
 <mailto:ntg-context-request@ntg.nl?subject=unsubscribe>
List-Archive: <http://www.ntg.nl/pipermail/ntg-context>
List-Post: <mailto:ntg-context@ntg.nl>
List-Help: <mailto:ntg-context-request@ntg.nl?subject=help>
List-Subscribe: <http://www.ntg.nl/mailman/listinfo/ntg-context>,
 <mailto:ntg-context-request@ntg.nl?subject=subscribe>
Errors-To: ntg-context-bounces@ntg.nl
Original-Sender: ntg-context-bounces@ntg.nl
Xref: news.gmane.org gmane.comp.tex.context:84125
Archived-At: <http://permalink.gmane.org/gmane.comp.tex.context/84125>

--===============1232536436==
Content-Type: multipart/alternative; boundary=047d7bd76898340e5d04e5aa0c2f

--047d7bd76898340e5d04e5aa0c2f
Content-Type: text/plain; charset=ISO-8859-1

Hi,

> handle XML+CSS well. However, most (all?) EPUB readers don't. So, the
> question is asking if instead ConTeXt could generate a XHTML


Precisely.


>> If you need both EPUB and PDF, start with a semantically rich XML
>> vocabulary, e.g. DocBook. In this case you can relatively easy transform
My database doesn't generate DocBook. It generates a custom XML document
from which I generate a web page, and a LaTeX document (though soon to be
ConTeXt!). There is no reason, technically, why I cannot convert the source
XML to either DocBook or directly to EPUB. There are, however, problems
doing that, which Aditya correctly surmises:


> - Automatic section numbering taking care of different conversions.
> - Automatic index generation and sorting
> - Inserting hyphenation points at the appropriate place in the generated
> output (so that the browser can effectively rely on TeX's hyphenation
> algorithm to do line-breaking).
> - Convert TeX math to MathML.
>
> The current ConTeXt XML source can translate a well-formed ConTeXt
> document into an XML document with the above features.


Those are exactly the issues that I would love to resolve using ConTeXt for
generating an EPUB. (The MathML isn't as important to me, but I can see
other people wanting such a feature.)
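
On the hyphenation item in Aditya's list, here is a minimal sketch of what
"inserting hyphenation points" could look like in the generated output. It
assumes the break positions have already been computed elsewhere (for
example by TeX); the function name and the list-of-offsets interface are
hypothetical, and nothing here reproduces TeX's hyphenation algorithm. The
sketch only places a soft hyphen (U+00AD) at each given offset, which a
reader application may use as a permitted line-break point.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode SOFT HYPHEN, invisible unless a line breaks there


def insert_soft_hyphens(word: str, break_points: list[int]) -> str:
    """Insert a soft hyphen before each character offset in break_points.

    The offsets are assumed to come from an external hyphenator; this
    function does not compute them.
    """
    pieces = []
    previous = 0
    for point in sorted(set(break_points)):
        if not 0 < point < len(word):
            raise ValueError(f"break point {point} is outside the word")
        pieces.append(word[previous:point])
        previous = point
    pieces.append(word[previous:])
    return SOFT_HYPHEN.join(pieces)


# Offsets 2 and 6 split "hyphenation" into "hy", "phen", "ation".
hyphenated = insert_soft_hyphens("hyphenation", [2, 6])
```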

> What about accessibility? I expect that visually impaired people would
> depend on document structure rather than its visualisation.


That is a good point. The current XML structure produced by ConTeXt (Hans,
correct me here if I'm mistaken) is not accessible, as it doesn't adhere to
strict XHTML. I suspect that generic <div> tags, which carry no meaning of
their own, would not be accessible -- the only way to provide true
accessibility in EPUB format would be by using the strict XHTML tags.

> for instance, we have more levels than H1..H6, so how to do H7? if someone
> has to deal with that, he/she can as well transform all into H1 with some
> class which is a local solution then


I realize there is not going to be a one-to-one map of all possible ConTeXt
macros to XHTML. Someone who has 7 levels of nested sections would either
have to rewrite some Lua or perform some post-processing (e.g., with XSLT).
I would posit that a document with 7 levels of nested sections is not going
to be a common occurrence.
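
To make the H7 question concrete, here is one possible post-processing
rule, sketched in Python rather than Lua or XSLT. It is only an
illustration: the function names and the "depth-N" class convention are
made up for this example and are not anything ConTeXt produces. Depths 1
to 6 map to h1..h6; anything deeper is clamped to h6, and the real depth
is kept in a class so CSS can still distinguish the levels.

```python
from xml.sax.saxutils import escape


def heading_tag(depth: int, max_level: int = 6) -> tuple[str, str]:
    """Return (tag name, CSS class) for a 1-based section depth."""
    if depth < 1:
        raise ValueError("section depth is 1-based")
    level = min(depth, max_level)  # XHTML stops at h6, so clamp deeper levels
    return f"h{level}", f"depth-{depth}"


def render_heading(depth: int, title: str) -> str:
    """Render one heading element with the title text XML-escaped."""
    tag, css_class = heading_tag(depth)
    return f'<{tag} class="{css_class}">{escape(title)}</{tag}>'


# A seventh-level section becomes an h6 that still records its true depth.
deep_heading = render_heading(7, "Deeply nested")
```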

When I talk about strict XHTML, I'm proposing that a _simple_ ConTeXt
document (up to 6 header levels, numbered and unnumbered lists, images,
text emphasis, etc.) should generate a simple, validating XHTML document.
Trying to attain 100% coverage of ConTeXt transmogrification to XHTML is
ridiculous when, I suspect, 80% coverage would meet most needs. :-)
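
As a rough picture of what "simple in, simple out" means here, the sketch
below converts a small structured document into plain XHTML elements
(headings, paragraphs, emphasis, lists). The input vocabulary (document,
section, para, emph, list, item) is invented for the example; it is neither
my database's XML nor the ConTeXt export format. The output is not checked
against any DTD or schema, so "validating" remains a goal rather than
something this code demonstrates.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def convert_children(node, parent, depth):
    """Append XHTML equivalents of node's children to parent."""
    for child in node:
        if child.tag == "section":
            # Section depth maps to h1..h6; deeper levels are clamped to h6.
            heading = ET.SubElement(parent, f"h{min(depth, 6)}")
            heading.text = child.get("title", "")
            convert_children(child, parent, depth + 1)
        elif child.tag == "para":
            p = ET.SubElement(parent, "p")
            p.text = child.text
            for inline in child:
                if inline.tag == "emph":
                    em = ET.SubElement(p, "em")
                    em.text = inline.text
                    em.tail = inline.tail
        elif child.tag == "list":
            tag = "ol" if child.get("type") == "numbered" else "ul"
            listing = ET.SubElement(parent, tag)
            for item in child.findall("item"):
                ET.SubElement(listing, "li").text = item.text
        # Any other element is silently skipped in this sketch.


def to_xhtml(source_xml: str) -> str:
    """Convert the hypothetical source vocabulary to an XHTML string."""
    src = ET.fromstring(source_xml)
    html = ET.Element("html", {"xmlns": XHTML_NS})
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = src.get("title", "Untitled")
    body = ET.SubElement(html, "body")
    convert_children(src, body, depth=1)
    return ET.tostring(html, encoding="unicode")
```

The remaining 20% (floats, math, cross-references, and so on) is exactly
what this kind of element-by-element mapping leaves out.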

It is definitely possible to translate the ConTeXt EPUB output to XHTML.
However, there are practical realities that hinder such an approach.
Architecturally, if anyone is going to translate an XML document to EPUB
format, it certainly won't be this way:

*XML + XSLT -> ConTeXt File -> ConTeXt EPUB XML + XSLT -> EPUB + CSS*

It'll be this way, which is less time-consuming, less complex, and less
error-prone:

*XML + XSLT (or API) -> EPUB + CSS*

However, as we all know, it does not produce output as feature-rich as
leveraging the ConTeXt abilities that Aditya mentioned, which was the point:

*XML + XSLT -> ConTeXt TeX -> EPUB + CSS*
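
Whichever route produces the XHTML and CSS, the final "-> EPUB" step is
mostly packaging. The sketch below shows that step in Python under several
assumptions: the file names (OEBPS/content.opf, chapter.xhtml, style.css)
and the function name are arbitrary choices for the example, the package
file is a bare-bones EPUB 2 style OPF without the NCX or navigation
document a complete book would need, and the result has not been run
through a validator. The one firm requirement it illustrates is that the
mimetype entry comes first in the ZIP archive and is stored uncompressed.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

OPF_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="book-id">{identifier}</dc:identifier>
  </metadata>
  <manifest>
    <item id="chapter" href="chapter.xhtml" media-type="application/xhtml+xml"/>
    <item id="style" href="style.css" media-type="text/css"/>
  </manifest>
  <spine>
    <itemref idref="chapter"/>
  </spine>
</package>
"""


def build_epub(xhtml: str, css: str, title: str, identifier: str) -> bytes:
    """Pack one XHTML file and one stylesheet into an EPUB-shaped ZIP archive."""
    opf = OPF_TEMPLATE.format(title=escape(title), identifier=escape(identifier))
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as epub:
        # The mimetype entry must be the first file and must not be compressed.
        epub.writestr("mimetype", "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        for name, data in (
            ("META-INF/container.xml", CONTAINER_XML),
            ("OEBPS/content.opf", opf),
            ("OEBPS/chapter.xhtml", xhtml),
            ("OEBPS/style.css", css),
        ):
            epub.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()
```

Writing the returned bytes to a file with an .epub extension gives
something to feed to a validator or a reader application for testing.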

Kindest regards.

--047d7bd76898340e5d04e5aa0c2f--

--===============1232536436==--