From: Thangalin
Newsgroups: gmane.comp.tex.context
Subject: Re: EPUB XHTML Format
Date: Thu, 5 Sep 2013 15:00:18 -0700
To: mailing list for ConTeXt users

Hi,

> handle XML+CSS well. However, most (all?) EPUB readers don't. So, the
> question is asking if instead ConTeXt could generate a XHTML

Precisely.

>> If you need both EPUB and PDF, start with a semantically rich XML
>> vocabulary, e.g. DocBook. In this case you can relatively easily transform

My database doesn't generate DocBook. It generates a custom XML document from which I generate a web page and a LaTeX document (though soon to be ConTeXt!). There is no technical reason why I cannot convert the source XML to either DocBook or directly to EPUB. There are, however, problems with doing that, which Aditya correctly identifies:

> - Automatic section numbering taking care of different conversions.
> - Automatic index generation and sorting
> - Inserting hyphenation points at the appropriate place in the generated
>   output (so that the browser can effectively rely on TeX's hyphenation
>   algorithm to do line-breaking).
> - Convert TeX math to MathML.
>
> The current ConTeXt XML source can translate a well-formed ConTeXt
> document into an XML document with the above features.

Those are exactly the issues that I would love to resolve by using ConTeXt to generate an EPUB. (The MathML isn't as important to me, but I can see other people wanting such a feature.)

> What about accessibility? I expect that visually impaired people would
> depend on document structure rather than its visualisation.

That is a good point. The current XML structure produced by ConTeXt (Hans, correct me here if I'm mistaken) is not accessible, as it doesn't adhere to strict XHTML. I suspect that <div> tags would not be accessible; the only way to provide true accessibility in EPUB format would be to use strict XHTML tags.

> for instance, we have more levels than H1..H6, so how to do H7? if someone
> has to deal with that, he/she can as well transform all into H1 with some
> class which is a local solution then

I realize there is not going to be a one-to-one map of all possible ConTeXt macros to XHTML. Someone who has 7 levels of nested sections would either have to rewrite some Lua or perform some post-processing (e.g., with XSLT). I would posit that a document with 7 levels of nested sections is not going to be a common occurrence.
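To make the post-processing option concrete, here is a minimal Python sketch of one possible fallback for titles deeper than six levels. The function names are hypothetical, and clamping to h6 with a class is only an assumed convention for illustration; it is not what ConTeXt's export does, and the quoted suggestion of using H1 with a class would work the same way.

```python
from xml.sax.saxutils import escape, quoteattr


def heading_for_depth(depth):
    """Return (tag, attributes) for a section title at the given nesting depth.

    Assumption: depths 1..6 map directly to h1..h6; anything deeper is
    clamped to h6 and tagged with a class so CSS can still distinguish it.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    if depth <= 6:
        return f"h{depth}", {}
    # XHTML defines no h7, so keep the deepest real heading and record the level.
    return "h6", {"class": f"level-{depth}"}


def render_heading(depth, title):
    """Serialize a section title as an XHTML heading element."""
    tag, attrs = heading_for_depth(depth)
    attr_text = "".join(f" {name}={quoteattr(value)}" for name, value in attrs.items())
    return f"<{tag}{attr_text}>{escape(title)}</{tag}>"


if __name__ == "__main__":
    print(render_heading(2, "Setup"))
    print(render_heading(7, "Deep & narrow"))
```

A stylesheet could then target the hypothetical `level-7` class to give the seventh level its own appearance while the markup stays within the standard heading elements.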

When I talk about strict XHTML, I'm proposing that a _simple_ ConTeXt document (up to 6 header levels, numbered and unnumbered lists, images, text emphasis, etc.) should generate a simple, validating XHTML document. Trying to attain 100% coverage of ConTeXt transmogrification to XHTML is ridiculous when, I suspect, 80% coverage would meet most needs. :-)
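As a rough illustration of what "simple in, simple out" could look like, the Python sketch below maps a small structural XML vocabulary to plain XHTML elements. The source element names (`document`, `section`, `paragraph`, `emphasis`, `itemize`, `enumerate`, `item`) are invented for this example and are not the actual ConTeXt export vocabulary. The sketch only produces well-formed markup; checking validity would still need a separate validator.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical source element names mapped to XHTML element names.
TAG_MAP = {
    "paragraph": "p",
    "emphasis": "em",
    "itemize": "ul",
    "enumerate": "ol",
    "item": "li",
}


def append_converted(src, parent, depth):
    """Convert one source element and append the result to `parent`."""
    if src.tag == "section":
        # A section becomes a heading followed by its converted children.
        level = min(depth + 1, 6)
        heading = ET.SubElement(parent, f"h{level}")
        heading.text = src.get("title", "")
        for child in src:
            append_converted(child, parent, depth + 1)
        return
    # Unsupported elements raise KeyError rather than passing through silently.
    out = ET.SubElement(parent, TAG_MAP[src.tag])
    out.text = src.text
    out.tail = src.tail
    for child in src:
        append_converted(child, out, depth)


def to_xhtml(source_xml):
    """Return an XHTML document string for the given source XML string."""
    src_root = ET.fromstring(source_xml)
    html = ET.Element("html", {"xmlns": XHTML_NS})
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = src_root.get("title", "")
    body = ET.SubElement(html, "body")
    for child in src_root:
        append_converted(child, body, 0)
    return ET.tostring(html, encoding="unicode")


SAMPLE = (
    '<document title="Example">'
    '<section title="Intro">'
    "<paragraph>Some <emphasis>important</emphasis> text.</paragraph>"
    '<section title="Details"><itemize><item>one</item></itemize></section>'
    "</section>"
    "</document>"
)

if __name__ == "__main__":
    print(to_xhtml(SAMPLE))
```

Anything outside the small mapping table fails loudly, which matches the idea of covering the common 80% and leaving the rest to custom post-processing.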

It is definitely possible to translate the ConTeXt EPUB output to XHTML. However, there are practical realities that hinder such an approach. Architecturally, if anyone is going to translate an XML document to EPUB format, it certainly won't be this way:

*XML + XSLT -> ConTeXt File -> ConTeXt EPUB XML + XSLT -> EPUB + CSS*

It'll be this way, which is less time-consuming, less complex, and less error-prone:

*XML + XSLT (or API) -> EPUB + CSS*

However, as we all know, that route does not produce output as feature-rich as one that leverages the ConTeXt abilities Aditya mentioned, which was the point:

*XML + XSLT -> ConTeXt TeX -> EPUB + CSS*

Kindest regards.
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________