From: Mica Semrick
To: mailing list for ConTeXt users
Newsgroups: gmane.comp.tex.context
Subject: Re: EPUB XHTML Format
Date: Fri, 6 Sep 2013 09:36:56 -0700

Another small note, since I just walked down the ePUB path: you'll be very
sad to find out that the rendering engines in a lot of popular readers are
inconsistent and won't render standard XHTML markup correctly (nest an
ordered list within an unordered list and then look at it in Adobe Digital
Editions and several other readers). "But it is just XHTML + CSS!" you'll
cry. "How can they not render it correctly?" I don't know, but it was an
extremely frustrating process. I even contacted Adobe to try and report this
nested list bug to them... their suggestion was that I could *pay* them to
work with "content experts" who would help me "correct" my source so that it
would render "correctly."

The best reader, imho, is iBooks on the iPad; nothing else I've seen comes
close. But that is one expensive eReader. :(

On Thu, Sep 5, 2013 at 3:00 PM, Thangalin wrote:
> Hi,
>
>> handle XML+CSS well. However, most (all?) EPUB readers don't. So, the
>> question is asking if instead ConTeXt could generate an XHTML
>
> Precisely.
>
>> If you need both EPUB and PDF, start with a semantically rich XML
>> vocabulary, e.g. DocBook. In this case you can relatively easy transform
>
> My database doesn't generate DocBook. It generates a custom XML document
> from which I generate a web page, and a LaTeX document (though soon to be
> ConTeXt!). There is no reason, technically, why I cannot convert the source
> XML to either DocBook or directly to EPUB.
> There are, however, problems doing that, which Aditya correctly surmises:
>
>> - Automatic section numbering taking care of different conversions.
>> - Automatic index generation and sorting
>> - Inserting hyphenation points at the appropriate place in the generated
>>   output (so that the browser can effectively rely on TeX's hyphenation
>>   algorithm to do line-breaking).
>> - Convert TeX math to MathML.
>>
>> The current ConTeXt XML source can translate a well formed ConTeXt
>> document into an XML document with the above features.
>
> Those are exactly the issues that I would love to resolve using ConTeXt
> for generating an EPUB. (The MathML isn't as important to me, but I can see
> other people wanting such a feature.)
>
>> What about accessibility? I expect that visually impaired people would
>> depend on document structure rather than its visualisation.
>
> That is a good point. The current XML structure produced by ConTeXt (Hans,
> correct me here if I'm mistaken) is not accessible, as it doesn't adhere to
> strict XHTML. I suspect that <div> tags would not be accessible -- the only
> way to provide true accessibility in EPUB format would be by using the
> strict XHTML tags.
>
>> for instance, we have more levels than H1..H6, so how to do H7? if someone
>> has to deal with that, he/she can as well transform all into H1 with some
>> class which is a local solution then
>
> I realize there is not going to be a one-to-one map of all possible
> ConTeXt macros to XHTML. Someone who has 7 levels of nested sections would
> either have to rewrite some Lua or perform some post-processing (e.g., with
> XSLT). I would posit that a document with 7 levels of nested sections is
> not going to be a common occurrence.
>
> When I talk about strict XHTML, I'm proposing that a _simple_ ConTeXt
> document (up to 6 header levels, numbered and unnumbered lists, images,
> text emphasis, etc.) should generate a simple, validating XHTML document.
> Trying to attain 100% coverage of ConTeXt transmogrification to XHTML is
> ridiculous when, I suspect, 80% coverage would meet most needs. :-)
>
> It is definitely possible to translate the ConTeXt EPUB output to XHTML.
> However, there are practical realities that hinder such an approach.
> Architecturally, if anyone is going to translate an XML document to EPUB
> format, it certainly won't be this way:
>
> *XML + XSLT -> ConTeXt File -> ConTeXt EPUB XML + XSLT -> EPUB + CSS*
>
> It'll be this way, which is less time-consuming, less complex, and less
> susceptible to error:
>
> *XML + XSLT (or API) -> EPUB + CSS*
>
> However, it does not, as we all know, produce output as feature-rich as
> leveraging the ConTeXt abilities that Aditya mentioned, which was the point:
>
> *XML + XSLT -> ConTeXt TeX -> EPUB + CSS*
>
> Kindest regards.
>
> ___________________________________________________________________________________
> If your question is of interest to others as well, please add an entry to
> the Wiki!
>
> maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
> webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
> archive  : http://foundry.supelec.fr/projects/contextrev/
> wiki     : http://contextgarden.net
> ___________________________________________________________________________________
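
P.S. The nested-list case mentioned at the top of this message (an ordered
list inside an unordered list) is easy to reproduce as a standalone test
page. What follows is only a minimal sketch, assuming Python 3 and its
standard xml.etree.ElementTree module; the function name
build_nested_list_page and the item texts are made up for illustration and
come from neither ConTeXt nor any reader's tooling. It only builds
well-formed XHTML with the <ol> placed inside an <li>; how a given reader
renders it still has to be checked by hand.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_nested_list_page() -> str:
    """Return a small XHTML page with an ordered list nested in an unordered list.

    Hypothetical helper for illustration only; names and texts are assumptions.
    """
    # Serialize the XHTML namespace as the default namespace (no prefix).
    ET.register_namespace("", XHTML_NS)

    def q(tag: str) -> str:
        # Qualify a tag name with the XHTML namespace.
        return f"{{{XHTML_NS}}}{tag}"

    html = ET.Element(q("html"))
    head = ET.SubElement(html, q("head"))
    ET.SubElement(head, q("title")).text = "Nested list test"
    body = ET.SubElement(html, q("body"))

    ul = ET.SubElement(body, q("ul"))
    ET.SubElement(ul, q("li")).text = "First unordered item"
    parent = ET.SubElement(ul, q("li"))
    parent.text = "Second unordered item"

    # The nested list sits inside an <li>, not directly inside the <ul>.
    ol = ET.SubElement(parent, q("ol"))
    for label in ("Step one", "Step two"):
        ET.SubElement(ol, q("li")).text = label

    return ET.tostring(html, encoding="unicode")


if __name__ == "__main__":
    print(build_nested_list_page())
```

Saving the output as a content document in a throwaway EPUB and opening it
in a few readers should be enough to compare how each one handles the
nesting.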