ntg-context - mailing list for ConTeXt users
* EPUB XHTML Format
@ 2013-09-04  1:19 Thangalin
  2013-09-04  9:20 ` Hans Hagen
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Thangalin @ 2013-09-04  1:19 UTC (permalink / raw)
  To: mailing list for ConTeXt users


[-- Attachment #1.1: Type: text/plain, Size: 1241 bytes --]

Hi,

The attached t.tex file produces the attached t.xhtml file. I have looked
at the following documents:

   - http://en.wikipedia.org/wiki/EPUB#Open_Publication_Structure_2.0.1
   - http://en.wikipedia.org/wiki/DTBook
   - http://www.idpf.org/epub/20/spec/OPS_2.0.1_draft.htm
   - http://www.w3.org/TR/xhtml11/doctype.html
   - http://www.w3.org/TR/html5/sections.html

It seems that the macros in t.tex are being written out as XML elements,
verbatim. It is my understanding that these XML elements, however, do not
conform to the minimal content models associated with XHTML 1.1.

What needs to happen to take a minimal ConTeXt file (such as the attached)
to produce a minimum viable EPUB that:

   - Generates XHTML headers (including <!DOCTYPE and <html...>)
   - Produces images as img tags, rather than float tags.
   - Uses typical XHTML tags for <body> elements (e.g., <ol> for ordered
   lists).

Ideally, I would like to do something such as:

   - context t.tex
   - mtxrun --script epub --make t.specification

to generate an EPUB that passes validation of
epubcheck<http://code.google.com/p/epubcheck/wiki/Library>,
with an output XHTML file that more closely matches the XHTML specification.

How can I help?

Kind regards.

[-- Attachment #1.2: Type: text/html, Size: 1807 bytes --]

[-- Attachment #2: t.tex --]
[-- Type: application/x-tex, Size: 995 bytes --]

[-- Attachment #3: t.xhtml --]
[-- Type: application/xhtml+xml, Size: 3036 bytes --]

[-- Attachment #4: epub-errors.log --]
[-- Type: application/octet-stream, Size: 1647 bytes --]

Epubcheck Version 3.0.1

Validating against EPUB version 2.0
ERROR: t.tree/t.epub/OEBPS/t.opf(18,82): value of attribute "id" is invalid; must be an XML name without colons
WARNING: t.tree/t.epub/OEBPS/toc.ncx: meta@dtb:uid content 'BookId' should conform to unique-identifier in content.opf: 'urn:uuid:26204a45-4754-ab8c-067d-97c51c130f64'
ERROR: t.tree/t.epub/OEBPS/t.xhtml(9,202): elements from namespace "" are not allowed
ERROR: t.tree/t.epub/OEBPS/t.xhtml(12,26): element "xhtml:a" not allowed here; expected element "xhtml:html"
ERROR: t.tree/t.epub/OEBPS/t.xhtml(12,26): attribute "name" not allowed here; expected attribute "accesskey", "charset", "class", "coords", "dir", "href", "hreflang", "id", "lang", "rel", "rev", "shape", "style", "tabindex", "target", "title", "type" or "xml:lang"
ERROR: t.tree/t.epub/OEBPS/t.xhtml(12,87): elements from namespace "" are not allowed
ERROR: t.tree/t.epub/OEBPS/t.xhtml(36,27): element "xhtml:a" not allowed here; expected element "xhtml:html"
ERROR: t.tree/t.epub/OEBPS/t.xhtml(36,27): attribute "name" not allowed here; expected attribute "accesskey", "charset", "class", "coords", "dir", "href", "hreflang", "id", "lang", "rel", "rev", "shape", "style", "tabindex", "target", "title", "type" or "xml:lang"
ERROR: t.tree/t.epub/OEBPS/t.xhtml(36,94): elements from namespace "" are not allowed
ERROR: t.tree/t.epub/OEBPS/t.xhtml(40,98): elements from namespace "" are not allowed
ERROR: t.tree/t.epub/OEBPS/t.xhtml(44,75): elements from namespace "" are not allowed
ERROR: t.tree/t.epub/OEBPS/t.xhtml(47,107): elements from namespace "" are not allowed

Check finished with warnings or errors
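
One class of error in the log above (attribute "name" not allowed on "xhtml:a", with "id" among the accepted attributes) could be handled in a postprocessing pass. The following is only a minimal sketch in Python, assuming the anchors live in the XHTML namespace as the log suggests. The sample input is made up for illustration and is not the real t.xhtml. The sketch does not address the 'elements from namespace ""' errors, and it does not check that the resulting id values are XML names without colons.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def rename_anchor_names(root):
    """Move the `name` attribute of XHTML <a> elements to `id`.

    Returns the number of anchors that were changed. An existing `id`
    is kept as it is; the `name` attribute is dropped either way.
    """
    changed = 0
    for anchor in root.iter("{%s}a" % XHTML_NS):
        name = anchor.attrib.pop("name", None)
        if name is not None:
            anchor.attrib.setdefault("id", name)
            changed += 1
    return changed

# Hypothetical sample input (an assumption, not the attached file).
SAMPLE = (
    '<document xmlns:xhtml="http://www.w3.org/1999/xhtml">'
    '<xhtml:a name="aut_1" href="#aut_2">link</xhtml:a>'
    '</document>'
)
```

Calling `rename_anchor_names(ET.fromstring(SAMPLE))` changes the tree in place, so the result can be written back with `ET.tostring`.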


[-- Attachment #5: Type: text/plain, Size: 485 bytes --]

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________


* Re: EPUB XHTML Format
  2013-09-04  1:19 EPUB XHTML Format Thangalin
@ 2013-09-04  9:20 ` Hans Hagen
  2013-09-04 17:55   ` Thangalin
  2013-09-05 16:38   ` Hans Hagen
  2013-09-05 18:11 ` honyk
       [not found] ` <00b501ceaa63$61805e50$24811af0$@tosovsky@email.cz>
  2 siblings, 2 replies; 24+ messages in thread
From: Hans Hagen @ 2013-09-04  9:20 UTC (permalink / raw)
  To: ntg-context

On 9/4/2013 3:19 AM, Thangalin wrote:
> Hi,
>
> The attached t.tex file produces the attached t.xhtml file. I have
> looked at the following documents:
>
>   * http://en.wikipedia.org/wiki/EPUB#Open_Publication_Structure_2.0.1
>   * http://en.wikipedia.org/wiki/DTBook
>   * http://www.idpf.org/epub/20/spec/OPS_2.0.1_draft.htm
>   * http://www.w3.org/TR/xhtml11/doctype.html
>   * http://www.w3.org/TR/html5/sections.html
>
> It seems that the macros in t.tex are being written out as XML elements,
> verbatim. It is my understanding that these XML elements, however, do
> not conform to the minimal content models associated with XHTML 1.1.

you get a representation in xml indeed, not verbatim, but as close
as possible to the generic (parent) structure elements in context

of course we could alternatively export all as <div 
class="tag-subtag-..."> but i don't like that too much; html itself is 
not rich enough for our purpose

> What needs to happen to take a minimal ConTeXt file (such as the
> attached) to produce a minimum viable EPUB that:
>
>   * Generates XHTML headers (including <!DOCTYPE and <html...>)

not needed as we're 'standalone'

>   * Produces images as img tags, rather than float tags.

the css can deal with them (info is written to files for that)

the only real problematic thing is hyperlinks as css has no provision 
for that so there's an option to inject <a>...

>   * Uses typical XHTML tags for <body> elements (e.g., <ol> for ordered
>     lists).

xhtml has no typical tags .. it's xml + css (or xslt) ... unfortunately 
browsers have messed up html so much (extensions, too tolerant support 
for unmatched tags, different rendering models) that xhtml never really 
took off

the export of context is in fact just xml, and by tagging it as xhtml we
can apply css to it; but if someone has a workflow for producing epub, an
option is to postprocess that xml file into whatever epub one wants
(i.e. the export is generic and carries as much info as possible)

> Ideally, I would like to do something such as:
>
>   * context t.tex
>   * mtxrun --script epub --make t.specification
>
> to generate an EPUB that passes validation of epubcheck
> <http://code.google.com/p/epubcheck/wiki/Library>, with an output XHTML
> file that more closely matches the XHTML specification.

Every time we look into epub there's another issue ... it's not a
standard but a reverse-engineered application mess (happens often with
xml: turn some application data structures into xml and call it a standard)

I only tested (long ago already) with some firefox plugin (i don't have
a recent epub device, only an old first generation one which is dead
slow, never really used, probably broken by now) and i refuse to buy a
new one till resolution is decent (and i only want generic devices, not
something bound to some shop)

> How can I help?

by testing

as i have no real use/demand for epub it's not something i look into on 
a daily basis

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: EPUB XHTML Format
  2013-09-04  9:20 ` Hans Hagen
@ 2013-09-04 17:55   ` Thangalin
  2013-09-05 13:55     ` Hans Hagen
  2013-09-05 16:38   ` Hans Hagen
  1 sibling, 1 reply; 24+ messages in thread
From: Thangalin @ 2013-09-04 17:55 UTC (permalink / raw)
  To: mailing list for ConTeXt users


[-- Attachment #1.1: Type: text/plain, Size: 3566 bytes --]

Hi.

> of course we could alternatively export all as <div class="tag-subtag-...">
> but i don't like that too much; html itself is not rich enough for our
> purpose
>

What about giving developers the ability to change the destination element?
For example:

\setuplist[chapter][
  xml={\starttag[h1]#1\stoptag}
]


Would produce, upon export:

<h1>Chapter</h1>


Or (using "export" instead of "xml"; I don't care what it is named):

\setuplist[chapter][
  export={\starttag[div]\startattribute[class]{chapter}#1\stopattribute\stoptag}
]


Similarly, this would produce:

<div class="chapter">Chapter</div>


This would offer the flexibility of custom XML documents without affecting
the default behaviour.

>   * Generates XHTML headers (including <!DOCTYPE and <html...>)
>
> not needed as we're 'standalone'
>

Having the ability to produce the <!DOCTYPE...> and <html> elements could
be as simple as:

\setupexport[
  standalone=no,
]



>   * Produces images as img tags, rather than float tags.
>
> the css can deal with them (info is written to files for that)
>

Yes, but they aren't standard. There is an ecosystem of tools (e.g.,
Calibre, normalizing CSS templates, etc.), not to mention a widespread
knowledge-base, that groks the minimal XHTML specification. Plus, using XML
tags that are not in the minimal XHTML spec. means more testing on more
devices to make sure that their XHTML parsers render correctly.


> xhtml has no typical tags .. it's xml + css (or xslt) ... unfortunately
> browsers have


That is, a Strictly Conforming XHTML Document, as per:

http://www.w3.org/TR/2000/REC-xhtml1-20000126/#docconf

> the export of context is in fact just xml, and by tagging it as xhtml we
> can apply css to it; but if someone has a workflow for producing epub an
> option is to postprocess that xml file into whatever epub one wants
>

I could transform the ConTeXt-generated XML into strictly conforming XHTML,
but it was a step I was hoping to avoid. Right now my process is:

   1. Convert XML data to a ConTeXt .tex file.
   2. Convert ConTeXt to either PDF or EPUB.
   3. Stylize EPUB using CSS.

I want to use ConTeXt here (instead of going directly from XML data to
EPUB) because ConTeXt provides functionality such as multiple indexes,
table-of-contents, and bundling the .epub. Having an extra step to generate
strictly conforming XHTML is architecturally painful as it means
transforming the document three times (XML -> ConTeXt, ConTeXt -> XML, then
XML -> XHTML).


> Every time we look into epub there's another issue ... it's not a standard
> but a reverse-engineered application mess (happens often with xml: turn some
> application data structures into xml and call it a standard)
>

Some book vendors only accept validating EPUBs. ConTeXt is documented as
being able to generate EPUBs. The documentation should state the EPUBs do
not validate and do not generate strictly conforming XHTML.

I have spent the last three weeks converting documents from LaTeX to
ConTeXt because the documentation stated that ConTeXt can produce EPUBs.
While true, the documentation did not mention its shortcomings. Had I known
in advance, I probably would have gone straight to EPUB using Java or, with
a little revulsion, PHP classes. ;-) That said, I probably should have
tested this feature sooner. :-)

> as i have no real use/demand for epub it's not something i look into on a
> daily basis
>

How can I help resolve these issues?

Merely "testing" (which I am happy to do) isn't going to produce a strictly
conforming XHTML document.

Kindest regards.

[-- Attachment #1.2: Type: text/html, Size: 8125 bytes --]


* Re: EPUB XHTML Format
  2013-09-04 17:55   ` Thangalin
@ 2013-09-05 13:55     ` Hans Hagen
  2013-09-12 14:32       ` Alan BRASLAU
  0 siblings, 1 reply; 24+ messages in thread
From: Hans Hagen @ 2013-09-05 13:55 UTC (permalink / raw)
  To: ntg-context

On 9/4/2013 7:55 PM, Thangalin wrote:
> Hi.
>
>     of course we could alternatively export all as <div
>     class="tag-subtag-..."> but i don't like that too much; html itself
>     is not rich enough for our purpose
>
> What about giving developers the ability to change the destination
> element? For example:
>
>     \setuplist[chapter][
>        xml={\starttag[h1]#1\stoptag}
>     ]
>
> Would produce, upon export:
>
>     <h1>Chapter</h1>

export doesn't happen at that level; something like that would add an 
ugly overhead; it's way easier to make some xslt script that converts 
the rather systematic export to something like that and it only has to 
be written once by someone (not me)
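
The postprocessing step described here is suggested as an xslt script; the same idea can be sketched in Python. This is only an illustration under assumptions: the element names in the sample (document, section, sectiontitle) and the use of a "detail" attribute as the subtag are assumptions about what the export looks like, and the function name `to_div` is hypothetical. It maps every element to a <div> whose class is built from the tag and, if present, the detail value.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if any."""
    return tag.rsplit("}", 1)[-1]

def to_div(elem):
    """Return a copy of `elem` where every element becomes a <div>.

    The class is 'tag' or 'tag-detail', so the original structure stays
    recoverable from the class names. Other attributes are dropped.
    """
    classes = [local_name(elem.tag)]
    detail = elem.get("detail")  # assumed attribute carrying the subtag
    if detail:
        classes.append(detail)
    div = ET.Element("div", {"class": "-".join(classes)})
    div.text = elem.text
    div.tail = elem.tail
    for child in elem:
        div.append(to_div(child))
    return div

# Hypothetical export fragment, made up for illustration.
SAMPLE = (
    '<document>'
    '<section detail="chapter"><sectiontitle>Chapter</sectiontitle></section>'
    '</document>'
)

converted = ET.tostring(to_div(ET.fromstring(SAMPLE)), encoding="unicode")
```

A real transformation would also need to emit the html/head/body wrapper and decide which elements become spans rather than divs; that is left out of this sketch.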

> Or (using "export" instead of "xml"; I don't care what it is named):
>
>     \setuplist[chapter][
>        export={\starttag[div]\startattribute[class]{chapter}#1\stopattribute\stoptag}
>     ]
>
> Similarly, this would produce:
>
>     <div class="chapter">Chapter</div>

you use some tex syntax but it all happens in lua; also, the only way to 
provide some kind of different tagging is to support plugins (read: lua 
functions) that could override default behaviour (but again, it's quite 
easy to do that as a postprocessing step)

> This would offer the flexibility of custom XML documents without
> affecting the default behaviour.
>
>            * Generates XHTML headers (including <!DOCTYPE and <html...>)
>
>     not needed as we're 'standalone'
>
> Having the ability to produce the <!DOCTYPE...> and <html> elements
> could be as simple as:
>
>     \setupexport[
>        standalone=no,
>     ]
>
>            * Produces images as img tags, rather than float tags.
>
>     the css can deal with them (info is written to files for that)
>
> Yes, but they aren't standard. There is an ecosystem of tools (e.g.,
> Calibre, normalizing CSS templates, etc.), not to mention a widespread
> knowledge-base, that groks the minimal XHTML specification. Plus, using
> XML tags that are not in the minimal XHTML spec. means more testing on
> more devices to make sure that their XHTML parsers render correctly.

most of the xml we get here is a funny mix of whatever tags and html
(often for tables) and normally there is way more structure than in the
average html document; the export is meant to be close to the source and
turning it into some html / div mixture makes it messy

for instance, we have more levels than H1..H6, so how to do H7? if 
someone has to deal with that, he/she can as well transform all into H1 
with some class which is a local solution then
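
The heading problem can be made concrete with a small Python sketch. It is one possible local choice, not what the export does: the function name `heading_for` and the "level-N" class naming are hypothetical. The real depth goes into the class, and the tag is clamped to whatever maximum the target format allows.

```python
def heading_for(level, max_level=6):
    """Map a section depth to an (html tag, css class) pair.

    Depths beyond `max_level` reuse the deepest available tag; the
    class always records the real depth so css can still tell them apart.
    """
    if level < 1:
        raise ValueError("heading level must be >= 1")
    tag = "h%d" % min(level, max_level)
    return tag, "level-%d" % level
```

With the default, depth 7 becomes an h6 with class "level-7"; with `max_level=1` everything becomes h1 plus a class, which is the variant mentioned above.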

>     xhtml has no typical tags .. it's xml + css (or xslt) ...
>     unfortunately browsers have
>
> That is, a Strictly Conforming XHTML Document, as per:
>
> http://www.w3.org/TR/2000/REC-xhtml1-20000126/#docconf
>
>     the export of context is in fact just xml, and by tagging it as
>     xhtml we can apply css to it; but if someone has a workflow for
>     producing epub an option is to postprocess that xml file into
>     whatever epub one wants

indeed. that was the idea: export xml, tag it as xhtml (with the option 
to provide hyperlinks, an exception), provide some standard css as 
starter and then let users deal with matters the way they like; you can 
be pretty sure that what you want is not the same as what someone else 
wants; and if more people want it, they can together write a 
transformation script (or hire someone)

keep in mind that the export itself is already tricky enough and for me
it doesn't pay off to provide tons of additional functionality (well, it
doesn't pay off to export anyway)

> I could transform the ConTeXt-generated XML into strictly conforming
> XHTML, but it was a step I was hoping to avoid. Right now my process is:
>
>  1. Convert XML data to a ConTeXt .tex file.
>  2. Convert ConTeXt to either PDF or EPUB.
>  3. Stylize EPUB using CSS.

but writing the transform that suits you is just one step (with you
spending the time on it) while extending the export into a complete
transformation and configuration thing would put the burden on me -)

> I want to use ConTeXt here (instead of going directly from XML data to
> EPUB) because ConTeXt provides functionality such as multiple indexes,
> table-of-contents, and bundling the .epub. Having an extra step to
> generate strictly conforming XHTML is architecturally painful as it
> means transforming the document three times (XML -> ConTeXt, ConTeXt ->
> XML, then XML -> XHTML).

why is it painful? the export is quite generic and will not change; it
is also flexible as it honors user defined sectioning and styling

>     Every time we look into epub there's another issue ... it's not a
>     standard but a reverse-engineered application mess (happens often
>     with xml: turn some application data structures into xml and call it
>     a standard)
>
>
> Some book vendors only accept validating EPUBs. ConTeXt is documented as
> being able to generate EPUBs. The documentation should state the EPUBs
> do not validate and do not generate strictly conforming XHTML.

well, i, luigi and some others did tests: the thing is that epub is
evolving and we had quite some conflicting validations (and specs) and
we try to adapt as well as possible

so you need to be more precise in "doesn't validate": it's proper xml 
and therefore proper xhtml (and nothing says that there should be html 
tags)

> I have spent the last three weeks converting documents from LaTeX to
> ConTeXt because the documentation stated that ConTeXt can produce EPUBs.
> While true, the documentation did not mention its shortcomings. Had I
> known in advance, I probably would have gone straight to EPUB using Java
> or, with a little revulsion, PHP classes. ;-) That said, I probably
> should have tested this feature sooner. :-)

the export is a reconstruction of the input, and the more structure the
better; if you really need multiple output formats, you should use xml
as source and then use context for pdf creation and xslt for html creation

i really see no problem with a transformation from the generic export to 
some epub (whatever variant your whatever device supports) ... really: 
you cannot expect me to provide an extensive configurable export system 
(for only one user) that will never suit all users so ... also, 
configuring it for some document is probably as much work as writing an 
xslt transformation

>     as i have no real use/demand for epub it's not something i look into
>     on a daily basis
>
>
> How can I help resolve these issues?
>
> Merely "testing" (which I am happy to do) isn't going to produce a
> strictly conforming XHTML document.

indeed it isn't producing an html document (with properly matched tags) 
but i'm not convinced that it isn't xhtml

Hans


* Re: EPUB XHTML Format
  2013-09-04  9:20 ` Hans Hagen
  2013-09-04 17:55   ` Thangalin
@ 2013-09-05 16:38   ` Hans Hagen
  2013-09-05 16:57     ` Thangalin
  2013-09-05 17:22     ` Aditya Mahajan
  1 sibling, 2 replies; 24+ messages in thread
From: Hans Hagen @ 2013-09-05 16:38 UTC (permalink / raw)
  To: ntg-context

On 9/4/2013 11:20 AM, Hans Hagen wrote:

> you get a representation in xml indeed, but not verbatim, but as close
> as possible to the generic (parent) structure elements in context

probably the most straightforward xhtml export is a file with only

<div class="section" ...>
	<div class="..." ...>
         <div>
</div>

i.e. only divs and spans



* Re: EPUB XHTML Format
  2013-09-05 16:38   ` Hans Hagen
@ 2013-09-05 16:57     ` Thangalin
  2013-09-05 17:57       ` Khaled Hosny
  2013-09-05 17:22     ` Aditya Mahajan
  1 sibling, 1 reply; 24+ messages in thread
From: Thangalin @ 2013-09-05 16:57 UTC (permalink / raw)
  To: mailing list for ConTeXt users


[-- Attachment #1.1: Type: text/plain, Size: 952 bytes --]

Hi,

<div class="section" ...>
>         <div class="..." ...>
>         <div>
> </div>
>
> i.e. only divs and spans


I think that would be a more robust output format, technically, easier to
adapt, and more readily conform to the strict XHTML tag subset.

The other issue I encountered was this:

\startfrontmatter
  \startstandardmakeup
    Title page
  \stopstandardmakeup

  \startstandardmakeup
    Copyright
  \stopstandardmakeup

  \completecontent
\stopfrontmatter


This produced "*Title pageCopyright*" as text without any markup, which
makes the EPUB output a bit difficult to parse. I thought the software
should output something like:

<div class="frontmatter">
  <div id="standardmakeup1" class="standardmakeup">Title page</div>
  <div id="standardmakeup2" class="standardmakeup">Copyright</div>
  <div class="contents"><!-- etc... --></div>
</div>


This way the title and copyright pages can be styled independently.

Kindest regards.

[-- Attachment #1.2: Type: text/html, Size: 3341 bytes --]


* Re: EPUB XHTML Format
  2013-09-05 16:38   ` Hans Hagen
  2013-09-05 16:57     ` Thangalin
@ 2013-09-05 17:22     ` Aditya Mahajan
  2013-09-05 18:21       ` Hans Hagen
  1 sibling, 1 reply; 24+ messages in thread
From: Aditya Mahajan @ 2013-09-05 17:22 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Thu, 5 Sep 2013, Hans Hagen wrote:

> On 9/4/2013 11:20 AM, Hans Hagen wrote:
>
>> you get a representation in xml indeed, but not verbatim, but as close
>> as possible to the generic (parent) structure elements in context
>
> probably the most straightforward xhtml export is a file with only
>
> <div class="section" ...>
> 	<div class="..." ...>
>        <div>
> </div>
>
> i.e. only divs and spans

How easy is it to create a new export format? IIRC, context keeps track of
the entire document tree, and flushes the XML output only at the end. Is
it possible to make this pluggable so that users can write their own
transformers (in lua) on how the document tree can be written? This will
enable more output formats (opendocument and (shudder) latex).

Aditya

* Re: EPUB XHTML Format
  2013-09-05 16:57     ` Thangalin
@ 2013-09-05 17:57       ` Khaled Hosny
  2013-09-05 18:22         ` Hans Hagen
  0 siblings, 1 reply; 24+ messages in thread
From: Khaled Hosny @ 2013-09-05 17:57 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Thu, Sep 05, 2013 at 09:57:59AM -0700, Thangalin wrote:
> Hi,
> 
> <div class="section" ...>
> >         <div class="..." ...>
> >         <div>
> > </div>
> >
> > i.e. only divs and spans
> 
> 
> I think that would be a more robust output format, technically, easier to
> adapt, and more readily conform to the strict XHTML tag subset.

What about accessibility? I expect that visually impaired people would
depend on document structure rather than its visualisation.

Regards,
Khaled

* Re: EPUB XHTML Format
  2013-09-04  1:19 EPUB XHTML Format Thangalin
  2013-09-04  9:20 ` Hans Hagen
@ 2013-09-05 18:11 ` honyk
       [not found] ` <00b501ceaa63$61805e50$24811af0$@tosovsky@email.cz>
  2 siblings, 0 replies; 24+ messages in thread
From: honyk @ 2013-09-05 18:11 UTC (permalink / raw)
  To: 'mailing list for ConTeXt users'

On 2013-09-04 Thangalin wrote:
> 
> What needs to happen to take a minimal ConTeXt file (such as the
> attached) to produce a minimum viable EPUB that:
> 

It is always difficult to parse and further process not well structured
plain text without advanced semantics. Garbage in, garbage out.

If you need both EPUB and PDF, start with a semantically rich XML
vocabulary, e.g. DocBook. In this case you can relatively easily transform
(XSLT) input data into almost any format. These basic outputs like EPUB or
PDF (via XSL-FO) you can get out-of-the-box. The ConTeXt output can be
generated using dbcontext: http://dblatex.sourceforge.net/

In sum, use XML as your primary source and from it derive everything else.

Jan


* Re: EPUB XHTML Format
       [not found] ` <00b501ceaa63$61805e50$24811af0$@tosovsky@email.cz>
@ 2013-09-05 18:20   ` Aditya Mahajan
  2013-09-05 18:24     ` Hans Hagen
  2013-09-05 22:00     ` Thangalin
  0 siblings, 2 replies; 24+ messages in thread
From: Aditya Mahajan @ 2013-09-05 18:20 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Thu, 5 Sep 2013, honyk wrote:

> On 2013-09-04 Thangalin wrote:
>>
>> What needs to happen to take a minimal ConTeXt file (such as the
>> attached) to produce a minimum viable EPUB that:
>>
> It is always difficult to parse and further process not well structured
> plain text without advanced semantics. Garbage in, garbage out.

The typical ConTeXt document has a lot of structure, and the XML export 
generates a well structured XML output. That can be directly used in most 
modern browsers that handle XML+CSS well. However, most (all?) EPUB 
readers don't. So, the question is asking if instead ConTeXt could 
generate a XHTML

> If you need both EPUB and PDF, start with a semantically rich XML
> vocabulary, e.g. DocBook. In this case you can relatively easily transform
> (XSLT) input data into almost any format. These basic outputs like EPUB or
> PDF (via XSL-FO) you can get out-of-the-box. The ConTeXt output can be
> generated using dbcontext: http://dblatex.sourceforge.net/
>
> In sum, use XML as your primary source and from it derive everything else.

I haven't used XML-only toolchains. Is it possible to handle:

- Automatic section numbering taking care of different conversions.
- Automatic index generation and sorting
- Inserting hyphenation points at the appropriate place in the generated
output (so that the browser can effectively rely on TeX's hyphenation
algorithm to do linebreaking).
- Convert TeX math to MathML.

The current ConTeXt XML export can translate a well formed ConTeXt
document into an XML document with the above features.

Aditya

* Re: EPUB XHTML Format
  2013-09-05 17:22     ` Aditya Mahajan
@ 2013-09-05 18:21       ` Hans Hagen
  0 siblings, 0 replies; 24+ messages in thread
From: Hans Hagen @ 2013-09-05 18:21 UTC (permalink / raw)
  To: ntg-context

On 9/5/2013 7:22 PM, Aditya Mahajan wrote:
> On Thu, 5 Sep 2013, Hans Hagen wrote:
>
>> On 9/4/2013 11:20 AM, Hans Hagen wrote:
>>
>>> you get a representation in xml indeed, but not verbatim, but as close
>>> as possible to the generic (parent) structure elements in context
>>
>> probably the most straightforward xhtml export is a file with only
>>
>> <div class="section" ...>
>>     <div class="..." ...>
>>     </div>
>> </div>
>>
>> i.e. only divs and spans
>
> How easy is it to create a new export format? IIRC, context keeps track
> of the entire document tree, and flushes the XML output only at the end.
> Is it possible to make this pluggable so that users can write their own
> transformers (in lua) on how the document tree can be written? This would
> enable more output formats (opendocument and (shudder) latex).

sure, but first i want to clean up some code (it's rather complex) ... 
in principle there is a document tree so one can plug into that; 
alternatively one can load the xml tree and mess with that (probably 
easier if we provide some styles for it)
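
To make the "load the xml tree and mess with that" route concrete, here 
is a minimal sketch in Python (not Lua or XSLT). It assumes the export is 
plain XML without namespaces; the element names in SAMPLE and the INLINE 
set are illustrative assumptions, not the actual export vocabulary. Every 
element is rewritten as a div or span whose class records the original 
element name:

```python
import xml.etree.ElementTree as ET

# Hypothetical set of export element names that should become <span>.
INLINE = {"highlight", "link"}

# Hypothetical, simplified stand-in for an exported document.
SAMPLE = (
    "<document><section><sectiontitle>Intro</sectiontitle>"
    "<paragraph>Some <highlight>text</highlight>.</paragraph>"
    "</section></document>"
)

def to_div_span(elem):
    """Return a copy of elem in which every tag is a div or span
    carrying the original element name in its class attribute.
    Original attributes are dropped in this sketch."""
    tag = "span" if elem.tag in INLINE else "div"
    out = ET.Element(tag, {"class": elem.tag})
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(to_div_span(child))
    return out

root = to_div_span(ET.fromstring(SAMPLE))
print(ET.tostring(root, encoding="unicode"))
```

The structure survives in the class attributes, so CSS can still address 
it, but the element names themselves no longer carry any meaning.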

Hans


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: EPUB XHTML Format
  2013-09-05 17:57       ` Khaled Hosny
@ 2013-09-05 18:22         ` Hans Hagen
  0 siblings, 0 replies; 24+ messages in thread
From: Hans Hagen @ 2013-09-05 18:22 UTC (permalink / raw)
  To: ntg-context

On 9/5/2013 7:57 PM, Khaled Hosny wrote:
> On Thu, Sep 05, 2013 at 09:57:59AM -0700, Thangalin wrote:
>> Hi,
>>
>>> <div class="section" ...>
>>>     <div class="..." ...>
>>>     </div>
>>> </div>
>>>
>>> i.e. only divs and spans
>>
>>
>> I think that would be a more robust output format, technically, easier to
>> adapt, and more readily conform to the strict XHTML tag subset.
>
> What about accessibility? I expect that visually impaired people would
> depend on document structure rather than its visualisation.

For that purpose I'd make a nice special doc. But the basic export has 
at least a structure similar to the original. (After all, it's one of 
the reasons why we *can do* an export.)

Hans



* Re: EPUB XHTML Format
  2013-09-05 18:20   ` Aditya Mahajan
@ 2013-09-05 18:24     ` Hans Hagen
  2013-09-05 19:54       ` Mica Semrick
  2013-09-05 21:15       ` Michael Hallgren
  2013-09-05 22:00     ` Thangalin
  1 sibling, 2 replies; 24+ messages in thread
From: Hans Hagen @ 2013-09-05 18:24 UTC (permalink / raw)
  To: ntg-context

On 9/5/2013 8:20 PM, Aditya Mahajan wrote:

> The typical ConTeXt document has a lot of structure, and the XML export
> generates a well structured XML output. That can be directly used in
> most modern browsers that handle XML+CSS well. However, most (all?) EPUB
> readers don't. So, the question is asking if instead ConTeXt could
> generate a XHTML

but how hard would it be to make an xslt transformation from 
context.export to epub variants (ok, at some point i can look into it 
but only if there is a robust standard and i have devices to test it on)

and indeed the quality of the source is important

Hans


* Re: EPUB XHTML Format
  2013-09-05 18:24     ` Hans Hagen
@ 2013-09-05 19:54       ` Mica Semrick
  2013-09-05 21:15       ` Michael Hallgren
  1 sibling, 0 replies; 24+ messages in thread
From: Mica Semrick @ 2013-09-05 19:54 UTC (permalink / raw)
  To: mailing list for ConTeXt users



I'd say use an xml source (docbook, TEI, or DITA) and then write a ConTeXt
stylesheet to typeset your XML. See http://wiki.contextgarden.net/TEI_xml

I think that TEI-lite is a nice, very general XML vocabulary...

Best,
Mica



* Re: EPUB XHTML Format
  2013-09-05 18:24     ` Hans Hagen
  2013-09-05 19:54       ` Mica Semrick
@ 2013-09-05 21:15       ` Michael Hallgren
  1 sibling, 0 replies; 24+ messages in thread
From: Michael Hallgren @ 2013-09-05 21:15 UTC (permalink / raw)
  To: ntg-context

Le 05/09/2013 20:24, Hans Hagen a écrit :
> On 9/5/2013 8:20 PM, Aditya Mahajan wrote:
>
>> The typical ConTeXt document has a lot of structure, and the XML export
>> generates a well structured XML output. That can be directly used in
>> most modern browsers that handle XML+CSS well. However, most (all?) EPUB
>> readers don't. So, the question is asking if instead ConTeXt could
>> generate a XHTML
>
> but how hard would it be to make an xslt tranformation from
> context.export to epub variants (ok, at some point i can look into it
> but only if there is a robust standard and i have devices to test it on)
>
> and indeed the quality of the source is important


Sounds by far to be the cleanest approach.

Cheers,

mh

>
> Hans

* Re: EPUB XHTML Format
  2013-09-05 18:20   ` Aditya Mahajan
  2013-09-05 18:24     ` Hans Hagen
@ 2013-09-05 22:00     ` Thangalin
  2013-09-06 16:09       ` Hans Hagen
  2013-09-06 16:36       ` Mica Semrick
  1 sibling, 2 replies; 24+ messages in thread
From: Thangalin @ 2013-09-05 22:00 UTC (permalink / raw)
  To: mailing list for ConTeXt users



Hi,

> handle XML+CSS well. However, most (all?) EPUB readers don't. So, the
> question is asking if instead ConTeXt could generate a XHTML


Precisely.


> If you need both EPUB and PDF, start with a semantically rich XML
> vocabulary, e.g. DocBook. In this case you can relatively easily transform
>
My database doesn't generate DocBook. It generates a custom XML document
from which I generate a web page, and a LaTeX document (though soon to be
ConTeXt!). There is no reason, technically, why I cannot convert the source
XML to either DocBook or directly to EPUB. There are, however, problems
doing that, which Aditya correctly surmises:


> - Automatic section numbering taking care of different conversions.
> - Automatic index generation and sorting
> - Inserting hyphenation points at the appropriate place in the generated
> output (so that the browser can effectively rely on TeX's hyphenation
> algorithm to do line-breaking).
> - Convert TeX math to MathML.
>
> The current ConTeXT XML source can translate a well formed ConTeXt
> document into a XML document with the above features.


Those are exactly the issues that I would love to resolve using ConTeXt for
generating an EPUB. (The MathML isn't as important to me, but I can see
other people wanting such a feature.)

> What about accessibility? I expect that visually impaired people would
> depend on document structure rather than its visualisation.


That is a good point. The current XML structure produced by ConTeXt (Hans,
correct me here if I'm mistaken) is not accessible, as it doesn't adhere to
strict XHTML. I suspect that <div> tags would not be accessible -- the only
way to provide true accessibility in EPUB format would be by using the
strict XHTML tags.

> for instance, we have more levels than H1..H6, so how to do H7? if someone
> has to deal with that, he/she can as well transform all into H1 with some
> class which is a local solution then


I realize there is not going to be a one-to-one map of all possible ConTeXt
macros to XHTML. Someone who has 7 levels of nested sections would either
have to rewrite some Lua or perform some post-processing (e.g., with
XSLT). I would posit that a document with 7 levels of nested sections is
not going to be a common occurrence.

When I talk about strict XHTML, I'm proposing that a _simple_ ConTeXt
document (up to 6 header levels, numbered and unnumbered lists, images,
text emphasis, etc.) should generate a simple, validating XHTML document.
Trying to attain 100% coverage of ConTeXt transmogrification to XHTML is
ridiculous when, I suspect, 80% coverage would meet most needs. :-)
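
To illustrate the kind of simple mapping I have in mind, here is a rough
sketch in Python. The input element names (section, title, paragraph) are
hypothetical placeholders rather than what ConTeXt actually exports, and
falling back to h6 with a class for deeper levels is just one possible
choice (mapping everything to h1 with a class, as suggested above, is
another):

```python
import xml.etree.ElementTree as ET

def heading_for(depth):
    """Map a 1-based section nesting depth to an XHTML heading element."""
    if depth <= 6:
        return ET.Element(f"h{depth}")
    # XHTML has no h7: reuse h6 and keep the real depth in a class.
    return ET.Element("h6", {"class": f"level-{depth}"})

def convert(section, depth=1):
    """Convert an assumed <section><title/>...</section> tree to XHTML."""
    out = ET.Element("div", {"class": "section"})
    for child in section:
        if child.tag == "title":
            heading = heading_for(depth)
            heading.text = child.text
            out.append(heading)
        elif child.tag == "section":
            out.append(convert(child, depth + 1))
        elif child.tag == "paragraph":
            ET.SubElement(out, "p").text = child.text
        # Anything else is silently skipped in this sketch.
    return out

src = (
    "<section><title>A</title><paragraph>Text.</paragraph>"
    "<section><title>B</title></section></section>"
)
print(ET.tostring(convert(ET.fromstring(src)), encoding="unicode"))
```

Anything outside the small supported subset is dropped here; a real
converter would need to report or pass through what it cannot map.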

It is definitely possible to translate the ConTeXt EPUB output to XHTML.
However, there are practical realities that hinder such an approach.
Architecturally, if anyone is going to translate an XML document to EPUB
format, it certainly won't be this way:

*XML + XSLT -> ConTeXt File -> ConTeXt EPUB XML + XSLT -> EPUB + CSS*

It'll be this way, which is less time-consuming, less complex, and less
error-prone:

*XML + XSLT (or API) -> EPUB + CSS*

However, it does not, as we all know, produce output as feature-rich as
one that leverages the ConTeXt abilities Aditya mentioned, which was the point:

*XML + XSLT -> ConTeXt TeX -> EPUB + CSS*

Kindest regards.


* Re: EPUB XHTML Format
  2013-09-05 22:00     ` Thangalin
@ 2013-09-06 16:09       ` Hans Hagen
  2013-09-06 16:36       ` Mica Semrick
  1 sibling, 0 replies; 24+ messages in thread
From: Hans Hagen @ 2013-09-06 16:09 UTC (permalink / raw)
  To: ntg-context

On 9/6/2013 12:00 AM, Thangalin wrote:

> That is a good point. The current XML structure produced by ConTeXt
> (Hans correct me here if I'm mistaken) is not accessible, as it doesn't
> adhere to strict XHTML. I suspect that <div> tags would not be
> accessible -- the only way to provide true accessibility in EPUB format
> would be by using the strict XHTML tags.

html is not rich enough .. one ends up abusing tags, which in turn 
is confusing for accessibility ... i once saw an epub where h1 was used 
for the chapter number and h2 for the chapter title

> When I talk about strict XHTML, I'm proposing that a _simple_ ConTeXt
> document (up to 6 header levels, numbered and unnumbered lists, images,
> text emphasis, etc.) should generate a simple, validating XHTML
> document. Trying to attain 100% coverage of ConTeXt transmogrification
> to XHTML is ridiculous when, I suspect, 80% coverage would meet most
> needs.. :-)

in that case a transformation of a few pages could do, couldn't it?

> *XML + XSLT -> ConTeXt TeX -> EPUB + CSS*

probably ok for novels but there is no way to limit the user ... so 
in the end we still have a complex mix to deal with ... i'd rather have

ConTeXt TeX reading xml -> export -> optional transform -> EPUB + CSS

you want 'direct epub html from context' (no xslt) but on the other hand 
use xslt to map onto context while context can do xml directly ... 
chicken and egg

Hans


* Re: EPUB XHTML Format
  2013-09-05 22:00     ` Thangalin
  2013-09-06 16:09       ` Hans Hagen
@ 2013-09-06 16:36       ` Mica Semrick
  2013-09-06 20:20         ` Thangalin
  1 sibling, 1 reply; 24+ messages in thread
From: Mica Semrick @ 2013-09-06 16:36 UTC (permalink / raw)
  To: mailing list for ConTeXt users



Another small note, since I just walked down the ePUB path: you'll be very
sad to find out that the rendering engines of a lot of popular readers are not
consistent and won't render standard XHTML markup correctly (nest an ordered
list within an unordered list and then look at it in Adobe Digital Editions
and several other readers). "But it is just XHTML + CSS!" you'll cry, "How
can they not render it correctly?" I don't know, but it was an extremely
frustrating process. I even contacted Adobe to try and report this nested
list bug to them... their suggestion was that I could *pay* them to work
with "content experts" who would help me "correct" my source so that it
would render "correctly."

The best reader imho is iBooks on the iPad; nothing else, from what I've
seen, comes close. But that is one expensive eReader. :(



* Re: EPUB XHTML Format
  2013-09-06 16:36       ` Mica Semrick
@ 2013-09-06 20:20         ` Thangalin
  2013-09-06 21:22           ` Thangalin
  2013-09-07 12:07           ` Hans Hagen
  0 siblings, 2 replies; 24+ messages in thread
From: Thangalin @ 2013-09-06 20:20 UTC (permalink / raw)
  To: mailing list for ConTeXt users



Hi,

> The best reader imho is iBooks on the iPad, nothing else, from what I've
> seen, comes close. But that is one expensive eReader. :(

We'll just have everybody in the world who has a Kindle, Kobo, or other
reader exchange their existing hardware, and then purchase an iPad plus
iBooks. Problem solved? ;-)

> ConTeXt TeX reading xml -> export -> optional transform -> EPUB + CSS
> you want 'direct epub html from context' (no xslt) but on the other hand
> use xslt to map onto context while context can do xml directly ... chicken
> egg


Well, given that ConTeXt doesn't actually produce validating EPUB
documents, I suspect not many people will actually use that feature. It's
great in theory, but if it produces books that don't actually work on the
Kindle or Kobo, then it's unusable in practice -- never mind not being able
to add the books to online marketplaces (such as Amazon) because, again,
the output does not validate.

Kind regards.


* Re: EPUB XHTML Format
  2013-09-06 20:20         ` Thangalin
@ 2013-09-06 21:22           ` Thangalin
  2013-09-06 21:27             ` Aditya Mahajan
  2013-09-07 12:07           ` Hans Hagen
  1 sibling, 1 reply; 24+ messages in thread
From: Thangalin @ 2013-09-06 21:22 UTC (permalink / raw)
  To: mailing list for ConTeXt users



Hi,

> never mind not being able to add the books to online marketplaces (such as
> Amazon) because, again, the output does not validate.

I think the simplest thing to do would be to update the wiki and have a
note that informs readers that while ConTeXt can be used to generate an
EPUB, it is likely that that EPUB will be unusable on devices without
further transformation of the XML content. At least that way the knowledge
is out there and people are forewarned that not all EPUB documents are
equivalent.

Kindest regards.


* Re: EPUB XHTML Format
  2013-09-06 21:22           ` Thangalin
@ 2013-09-06 21:27             ` Aditya Mahajan
  0 siblings, 0 replies; 24+ messages in thread
From: Aditya Mahajan @ 2013-09-06 21:27 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Fri, 6 Sep 2013, Thangalin wrote:

> Hi,
>
> never mind not being able to add the books to online marketplaces (such as
>> Amazon) because, again, the output does not validate.
>>
>
> I think the simplest thing to do would be to update the wiki and have a
> note that informs readers that while ConTeXt can be used to generate an
> EPUB, it is likely that that EPUB will be unusable for devices without
> further transformation of the XML content. At least that way the knowledge
> is out there and people are forewarned that not all EPUB documents are
> equivalent.

It would also be nice to add a table that lists the EPUB readers (hardware 
and software) and says whether ConTeXt-produced EPUB documents work on 
them.

Aditya

* Re: EPUB XHTML Format
  2013-09-06 20:20         ` Thangalin
  2013-09-06 21:22           ` Thangalin
@ 2013-09-07 12:07           ` Hans Hagen
  2013-09-07 18:31             ` Thangalin
  1 sibling, 1 reply; 24+ messages in thread
From: Hans Hagen @ 2013-09-07 12:07 UTC (permalink / raw)
  To: ntg-context

On 9/6/2013 10:20 PM, Thangalin wrote:
> Hi,
>
>     The best reader imho is iBooks on the iPad, nothing else, from what
>     I've seen, comes close. But that is one expensive eReader. :(
>
>
> We'll just have everybody in the world who has a Kindle, Kobo, or other
> reader exchange their existing hardware, and then purchase an iPad plus
> iBook. Problem solved? ;-)
>
>     ConTeXT TeX reading xml -> export -> optional transform -> EPUB + CSS*
>     you want 'direct epub html from context' (no xslt) but on the other
>     hand use xslt to map onto context while context can do xml directly
>     ... chicken egg
>
>
> Well, given that ConTeXt doesn't actually produce validating EPUB
> documents, I suspect not many people will actually use that feature.
> It's great in theory, but if it produces books that don't actually work
> on the Kindle or Kobo, then it's unusable in practice -- never mind not
> being able to add the books to online marketplaces (such as Amazon)
> because, again, the output does not validate.

context doesn't produce epub (which at this moment is so much in flux that i 
would have to keep updating, which is fine if i'd use it myself or in projects 
at pragma, but not for the sake of keeping up) but does an export to xml 
(*.export)

as a bonus it can output some extra stuff so that in a browser that can 
deal with xml+css (and a few xhtml tags for hyperlinks) we can preview

then there is mtx-epub that can make an epub but that is a moving target 
(at some point we stopped extending, waiting for a decent standard)

so, i'd never claim that context produces epub but it can be used in a 
workflow that involves epub as it outputs xml which can be transformed
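
as a sketch of the last step in such a workflow (python, and assuming 
some earlier step has already transformed the exported xml into xhtml 
body markup): wrapping that markup in a minimal xhtml 1.1 document with 
the doctype and namespace. the function name and the sample body are 
hypothetical, and this alone does not make a valid epub, which also needs 
its packaging files

```python
from xml.sax.saxutils import escape

# Public identifier and DTD location of XHTML 1.1.
DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" '
    '"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">'
)

def wrap_xhtml(title, body_markup):
    """Wrap already-transformed body markup in an XHTML 1.1 skeleton.

    body_markup is inserted as-is, so it must already be well-formed;
    only the title is escaped here.
    """
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f"{DOCTYPE}\n"
        '<html xmlns="http://www.w3.org/1999/xhtml">\n'
        f"<head><title>{escape(title)}</title></head>\n"
        f"<body>\n{body_markup}\n</body>\n"
        "</html>\n"
    )

print(wrap_xhtml("Test", "<p>Hello</p>"))
```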

supporting all variants of epub in the backend would be the same as 
hardcoding all kinds of xml dtds in the frontend (docbook, tei, whatever); 
instead we provide a general xml handler and a general xml export

Hans


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________



* Re: EPUB XHTML Format
  2013-09-07 12:07           ` Hans Hagen
@ 2013-09-07 18:31             ` Thangalin
  0 siblings, 0 replies; 24+ messages in thread
From: Thangalin @ 2013-09-07 18:31 UTC (permalink / raw)
  To: mailing list for ConTeXt users



Hi,

> so, i'd never claim that context produces epub but it can be used in a
> workflow that involves epub as it outputs xml which can be transformed

That's a distinction that either might not matter or sometimes is lost:

http://tex.stackexchange.com/a/17642/2148
http://wiki.contextgarden.net/epub
"ConTeXt has preliminary epub <http://en.wikipedia.org/wiki/EPUB> support..."

Does ConTeXt refer to a suite of tools, or only the "context" command?
Either way, it appears that the line between the command and the tool set
is blurred a bit. This is completely understandable, too, as you wouldn't
want to write, "the ConTeXt suite of tools includes a command, mtxrun, that
can produce EPUB files" all the time when talking about EPUBs.


> supporting all variants of epub in the backend would be the same as
> hardcoding all kind of xml dts in the frontend (docbook, tei, whatever);
> instead we provide a general xml handler and a general xml export


That paragraph would be an excellent addition to the wiki; not sure where
though.

Kind regards.


* Re: EPUB XHTML Format
  2013-09-05 13:55     ` Hans Hagen
@ 2013-09-12 14:32       ` Alan BRASLAU
  0 siblings, 0 replies; 24+ messages in thread
From: Alan BRASLAU @ 2013-09-12 14:32 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Thu, 5 Sep 2013 19:22:42
Aditya Mahajan <adityam@umich.edu> wrote:

> How easy is it to create a new export format? IIRC, context keeps track of
> the entire document tree, and flushes the XML output only at the end. Is
> it possible to make this pluggable so that users can write their own
> transformers (in Lua) that control how the document tree is written? This
> will enable more output formats (opendocument and (shudder) latex).

Or, (gasp!) MS Word .docx

Alan

end of thread, other threads:[~2013-09-12 14:32 UTC | newest]

Thread overview: 24+ messages
2013-09-04  1:19 EPUB XHTML Format Thangalin
2013-09-04  9:20 ` Hans Hagen
2013-09-04 17:55   ` Thangalin
2013-09-05 13:55     ` Hans Hagen
2013-09-12 14:32       ` Alan BRASLAU
2013-09-05 16:38   ` Hans Hagen
2013-09-05 16:57     ` Thangalin
2013-09-05 17:57       ` Khaled Hosny
2013-09-05 18:22         ` Hans Hagen
2013-09-05 17:22     ` Aditya Mahajan
2013-09-05 18:21       ` Hans Hagen
2013-09-05 18:11 ` honyk
     [not found] ` <00b501ceaa63$61805e50$24811af0$@tosovsky@email.cz>
2013-09-05 18:20   ` Aditya Mahajan
2013-09-05 18:24     ` Hans Hagen
2013-09-05 19:54       ` Mica Semrick
2013-09-05 21:15       ` Michael Hallgren
2013-09-05 22:00     ` Thangalin
2013-09-06 16:09       ` Hans Hagen
2013-09-06 16:36       ` Mica Semrick
2013-09-06 20:20         ` Thangalin
2013-09-06 21:22           ` Thangalin
2013-09-06 21:27             ` Aditya Mahajan
2013-09-07 12:07           ` Hans Hagen
2013-09-07 18:31             ` Thangalin
