From: Ingo Schwarze <schwarze@usta.de>
To: Kristaps Dzonsons <kristaps@bsd.lv>
Cc: discuss@mdocml.bsd.lv
Subject: Re: HTML5
Date: Sun, 10 Aug 2014 18:12:29 +0200
Message-ID: <20140810161229.GD325@iris.usta.de>
In-Reply-To: <53E75A07.2070907@bsd.lv>

Hi Kristaps,

Kristaps Dzonsons wrote on Sun, Aug 10, 2014 at 01:39:51PM +0200:

> I agree.  In short, unifying under HTML5 will simplify the code (no
> switching)--that much is clear.  I don't care whether it's HTML5 or
> asciidoc

I care that we stay away from asciidoc.  :-)

> so long as it gets the job done.  And for browsers, it's
> between flavours of HTML.  So let's consider why HTML5 instead of
> just HTML4 or XHTML1 as-is.
> First, note that the patch's HTML5 is called "polyglot" HTML5, which
> is to say, HTML5 with XML syntax.
> (Link: <http://dev.w3.org/html5/html-polyglot>.)
> A "pro" is that polyglot HTML5 has the same doctype *and content
> type* for its XHTML and HTML modes.  So we can create well-formed,
> parseable HTML5 mark-up using strict XML syntax, then serve it with
> text/html and be happily standards-compliant.  As it is, we put a
> burden on the agency serving -Thtml or -Txhtml pages to know the
> difference.
> The "con" is that by unifying -Thtml/xhtml as HTML5--or anything,
> really--we lose strict HTML4 callers of -Ofragment (XHTML1 callers
> would be fine).  The only caller right now is cgi.c, which
> stipulates HTML4. This can be fixed easily: remove the HTML4 parts
> of cgi.c's DOCTYPE and close the void img, meta, and link
> elements.  See the "pro" above for why that's also a smart idea.

Fixing that sounds easy indeed and does not seem to have a downside.

As far as i see, the only thing that's really messed up between
HTML 4.01 and XHTML (and hence polyglot HTML 5) is void elements.
Void elements have to be <br/> in polyglot.  That parses as
<br>&gt; in HTML 4.01 if i understand correctly, so strictly
speaking, a document that is supposed to be both valid polyglot 
and valid HTML 4.01 cannot use any void elements.  But i think
ignoring that detail and just shrugging our shoulders with respect
to the extra &gt; that HTML 4.01 parsing would output seems the
best we can do, and good enough.  It is not likely to become a
problem in practice, i think.
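
A minimal sketch of the void-element rule discussed above, assuming a
hypothetical helper open_tag() and only a subset of the HTML5 void
elements; it is not taken from the patch.  Void elements are written
in XML syntax with a trailing slash, all other elements get an
ordinary start tag:

```python
from html import escape

# Subset of the HTML5 void elements, for illustration only.
VOID_ELEMENTS = {"br", "hr", "img", "link", "meta"}


def open_tag(name, attrs=None):
    """Serialize a start tag in XML syntax; void elements self-close."""
    parts = [name]
    for key, value in (attrs or {}).items():
        # Escape attribute values so the output stays well-formed XML.
        parts.append('%s="%s"' % (key, escape(value, quote=True)))
    closer = "/>" if name in VOID_ELEMENTS else ">"
    return "<" + " ".join(parts) + closer


print(open_tag("br"))
print(open_tag("img", {"src": "a.png", "alt": "x"}))
print(open_tag("p"))
```

The trailing slash in the first two results is what a strict HTML 4.01
parser would read as an extra character after the tag.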

> Another "pro" is that we get eqn.  Ingo, this is the "feature"
> you're worried about.

Not really.  I didn't think about that.  What i meant is HTML 5
only syntax creeping in for stuff that can be rendered in HTML 4
because it looks a bit better or even just because it's more modern.

There is no sane way to render eqn in HTML 4; it's beyond the scope.
So a document using eqn will look crappy with an HTML 4 browser in
any case.  Whether that is because the document doesn't attempt to
render the eqn content at all, or renders it in terminal-style,
or emits MathML that the browser cannot handle makes no difference.
So just go ahead emitting MathML for eqn, i have no problem with that.
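
For illustration, a sketch of what embedded MathML for an eqn
expression such as "a over b" might look like.  The function name
mathml_fraction() is hypothetical, and the choice of <mi> for both
operands is an assumption (numbers would normally use <mn>):

```python
from html import escape

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def mathml_fraction(numerator, denominator):
    """Serialize `numerator over denominator` as an inline MathML fraction."""
    # The xmlns attribute keeps the markup usable in XML syntax as well.
    return (
        '<math xmlns="%s"><mfrac><mi>%s</mi><mi>%s</mi></mfrac></math>'
        % (MATHML_NS, escape(numerator), escape(denominator))
    )


print(mathml_fraction("a", "b"))
```

A browser without MathML support will typically show just the text
content of such an element.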

> And it's a pretty big issue for me (so this
> is a "for me"): all of my equations (in eqn, or really in
> DocBook--which I use with docbook2mdoc for some scientific
> applications--which I'd like to convert into eqn) are lost.  Also
> lost are LAPACK manuals, OpenGL, and a host of other eqn systems.
> If we stick to HTML4, we'd need to cripple ourselves with
> table-based equations.  If we stick with XHTML1, we need to jump
> through namespace hoops.  But with HTML5, we get embedded MathML.

Makes sense to me.

> The "con", yes, is that MathML is a scary feature, and it doesn't
> exist yet.  YET.
> At the end of the day, the browser doesn't really care whether it's
> HTML4, XHTML1, or HTML5.  It'll render regardless.  Callers of
> -Ofragment will care, but right now that's just us.  Ingo, you
> mentioned non-conforming browsers.  Care to point me to one that
> will puke on the HTML5 output from the patch?  Even lynx(1) can read
> that!

I have no specific browser in mind.  My remark was more about
syntax bloat, that is, using elaborate syntax because we can,
as opposed to because it is useful (like for eqn).

> If we're really at loggerheads over it, /adding/ HTML5 is as easy as
> another switch statement or two.  I think it's a good idea
> regardless of whether it's added or replacing for the reasons above.

If Anthony puts forward a good argument why he needs strict HTML 4.01 -
so far, i don't understand why - i guess that's the way to go.
Otherwise, just drop 4.01 and XHTML and call the polyglot output
close enough.

> ...on to other matters: the style-sheet.
> The status quo bugs me because of the header and footer table.
> These have hard-coded widths and alignments to make them look decent
> without a stylesheet.  This is inadvisable in any modern flavour of
> HTML.  At the very least, we should replace the "width" and "height"
> with embedded styles.  Unfortunately, that's a problem: inline styles
> can't be overridden without the "!important" qualifier in CSS, which
> is annoying.  (That's why I used the width attributes in the first
> place.)  So I think that putting just the table styles *before* the
> <link stylesheet /> is a good idea.

Makes perfect sense to me.
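
A sketch of that ordering, assuming hypothetical class names "head"
and "foot" for the two tables and an invented default rule.  Because
the embedded <style> comes first, a rule of equal specificity in the
linked stylesheet wins by document order, without "!important":

```python
from html import escape

# Hypothetical defaults for the header and footer tables; the class names
# and the property value are assumptions, not taken from the patch.
TABLE_DEFAULTS = "table.head, table.foot { width: 100%; }"


def head_lines(stylesheet_href):
    """Return <head> children with the embedded defaults before the link."""
    return [
        "<style>" + TABLE_DEFAULTS + "</style>",
        '<link rel="stylesheet" href="%s" type="text/css"/>'
        % escape(stylesheet_href, quote=True),
    ]


for line in head_lines("man.css"):
    print(line)
```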

> The question goes: if we're
> going to do that, is there anything else that should go there?

Right now, i don't see anything.  If anything crops up later,
with a reasoning as good as the above, it can be added later.

> The stylesheet I put in place does serve an important purpose: it
> prevents overriding styles.  In man(7) and mdoc(7), for example, you
> can't have overlapping styles.  E.g.,
> .Bf Sy
> Hi
> .Ar there
> .Ef
> The "there" doesn't have both styles: they reset when they're
> nested. (There are probably better ways to do it in CSS, but it
> needs to be done one way or another.)  Without some sort of
> style-sheet, font modes will be nested.  Yes, I'm in conflict with
> myself: on the one hand, mdoc(7) does this because consoles
> generally haven't supported overlapping fonts, and I don't like
> console holdovers in HTML output.  Or should we discard that
> convention?  (groff's -Tps does it too!)

Font-changing blocks that can contain elements or other blocks
are rare in mdoc(7) and do not exist in man(7) in the first place:

  .Sh .Ss HEAD (discouraged to contain elements)
  .Dl .Bd -literal

For .Dl and .Bd, we definitely *want* embedded elements to be
both literal (fixed-width) and italic or bold, respectively.

The .Bf block is almost never useful in the first place; when
do you ever need to embolden or italicise a whole block of text?
And if you do, embedded markup should probably add up, so the
"there" above *should* be bold and italic.

For -Tascii, i'll stay bug-compatible with groff.  But for -Thtml,
let's just do what makes sense.
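
A sketch of the "add up" behaviour for nested font macros.  The
macro-to-CSS mapping and the function combined_style() are
hypothetical; with plain nested HTML elements, CSS inheritance gives
the same result unless a stylesheet resets it:

```python
# Hypothetical mapping from mdoc(7) font macros to CSS declarations.
FONT_CSS = {
    "Sy": ("font-weight", "bold"),
    "Em": ("font-style", "italic"),
    "Ar": ("font-style", "italic"),
    "Li": ("font-family", "monospace"),
}


def combined_style(open_macros):
    """Merge the fonts of all enclosing macros, outermost first."""
    props = {}
    for macro in open_macros:
        if macro in FONT_CSS:
            prop, value = FONT_CSS[macro]
            # Inner macros only replace a property they set themselves.
            props[prop] = value
    return "; ".join("%s: %s" % item for item in props.items())


# .Bf Sy containing .Ar, as in the quoted example.
print(combined_style(["Sy", "Ar"]))
```

Under these assumptions, the "there" from the quoted example ends up
with both the bold and the italic declaration.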


Thread overview: 7+ messages
2014-08-09 23:33 HTML5 Kristaps Dzonsons
2014-08-10  2:23 ` HTML5 Ingo Schwarze
2014-08-10 11:39   ` HTML5 Kristaps Dzonsons
2014-08-10 16:12     ` Ingo Schwarze [this message]
2014-08-10  4:33 ` HTML5 Anthony J. Bentley
2014-08-10 15:17   ` HTML5 Ingo Schwarze
2014-08-10 17:38     ` HTML5 Anthony J. Bentley
