public inbox archive for pandoc-discuss@googlegroups.com
* Should the HTML writer generate empty '<span>' elements?
@ 2021-09-04 16:08 Gwern Branwen
From: Gwern Branwen @ 2021-09-04 16:08 UTC (permalink / raw)
  To: pandoc-discuss

While working on cleaning up & debugging my smallcaps/<wbr> code I
noticed again that Pandoc will compile empty Span Inline nodes to
empty HTML '<span></span>' (or just '<span />') elements. Is there any
reason Pandoc should do this?

Example:

    $ echo 'Span ("",[],[]) []' | pandoc -f native -w html
    <span></span>

(Hakyll seems to render that as a self-closed '<span />' somewhere in
the pipeline, but same thing.)

The main reason not to is that an empty Span appears to be
meaningless, since it cannot style or wrap any *span* of text. As
there is nothing in it, it cannot easily be styled by CSS. (You could
write CSS rules which affect it, but I think only by selecting on all
span elements by class or possibly position; that would seriously
screw up regular, useful spans, since whatever you did to the empty
span would affect them too, so I guess you'd have to negate every
single class or position you *didn't* want to affect.) Since I can
come up with no reasonable use, an empty span is probably an error by
the user or by Pandoc somewhere, which would be a good reason to warn
or error about it.
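
A filter-level version of the "warn about or suppress it" idea can be
sketched against Pandoc's JSON AST, where a Span is encoded as
{"t": "Span", "c": [[id, classes, attributes], [inlines]]}. The
function names below are hypothetical (not part of Pandoc), and the
stdin/stdout JSON plumbing that a real `--filter` needs is left out:

```python
def is_completely_empty_span(node):
    """True for a Span with no id, classes, attributes, or content."""
    if not (isinstance(node, dict) and node.get("t") == "Span"):
        return False
    (ident, classes, keyvals), inlines = node["c"]
    return ident == "" and not classes and not keyvals and not inlines


def drop_empty_spans(node, dropped):
    """Return a copy of `node` without completely empty Spans.

    Removed nodes are appended to `dropped` so the caller can warn
    about them. Children are cleaned first, so a Span that only
    wrapped empty Spans is removed as well.
    """
    if isinstance(node, list):
        kept = []
        for child in node:
            child = drop_empty_spans(child, dropped)
            if is_completely_empty_span(child):
                dropped.append(child)
            else:
                kept.append(child)
        return kept
    if isinstance(node, dict):
        return {key: drop_empty_spans(value, dropped)
                for key, value in node.items()}
    return node  # strings, numbers, booleans
```

Spans that carry an id, class, or attribute are kept, so this only
touches the "completely empty" case; whether those should be kept is
the open question of this thread.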

Further, it is possibly buggy. Empty spans appear to confuse and
enrage various tools (although I haven't found any statement that
they are outright illegal under any standard): the W3C HTML checker
is extremely unhappy about empty self-closed spans*; Firefox's
view-source parses them in strange ways; empty spans appeared to make
my auto-smallcaps render the entire following paragraph in smallcaps
as well, until a </p> implicitly closed the span run amok; ckeditor
removes empty spans by default
(https://github.com/ckeditor/ckeditor4/issues/2484); and HTML Tidy
both warns about & outright deletes empty spans when reformatting.**
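
The self-closing case can be checked independently of any validator.
A minimal sketch, assuming only Python's standard html.parser module:
it reports every '<tag />' written on an element that is not in the
HTML standard's list of void elements, which is the condition behind
the W3C checker's "Self-closing syntax" error. The class and function
names are made up for illustration.

```python
from html.parser import HTMLParser

# Void elements per the HTML standard; only these may be written as <tag />.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
}


class SelfClosingChecker(HTMLParser):
    """Collect (tag, line, column) for each '<tag />' on a non-void element."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_startendtag(self, tag, attrs):
        # html.parser calls this hook for XHTML-style '<tag ... />' syntax.
        if tag not in VOID_ELEMENTS:
            line, column = self.getpos()
            self.problems.append((tag, line, column))


def find_self_closed(html):
    """Return the list of self-closed non-void elements found in `html`."""
    checker = SelfClosingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```

With this, find_self_closed('<p><span /></p>') reports the span, while
'<span></span>' and '<br />' are not reported. It says nothing about
whether an empty but properly closed span is a problem.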

So, do any Pandoc users ever use completely empty spans? Empty spans
but with IDs or classes or attributes? Empty spans in any context
whatsoever? Can you come up with any legitimate reason that the HTML
writer should generate empty spans?

If not, then I'll file a bug about suppressing empty spans.

* The W3C validator errors for '<span />':

    Error: Self-closing syntax (/>) used on a non-void HTML element.
    Ignoring the slash and treating as a start tag.
    From line 152, column 1; to line 152, column 8
    d>↩<body>↩<span />↩</bod

    Error: End tag for body seen, but there were unclosed elements.
    From line 153, column 1; to line 153, column 7
    ↩<span />↩</body>↩</htm

    Error: Unclosed element span.
    From line 152, column 1; to line 152, column 8
    d>↩<body>↩<span />↩</bod

** Tidy output:

    $ echo 'Span ("",[],[]) []' | pandoc --standalone -f native -w html | tidy -errors --doctype html5
    [WARNING] This document format requires a nonempty <title> element.
      Defaulting to '-' as the title.
      To specify a title, use 'title' in metadata or --metadata title="...".
    line 2 column 1 - Warning: <html> attribute "lang" lacks value
    line 152 column 1 - Warning: trimming empty <span>
    Info: Document content looks like XHTML5
    Tidy found 2 warnings and 0 errors!

-- 
gwern
https://www.gwern.net

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CAMwO0gywhE%2Bu4wgaL0sieohjuqAm%2BVM2Lj489EsQST74FMqrMA%40mail.gmail.com.



* Re: Should the HTML writer generate empty '<span>' elements?
@ 2021-09-04 22:49   ` MarLinn
From: MarLinn @ 2021-09-04 22:49 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Possible use cases for an empty span that I can think of off the top of 
my head:

  * Clearfix
  * As a placeholder that will be filled/replaced by javascript
  * As a placeholder that will be filled/replaced by a pandoc filter
  * As an anchor for before/after pseudoelements
  * As a container for a background-image
  * As a piece in some CSS trickery to get fancy effects without images
  * To have the right count of elements for some table-like presentation
  * Because of symmetry with some other structure where there is content
    inside the span

There's probably a lot more, these are just some cases I thought of. Is 
an empty span always the /best/ solution to these problems? Probably 
not. But sometimes it is.

The most relevant as it relates to pandoc is probably the third one: in 
a pandoc filter. Because of that I would not want to see this feature be 
removed.

Now, should the span be self-closing? That's a different question. I 
suspect that /that's/ the main reason for many/all of the errors you're 
seeing, not the pure empty-ness. So that might be something worth 
changing. But as you're saying, that's a Hakyll problem, not a Pandoc 
problem.


Cheers.


* Re: Should the HTML writer generate empty '<span>' elements?
@ 2021-09-04 23:15       ` Gwern Branwen
From: Gwern Branwen @ 2021-09-04 23:15 UTC (permalink / raw)
  To: pandoc-discuss

On Sat, Sep 4, 2021 at 6:49 PM MarLinn <monkleyon-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> There's probably a lot more, these are just some cases I thought of.

Most of your cases appear to require IDs, classes, or attributes: #2,
#3, #4 (you're not going to just 'background image' *every* span in
the page, are you?), possibly #5; I'm not sure about #1 'clearfix' or
#7 'count of elements' (how does that work?), and #8 is too vague for
me to comment on.

> Is an empty span always the best solution to these problems? Probably not. But sometimes it is.

Have you run into an actual case? It seems like it must be quite rare
given how many tools already warn about or delete empty spans, and I
didn't find anyone complaining that they genuinely needed them because
all the alternatives were worse.

> The most relevant as it relates to pandoc is probably the third one: in a pandoc filter. Because of that I would not want to see this feature be removed.

Can you give an example of a Pandoc filter where an empty Span with
attributes, or a completely empty Span, would be useful? Also, why
would a filter care whether the compiled HTML has empty
'<span></span>'s in it? It'd be doing all its work before the HTML
writer is ever called. Any kind of tagging or marking is finished
well before.
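
To make the ordering concrete: a filter that treats an empty Span as
a placeholder sees it in the JSON AST and can fill it before any
writer runs. A sketch, with a hypothetical function name and a
hypothetical 'build-date' id (neither comes from Pandoc or from this
thread):

```python
def fill_placeholders(node, values):
    """Give each empty Span whose id is a key of `values` a Str child.

    `node` is any piece of a Pandoc JSON AST; `values` maps Span ids to
    replacement text. Everything else is copied unchanged.
    """
    if isinstance(node, list):
        return [fill_placeholders(child, values) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "Span":
            (ident, classes, keyvals), inlines = node["c"]
            if not inlines and ident in values:
                filled = [{"t": "Str", "c": values[ident]}]
                return {"t": "Span", "c": [[ident, classes, keyvals], filled]}
        return {key: fill_placeholders(value, values)
                for key, value in node.items()}
    return node
```

Called with {"build-date": "2021-09-04"}, an empty Span with that id
reaches the HTML writer already filled; an empty Span with no matching
id passes through untouched and would still be written out as
'<span></span>'.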

> Now, should the span be self-closing? That's a different question. I suspect that that's the main reason for many/all of the errors you're seeing, not the pure empty-ness. So that might be something worth changing. But as you're saying, that's a Hakyll problem, not a Pandoc problem.

Well, I should clarify that I *think* it's a Hakyll problem simply
because the standard Pandoc settings on the CLI and ghci don't seem to
produce self-closed tags but I get them in the final generated HTML; I
haven't looked into it in more detail to figure out where the
self-closing happens because I couldn't convince myself that writing
out empty spans is even desirable in the first place (so what happens
further downstream may be moot).

-- 
gwern
https://www.gwern.net


* Re: Should the HTML writer generate empty '<span>' elements?
@ 2021-09-05  0:51           ` MarLinn
From: MarLinn @ 2021-09-05  0:51 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

There are a lot of fun things we could talk about, but most of that is 
beside the point because I would be surprised if

A) a span that contains an ID and/or classes but no content and that is 
self-closing

B) an empty, self-closing div

don't produce the same errors in the validators that you tested as they 
did for a self-closing span.

