public inbox archive for pandoc-discuss@googlegroups.com
From: "S. Manning" <scriptor-aFO/2INALiozYggVrLCuDg@public.gmane.org>
To: Pandoc discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Side Effects from HTML to HTML conversion
Date: Mon, 12 Apr 2021 23:57:47 -0700
Message-ID: <40bf250d3cff42be22088054dc3fa618@ageofdatini.info>

I am still getting side effects when I use pandoc with HTML as both the 
input and the output format.  All I use pandoc for here is to take some 
variables and wrap the contents in header and footer code, with the 
variables inserted in the appropriate places.  Passages like the 
following in the input:

<figure>
	<a href="/images/2021/04/acme-widgets.jpg">
	<img src="/images/2021/04/acme-widgets.jpg" alt="a mysterious machine 
sticking out of a cardboard shipping box" />
	<figcaption aria-hidden="true">One of this proud company's most famous 
products, the type 37 widget ...</figcaption>
	</a>
</figure>

become like so in the output:

<figure>
<img src="/images/2021/04/acme-widgets.jpg" alt="One of this proud 
company's most famous products, the type 37 widget ..." /><figcaption 
aria-hidden="true">One of this proud company's most famous products, the 
type 37 widget ...</figcaption>
</figure>

I lose the <a> tag, and the contents of the alt attribute are replaced 
by the caption text.  Good alt text is not the same as a good caption: 
the caption tells you how to interpret the picture, while the alt text 
tells you what you would see if you could see the picture.  Is there any 
way to avoid these side effects?

If any of you can suggest a more appropriate tool than pandoc for my use 
case (take an HTML fragment and some metadata, then wrap the fragment in 
header and footer text, with some values from the metadata inserted, to 
create a valid HTML file), I will consider it.
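
To make the use case concrete, here is a minimal sketch of the wrapping 
step done without any HTML parsing, so the fragment cannot be rewritten 
on the way through.  It assumes Python and its standard-library 
string.Template; the PAGE template, the wrap_fragment function, and the 
"title" and "lang" metadata keys are all hypothetical names chosen for 
illustration, not part of pandoc or of my actual setup.

```python
from html import escape
from string import Template
from typing import Mapping

# Hypothetical page skeleton: $title and $lang come from the metadata,
# $body is replaced by the HTML fragment exactly as it was supplied.
PAGE = Template("""<!DOCTYPE html>
<html lang="$lang">
<head>
<meta charset="utf-8" />
<title>$title</title>
</head>
<body>
$body
</body>
</html>
""")


def wrap_fragment(fragment: str, metadata: Mapping[str, str]) -> str:
    """Wrap an HTML fragment in the header and footer of PAGE.

    Metadata values are HTML-escaped before insertion. The fragment is
    inserted verbatim: it is never parsed, so tags and attributes such
    as <a> and alt are left untouched.
    """
    values = {key: escape(value) for key, value in metadata.items()}
    values["body"] = fragment  # inserted as-is, deliberately not escaped
    # substitute() raises KeyError if the template needs a missing key.
    return PAGE.substitute(values)
```

Calling wrap_fragment(fragment, {"title": "Widgets", "lang": "en"}) would 
return the whole page as one string.  The trade-off is that nothing 
checks the fragment for validity, since it is treated as opaque text.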

