The trouble with using -r html or -f html is that this strips out the element, so I lose the formatting. That is, if I apply pandoc -r html -t html+smart to this:

Font names that have more than one word — like Trebuchet MS — need to be surrounded by quotes, for example "Trebuchet MS".

The outcome is just:

Font names that have more than one word — like Trebuchet MS — need to be surrounded by quotes, for example "Trebuchet MS".
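
One possible way around this, following the workaround mentioned further down the thread, is to keep pandoc's default markdown reader and remove the doctype declaration before pandoc sees it. This is only a sketch: it assumes the declaration is the very first thing in the input, and the function name strip_doctype is made up for illustration.

```shell
# Hypothetical helper (name invented for this sketch): drop a leading
# doctype declaration from stdin and pass everything else through.
# Assumes the declaration starts on line 1 and ends at the first ">".
strip_doctype() {
  awk '
    # A doctype on the first line starts the skipped region.
    NR == 1 && toupper($0) ~ /^<!DOCTYPE/ { skipping = 1 }
    # Keep skipping until the line that closes the declaration.
    skipping { if (index($0, ">")) skipping = 0; next }
    # Everything else is printed unchanged.
    { print }
  '
}
```

With that defined, the pipeline from the original question would become pbpaste | strip_doctype | pandoc -t html+smart | pbcopy. Whether the markdown reader then keeps every inline element intact is something to check against your own input.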

On Tuesday, December 28, 2021 at 11:26:40 AM UTC-5 tkur...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:

> If you don't specify an input format, pandoc assumes markdown input, and
> while markdown allows literal inclusions of HTML elements, it apparently
> doesn't allow DOCTYPE declarations, so it does not consider that to be
> HTML, and translates the angle brackets into character entities.
>
> $ echo '
>   1. Bogus
> ' | pandoc -t html
> &lt;!DOCTYPE html&gt;
>
>   1.
>      Bogus
>   2.
>
> However, if you add "-r html" everything is fine:
>
> $ echo '
>   1. Bogus
> ' | pandoc -r html -t html
>
>   1. Bogus
>   2.
>
> On Tue, Dec 28, 2021 at 11:19 AM phi...-97jfqw80gc6171pxa8y+qA@public.gmane.org
> wrote:
>
>> Thank you for your assistance! Indeed, I misread the situation, though
>> the outcome is still strange. The HTML I am starting with in my clipboard
>> is a complete document with a doctype declaration. The first line is:
>>
>> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "
>> http://www.w3.org/TR/html4/strict.dtd">
>>
>> Pandoc (pandoc -t html+smart) converts the angle brackets into HTML
>> entity names:
>>
>> &lt;!DOCTYPE html PUBLIC “-//W3C//DTD HTML 4.01//EN” “
>> http://www.w3.org/TR/html4/strict.dtd”&gt;
>>
>> Later on in my process, the content gets converted to RTF using textutil,
>> which removes doctype declarations but retains the line above, converting
>> the entity names back into angle brackets—which is how I got the idea that
>> Pandoc had put it there.
>>
>> I am not sure why my Pandoc command converts the angle brackets in that
>> first line—it leaves the other angle brackets in the document alone—but I
>> can just remove that line from the clipboard text before processing it with
>> Pandoc, so no problem.
>>
>> On Tuesday, December 28, 2021 at 10:48:46 AM UTC-5 tkur...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
>> wrote:
>>
>>> When standalone is not specified, pandoc typically outputs fragments
>>> rather than a complete document. This is convenient for the case where you
>>> are processing multiple fragments into one document. (This happens in HTML
>>> output but also in other output; groff -ms, ConTeXt, LaTeX.) So normal
>>> HTML output I see when I don't specify standalone does *not* include
>>> the doctype.
>>>
>>> $ echo '* Bogus' | pandoc -r rst -w html
>>>
>>>
>>> This is with pandoc 2.16.2, installed with homebrew.
>>>
>>>
>>> On Tue, Dec 28, 2021 at 9:33 AM Joseph Reagle
>>> wrote:
>>>
>>>> The doctype declaration is a standard HTML feature and declares the
>>>> version of the HTML. Pandoc, especially in `--standalone` mode includes
>>>> these at the start of an HTML document.
>>>>
>>>> I'm confused, however. You haven't specified standalone mode. (And why
>>>> would you want them removed in any case?) And the behavior you are
>>>> describing doesn't correspond to recent versions -- I'm using 2.16.2. I'm
>>>> not sure when/if pandoc last used HTML4.01 strict.
>>>>
>>>> In any case, you could create your own HTML template, without a doctype
>>>> declaration.
>>>>
>>>> https://pandoc.org/MANUAL.html#templates
>>>>
>>>> On 21-12-27 15:04, phi...-97jfqw80gc6171pxa8y+qA@public.gmane.org wrote:
>>>> > I am using Pandoc to convert dumb quotes to smart quotes in HTML. The
>>>> > HTML is on my MacOS clipboard:
>>>> >
>>>> > pbpaste | pandoc -t html+smart | pbcopy
>>>> >
>>>> > The output begins with
>>>> >
>>>> > <!DOCTYPE html PUBLIC “-//W3C//DTD HTML 4.01//EN” “
>>>> > http://www.w3.org/TR/html4/strict.dtd”>
>>>> >
>>>> > and a blank line.
>>>> >
>>>> > Is it possible to turn this off?
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "pandoc-discuss" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to pandoc-discus...-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/pandoc-discuss/e8eac3cc-feb6-e3af-dc9d-d3fe0b964925%40reagle.org
>>>> .
>>>>
>>>
>>> --
>>> T. Kurt Bond, tkur...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org, https://tkurtbond.github.io
>>>
>
> --
> T. Kurt Bond, tkur...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org, https://tkurtbond.github.io