Are you telling pandoc that the source is HTML and not Markdown?
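
If the clipboard holds HTML, a sketch along these lines might be closer to what you want. It assumes the `smart` extension is enabled on the HTML reader (`-f html+smart`) rather than on the writer; without `-f`, pandoc reads standard input as Markdown. The function name `smarten_html` is only illustrative.

```shell
# Sketch: read HTML on stdin, write an HTML fragment with curly quotes on stdout.
# Assumption: the input really is HTML, so the reader is set explicitly with -f.
smarten_html() {
  pandoc -f html+smart -t html
}

# Example with a literal string instead of the clipboard:
# printf '<p>"hi"</p>' | smarten_html
```

On macOS the clipboard round trip would then be `pbpaste | smarten_html | pbcopy`.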
On Tue, Dec 28, 2021, 13:19 philmac-97jfqw80gc6171pxa8y+qA@public.gmane.org wrote:
> Thank you for your assistance! Indeed, I misread the situation, though the
> outcome is still strange. The HTML I am starting with in my clipboard is a
> complete document with a doctype declaration. The first line is:
>
> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
>
> Pandoc (pandoc -t html+smart) converts the angle brackets into HTML
> entity names:
>
> &lt;!DOCTYPE html PUBLIC “-//W3C//DTD HTML 4.01//EN” “
> http://www.w3.org/TR/html4/strict.dtd”&gt;
>
> Later on in my process, the content gets converted to RTF using textutil,
> which removes doctype declarations but retains the line above, converting
> the entity names back into angle brackets—which is how I got the idea that
> Pandoc had put it there.
>
> I am not sure why my Pandoc command converts the angle brackets in that
> first line—it leaves the other angle brackets in the document alone—but I
> can just remove that line from the clipboard text before processing it with
> Pandoc, so no problem.
> On Tuesday, December 28, 2021 at 10:48:46 AM UTC-5 tkur...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
> wrote:
>
>> When standalone is not specified, pandoc typically outputs fragments
>> rather than a complete document. This is convenient when you are
>> processing multiple fragments into one document. (This happens in HTML
>> output but also in other outputs: groff -ms, ConTeXt, LaTeX.) So the normal
>> HTML output I see when I don't specify standalone does *not* include the
>> doctype.
>>
>> $ echo '* Bogus' | pandoc -r rst -w html
>> <ul>
>> <li>Bogus</li>
>> </ul>
>>
>> This is with pandoc 2.16.2, installed with homebrew.
>>
>>
>> On Tue, Dec 28, 2021 at 9:33 AM Joseph Reagle
>> wrote:
>>
>>> The doctype declaration is a standard HTML feature and declares the
>>> HTML version. Pandoc, especially in `--standalone` mode, includes
>>> it at the start of an HTML document.
>>>
>>> I'm confused, however. You haven't specified standalone mode. (And why
>>> would you want them removed in any case?) And the behavior you are
>>> describing doesn't correspond to recent versions -- I'm using 2.16.2. I'm
>>> not sure when/if pandoc last used HTML4.01 strict.
>>>
>>> In any case, you could create your own HTML template, without a doctype
>>> declaration.
>>>
>>> https://pandoc.org/MANUAL.html#templates
>>>
>>> On 21-12-27 15:04, phi...-97jfqw80gc6171pxa8y+qA@public.gmane.org wrote:
>>> > I am using Pandoc to convert dumb quotes to smart quotes in HTML. The
>>> > HTML is on my macOS clipboard:
>>> >
>>> > pbpaste | pandoc -t html+smart | pbcopy
>>> >
>>> > The output begins with
>>> >
>>> > <!DOCTYPE html PUBLIC “-//W3C//DTD HTML 4.01//EN” “http://www.w3.org/TR/html4/strict.dtd”>
>>> >
>>> > and a blank line.
>>> >
>>> > Is it possible to turn this off?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "pandoc-discuss" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to pandoc-discus...-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/pandoc-discuss/e8eac3cc-feb6-e3af-dc9d-d3fe0b964925%40reagle.org
>>> .
>>>
>>
>>
>> --
>> T. Kurt Bond, tkur...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org, https://tkurtbond.github.io
>>