From mboxrd@z Thu Jan 1 00:00:00 1970
From: "philmac-97jfqw80gc6171pxa8y+qA@public.gmane.org"
Newsgroups: gmane.text.pandoc
Subject: Re: Turn off headers for Mac OS clipboard content output in HTML?
Date: Tue, 28 Dec 2021 08:19:29 -0800 (PST)
Message-ID: <60674d49-1a0d-485d-ac2f-ae6a8283dde9n@googlegroups.com>
References: <9ac6c67a-8aba-4a19-bde0-65e37340c5d6n@googlegroups.com>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To: pandoc-discuss
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_Part_10500_1681924017.1640708369120"
Precedence: list
Xref: news.gmane.io gmane.text.pandoc:29853

------=_Part_10500_1681924017.1640708369120
Content-Type: multipart/alternative; boundary="----=_Part_10501_1038843698.1640708369120"

------=_Part_10501_1038843698.1640708369120
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

Thank you for your assistance! Indeed, I misread the situation, though the
outcome is still strange. The HTML I am starting with in my clipboard is a
complete document with a doctype declaration. The first line is:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

Pandoc (pandoc -t html+smart) converts the angle brackets into HTML entity
names:

&lt;!DOCTYPE html PUBLIC “-//W3C//DTD HTML 4.01//EN” “http://www.w3.org/TR/html4/strict.dtd”&gt;

Later on in my process, the content gets converted to RTF using textutil,
which removes doctype declarations but retains the line above, converting
the entity names back into angle brackets, which is how I got the idea that
Pandoc had put it there.

I am not sure why my Pandoc command converts the angle brackets in that
first line (it leaves the other angle brackets in the document alone), but I
can just remove that line from the clipboard text before processing it with
Pandoc, so no problem.

On Tuesday, December 28, 2021 at 10:48:46 AM UTC-5 tkur...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:

> When standalone is not specified, pandoc typically outputs fragments
> rather than a complete document. This is convenient for the case where you
> are processing multiple fragments into one document. (This happens in HTML
> output but also in other output; groff -ms, ConTeXt, LaTeX.) So normal
> HTML output I see when I don't specify standalone does *not* include the
> doctype.
>
> $ echo '* Bogus' | pandoc -r rst -w html
> <ul>
> <li>Bogus</li>
> </ul>
>
> This is with pandoc 2.16.2, installed with homebrew.
>
>
> On Tue, Dec 28, 2021 at 9:33 AM Joseph Reagle <josep...-T1oY19WcHSwdnm+yROfE0A@public.gmane.org> wrote:
>
>> The doctype declaration is a standard HTML feature and declares the
>> version of the HTML. Pandoc, especially in `--standalone` mode includes
>> these at the start of an HTML document.
>>
>> I'm confused, however. You haven't specified standalone mode. (And why
>> would you want them removed in any case?) And the behavior you are
>> describing doesn't correspond to recent versions -- I'm using 2.16.2. I'm
>> not sure when/if pandoc last used HTML4.01 strict.
>>
>> In any case, you could create your own HTML template, without a doctype
>> declaration.
>>
>> https://pandoc.org/MANUAL.html#templates
>>
>> On 21-12-27 15:04, phi...-97jfqw80gc6171pxa8y+qA@public.gmane.org wrote:
>> > I am using Pandoc to convert dumb quotes to smart quotes in HTML. The HTML is on my MacOS clipboard:
>> >
>> > pbpaste | pandoc -t html+smart | pbcopy
>> >
>> > The output begins with
>> >
>> > <!DOCTYPE html PUBLIC “-//W3C//DTD HTML 4.01//EN” “http://www.w3.org/TR/html4/strict.dtd”>
>> >
>> > and a blank line.
>> >
>> > Is it possible to turn this off?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "pandoc-discuss" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to pandoc-discus...-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/pandoc-discuss/e8eac3cc-feb6-e3af-dc9d-d3fe0b964925%40reagle.org.
>>
>
>
> --
> T. Kurt Bond, tkur...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org, https://tkurtbond.github.io
>

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/60674d49-1a0d-485d-ac2f-ae6a8283dde9n%40googlegroups.com.

------=_Part_10501_1038843698.1640708369120--

------=_Part_10500_1681924017.1640708369120--