From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.text.pandoc/29854
From: jeremy theler
Newsgroups: gmane.text.pandoc
Subject: Re: Turn off headers for Mac OS clipboard content output in HTML?
Date: Tue, 28 Dec 2021 13:24:29 -0300
References: <9ac6c67a-8aba-4a19-bde0-65e37340c5d6n@googlegroups.com> <60674d49-1a0d-485d-ac2f-ae6a8283dde9n@googlegroups.com>
In-Reply-To: <60674d49-1a0d-485d-ac2f-ae6a8283dde9n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Xref: news.gmane.io gmane.text.pandoc:29854
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="0000000000001daf8305d4374271"

--0000000000001daf8305d4374271
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

Are you telling pandoc that the source is HTML and not Markdown?

On Tue, Dec 28, 2021, 13:19 philmac-97jfqw80gc6171pxa8y+qA@public.gmane.org wrote:

> Thank you for your assistance! Indeed, I misread the situation, though the
> outcome is still strange. The HTML I am starting with in my clipboard is a
> complete document with a doctype declaration. The first line is:
>
> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
> "http://www.w3.org/TR/html4/strict.dtd">
>
> Pandoc (pandoc -t html+smart) converts the angle brackets into HTML
> entity names:
>
> &lt;!DOCTYPE html PUBLIC “-//W3C//DTD HTML 4.01//EN” “
> http://www.w3.org/TR/html4/strict.dtd”&gt;
>
> Later on in my process, the content gets converted to RTF using textutil,
> which removes doctype declarations but retains the line above, converting
> the entity names back into angle brackets, which is how I got the idea that
> Pandoc had put it there.
>
> I am not sure why my Pandoc command converts the angle brackets in that
> first line (it leaves the other angle brackets in the document alone), but I
> can just remove that line from the clipboard text before processing it with
> Pandoc, so no problem.
>
> On Tuesday, December 28, 2021 at 10:48:46 AM UTC-5 tkur...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
> wrote:
>
>> When standalone is not specified, pandoc typically outputs fragments
>> rather than a complete document. This is convenient for the case where you
>> are processing multiple fragments into one document. (This happens in HTML
>> output but also in other output formats: groff -ms, ConTeXt, LaTeX.) So the
>> normal HTML output I see when I don't specify standalone does *not* include
>> the doctype.
>>
>> $ echo '* Bogus' | pandoc -r rst -w html
>> <ul>
>> <li>Bogus</li>
>> </ul>
>>
>> This is with pandoc 2.16.2, installed with homebrew.
>>
>>
>> On Tue, Dec 28, 2021 at 9:33 AM Joseph Reagle <josep...-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>
>> wrote:
>>
>>> The doctype declaration is a standard HTML feature and declares the
>>> version of the HTML. Pandoc, especially in `--standalone` mode, includes
>>> these at the start of an HTML document.
>>>
>>> I'm confused, however. You haven't specified standalone mode. (And why
>>> would you want them removed in any case?) And the behavior you are
>>> describing doesn't correspond to recent versions -- I'm using 2.16.2. I'm
>>> not sure when/if pandoc last used HTML4.01 strict.
>>>
>>> In any case, you could create your own HTML template, without a doctype
>>> declaration.
>>>
>>> https://pandoc.org/MANUAL.html#templates
>>>
>>> On 21-12-27 15:04, phi...-97jfqw80gc6171pxa8y+qA@public.gmane.org wrote:
>>> > I am using Pandoc to convert dumb quotes to smart quotes in HTML. The
>>> > HTML is on my MacOS clipboard:
>>> >
>>> > pbpaste | pandoc -t html+smart | pbcopy
>>> >
>>> > The output begins with
>>> >
>>> > <!DOCTYPE html PUBLIC “-//W3C//DTD HTML 4.01//EN” “http://www.w3.org/TR/html4/strict.dtd”>
>>> >
>>> > and a blank line.
>>> >
>>> > Is it possible to turn this off?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "pandoc-discuss" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to pandoc-discus...-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/pandoc-discuss/e8eac3cc-feb6-e3af-dc9d-d3fe0b964925%40reagle.org
>>> .
>>>
>>
>>
>> --
>> T. Kurt Bond, tkur...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org, https://tkurtbond.github.io
>>
> --
> You received this message because you are subscribed to the Google Groups
> "pandoc-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/pandoc-discuss/60674d49-1a0d-485d-ac2f-ae6a8283dde9n%40googlegroups.com
> .
>

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org.
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CAK0LiymrsEZNYPmEoJOrBfzXaensH1_tGTC3iv9Km878KGpsuA%40mail.gmail.com.

--0000000000001daf8305d4374271
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--0000000000001daf8305d4374271--
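
The workaround mentioned in the thread (removing the doctype line from the clipboard text before pandoc sees it) could be sketched roughly as follows. The function name strip_doctype is hypothetical and does not come from the thread, and the sketch assumes the declaration sits on a single first line of the input. The pbpaste, pbcopy, and pandoc -t html+smart pieces shown in the comment are taken from the original post and are not exercised here.

```shell
# Hypothetical helper (not from the thread): drop the first input line
# if it is a doctype declaration, and pass every other line through.
# The match is case-insensitive and tolerates leading spaces or tabs.
strip_doctype() {
  awk 'NR == 1 && tolower($0) ~ /^[ \t]*<!doctype/ { next } { print }'
}

# Possible use on macOS, assuming the pipeline from the original post:
#   pbpaste | strip_doctype | pandoc -t html+smart | pbcopy
```

The other route raised in the reply is to declare the input format, for example with pandoc's -f html option, so that the input is read as HTML rather than as Markdown. Whether smart quotes are still applied in that case may depend on the pandoc version and is not checked here.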