From: Adam Wood
Newsgroups: gmane.text.pandoc
Subject: Re: output html entities?
Date: Fri, 9 Jan 2015 11:53:46 -0800 (PST)
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org

Well, this is 3 years later, but I happened to be looking into something and ran across it.

So... I can't tell if Mark was being facetious or not in his disavowal of any desire to have html entities for curly quotes, em dashes, etc.

I still strongly prefer it, especially for certain cases where I don't have control over the display environment. (I write for other people, and sometimes other people have bad html character set declarations --- also commenting systems, feeds, etc.)

My solution has been to execute pandoc from within a bash script I wrote that goes back afterwards and uses sed to replace characters with their appropriate entity.
(I also use it to direct the output to an appropriate directory, with a filename based on the original, and with all the other options I want --- rather than trying to remember and have to type a million flags and options and two file names into the command line.)

# $1 is the document's base name, without the extension
kfile="$1.kramdown"
hfile="../html/$1.html"

pandoc -f markdown-auto_identifiers -S -o "$hfile" "$kfile"
# replace curly quotes and dashes with named entities, editing the file in place
sed -i '' -e "s/’/\&rsquo;/g" -e "s/‘/\&lsquo;/g" -e 's/“/\&ldquo;/g' -e 's/”/\&rdquo;/g' -e 's/—/\&mdash;/g' -e 's/–/\&ndash;/g' "$hfile"
open -a "Sublime Text" "$hfile"
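
For anyone who wants the same replacement without editing the file in place, here is a minimal sketch of the sed step as a stdin-to-stdout filter. The function name `to_entities` is made up for illustration and is not part of the script above; the `-i ''` spelling used there is the BSD/macOS form, so a filter may be easier to reuse on other systems. It assumes the input is UTF-8 and only handles the same six characters.

```shell
# Hypothetical helper (not in the original script): reads HTML on stdin and
# writes it to stdout with six typographic characters replaced by named
# entities. In a sed replacement, \& is a literal ampersand.
to_entities() {
  sed \
    -e 's/’/\&rsquo;/g' \
    -e 's/‘/\&lsquo;/g' \
    -e 's/“/\&ldquo;/g' \
    -e 's/”/\&rdquo;/g' \
    -e 's/—/\&mdash;/g' \
    -e 's/–/\&ndash;/g'
}

# Example: convert a short fragment.
printf '%s\n' '<p>“It’s here”—finally.</p>' | to_entities
# prints: <p>&ldquo;It&rsquo;s here&rdquo;&mdash;finally.</p>
```

Since pandoc writes to standard output when no `-o` is given, the filter could sit in a pipeline, for example `pandoc -f markdown-auto_identifiers -S "$kfile" | to_entities > "$hfile"`.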

On Sunday, January 16, 2011 at 11:18:44 AM UTC-8, Mark (my words) wrote:
> Well, I’m embarrassed.
>
> I have been out of the loop for a *long* time. Back in the day I saw the debate go from not using special characters, to using named entities, to numerical entities, and back and forth. And now—
>
> Are we actually to a point where we can use real raw characters?!
> It strikes me as a fantastic magic.
>
> Bruce is right, we should be living in the present and planning for the future. My bizarre need for human readable machine language is antiquated and needless and moot—you can’t get much closer to human-readable than straight-out unicode.
>
> Now I’m feeling rather silly for hacking up the Multimarkdown source code to spit out named entities now, but it was a lot of fun.
>
> And yeah, I’d installed the latest Tidy a couple months back but haven’t had the time to screw with it until now, again more magic, that project has come a long way too!
>
> Thanks guys for your patient advice.
>
> -Mark
