From: BP Jonsson
Newsgroups: gmane.text.pandoc
Subject: Re: Markdown writer: emit HTML entities instead of unicode
Date: Thu, 1 Nov 2018 12:22:13 +0100

Just out of curiosity: since the Markdown and HTML readers presumably do a
named-entity-to-character lookup to resolve entities, would it be hard or
prohibitively expensive to have the writers do the reverse lookup under the
`--ascii` option, falling back to (preferably hex) numeric entities only if
no named entity is found? After all, probably everyone has an easier time
mentally mapping named entities to characters than numeric entities. I know
the [HTML 5 named entity list][] is huge, but AFAIK it is not official yet.

[HTML 5 named entity list]: https://metacpan.org/source/TOBYINK/HTML-HTML5-Entities-0.004/lib/HTML/HTML5/Entities.pm#L23

On Wed, 31 Oct 2018 at 21:37, John MacFarlane wrote:

> mb21 writes:
>
> > You can use the --ascii flag, which will emit: <p>&#8482;</p>
>
> And, just to be explicit: there's no way to keep
> `&trade;`; pandoc throws out information about which
> entity was used and just stores the character.
>
> If you really want `&trade;`, though, you could do:
>
>     `&trade;`{=markdown}
>
> and this will be passed through to markdown output verbatim.
>
> --
> You received this message because you are subscribed to the Google Groups
> "pandoc-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to pandoc-discuss+unsubscribe@googlegroups.com.
> To post to this group, send email to pandoc-discuss@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/pandoc-discuss/yh480k7ehx3la0.fsf%40johnmacfarlane.net.
> For more options, visit https://groups.google.com/d/optout.

--
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CAFC_yuTO0w%2BoW9bdszpuP1iq30gUP0Zm0_Y%3DqyAeDX8WFvDz5Q%40mail.gmail.com.
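
The reverse lookup asked about in the message above could look roughly like the sketch below. It is not pandoc's code (pandoc is written in Haskell, and nothing here reflects how its writers work); it only illustrates the idea in Python. The function name `ascii_escape` is made up for this example, and the table used is Python's standard `html.entities.codepoint2name`, which covers the HTML 4.01 names only, not the larger HTML 5 list mentioned above.

```python
from html.entities import codepoint2name  # HTML 4.01 table: code point -> entity name


def ascii_escape(text: str) -> str:
    """Replace every non-ASCII character with an entity reference.

    Hypothetical helper for illustration: a named entity is used when the
    table has one, otherwise a hexadecimal numeric reference is emitted.
    ASCII characters (including &, < and >) are passed through unchanged.
    """
    pieces = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII: keep as-is.
            pieces.append(ch)
        elif cp in codepoint2name:
            # Reverse lookup succeeded: emit the named entity.
            pieces.append(f"&{codepoint2name[cp]};")
        else:
            # No name known: fall back to a hex numeric entity.
            pieces.append(f"&#x{cp:X};")
    return "".join(pieces)


if __name__ == "__main__":
    print(ascii_escape("Pandoc\u2122 caf\u00e9 \u0151"))
```

The cost is one dictionary lookup per non-ASCII character, so the size of the entity table mainly affects memory rather than running time. With a table where several names map to the same character, as in the HTML 5 list, one would additionally have to decide which name to prefer.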