From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: How to convert from OPML to Markdown without escaping the Markdown
Date: Fri, 12 Apr 2019 08:06:33 -0700
To: Patrick Kenny, pandoc-discuss
References: <7dd34cfd-e19f-47b2-a20c-0e62f3901ba4@googlegroups.com> <1cc15235-38f9-49f9-a89b-bd37bf331ccb@googlegroups.com>

Interesting about the spec allowing HTML formatting. We could
implement that. I've created an issue:
https://github.com/jgm/pandoc/issues/5444

But that's not the only issue here. The other issue is that
pandoc maps all outline elements to *section headings*.
I get the impression that you're expecting something else,
since you're inserting content that doesn't make sense in
a heading (like lists).

Content that goes under headings should go in the _note
attribute. Currently this accepts markdown formatting.
We might want to change that to HTML, as noted in the issue.

> So HTML tags should be allowed. If HTML tags are allowed, then Markdown
> should be allowed as well, since it's basically a shorthand for HTML.

No, it's not (look at what pandoc is used for), and no, it shouldn't be allowed.
The meaning of a document is different if it's interpreted as HTML or as
Markdown. We should accord with the spec and interpret the text
attribute as HTML. We should consider doing the same for the _note
attribute, but I'd want to hear from current OPML users before making
a change.

Patrick Kenny writes:

> Ok, thanks, that's helpful.
>
> The problem may stem from the OPML spec being vague and OPML having two
> rather different use cases: cataloging RSS feeds and storing outlines of
> books, task lists, and things in outlining software.
>
> That said, the OPML spec says this about the <outline> element:
>
> *Text attribute*
>> Every outline element must have at least a *text* attribute, which is
>> what is displayed when an outliner opens the OPML file. To omit the
>> text attribute would render the outline useless in an outliner. This is
>> what the user would see -- clearly an unacceptable user experience.
>> Part of the purpose of producing OPML is to give users the power to
>> accumulate and organize related information in an outliner. This is as
>> important a use for OPML as data interchange.
>> A missing text attribute in any outline element is an error.
>> Text attributes may contain encoded HTML markup.
>
> So HTML tags should be allowed. If HTML tags are allowed, then Markdown
> should be allowed as well, since it's basically a shorthand for HTML.
>
> Upon further investigation, it's not just the * lists that get escaped.
>
> Ordered lists 1. 2. 3. etc. are also escaped as 1\. 2\., and headers are
> escaped as \#\#\#\#.
>
> So it seems to me that OPML is not handled correctly during conversion.
>
> I can write a shell script to get around this, but maybe there's a better
> way?
>
> On Friday, April 12, 2019 at 8:58:36 AM UTC+9, John MacFarlane wrote:
>>
>> The way the OPML reader and writer work is:
>>
>> - <outline> elements correspond to section headings
>>   The text attribute is the heading text
>> - The contents of the _note attribute, if present, are
>>   parsed as Markdown and treated as text under the
>>   heading.
>>
>> I have never used OPML myself, so I don't have a good
>> sense why it's this way; maybe someone else does.
>>
>> Patrick Kenny writes:
>>
>> > I'm relatively new to Pandoc and hoping someone can point me in the
>> > right direction.
>> >
>> > I created an outline in some software that exports to OPML. This
>> > outline contains HTML tags and Markdown (for example, lists made of *).
>> >
>> > So, when I export the outline to OPML, it looks like this:
>> >
>> >
>> > I converted from OPML to Markdown like this:
>> >
>> > pandoc -o -s myfile.md myfile.opml --from=opml --to=commonmark
>> >
>> > However, this results in the markdown being escaped:
>> >
>> > List title
>> >
>> > \* Item 1
>> >
>> > \* Item 2
>> >
>> > \* Item 3
>> >
>> > HTML tags are escaped similarly.
>> >
>> > How can I turn this escaping off? I want the markdown in the OPML file
>> > to be preserved (treated as markdown) upon conversion to markdown.
>> >
>> > --
>> > You received this message because you are subscribed to the Google
>> > Groups "pandoc-discuss" group.
>> > To unsubscribe from this group and stop receiving emails from it, send
>> > an email to pandoc-...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
>> > To post to this group, send email to
>> > pandoc-...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
>> > To view this discussion on the web visit
>> > https://groups.google.com/d/msgid/pandoc-discuss/7dd34cfd-e19f-47b2-a20c-0e62f3901ba4%40googlegroups.com.
>> > For more options, visit https://groups.google.com/d/optout.
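
The workaround mentioned in the thread (a script that bypasses the OPML
reader) could look roughly like the sketch below. It is not part of pandoc
and does not reflect how pandoc's OPML reader works: it assumes every text
and _note attribute already holds Markdown, and it emits them verbatim as
separate blocks instead of mapping outlines to headings. The names
opml_to_markdown and SAMPLE are hypothetical, and the sample outline is
invented for illustration because the original OPML was not preserved.

```python
import xml.etree.ElementTree as ET

# Invented sample outline (assumption): newlines inside an attribute are
# written as &#10; so the XML parser keeps them as real line breaks.
SAMPLE = """<opml version="2.0">
  <head><title>Example</title></head>
  <body>
    <outline text="List title">
      <outline text="* Item 1&#10;* Item 2&#10;* Item 3"/>
    </outline>
  </body>
</opml>"""


def opml_to_markdown(opml_text: str) -> str:
    """Return the text and _note attributes of every outline, unescaped.

    Each attribute value becomes its own block, in document order,
    separated by blank lines. Nothing is escaped or reinterpreted.
    """
    root = ET.fromstring(opml_text)
    body = root.find("body")
    blocks = []

    def walk(node):
        # Visit direct <outline> children, then recurse into each one.
        for outline in node.findall("outline"):
            for attr in ("text", "_note"):
                value = outline.get(attr, "")
                if value:
                    blocks.append(value)
            walk(outline)

    if body is not None:
        walk(body)
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(opml_to_markdown(SAMPLE), end="")
```

The result is ordinary Markdown text, so it can be passed on to pandoc with
--from=markdown or --from=commonmark if another output format is needed.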