From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: Attribute-less Markdown from web page Html
Date: Fri, 28 Apr 2023 22:04:55 -0700
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
In-Reply-To: <0B417699-0C93-4DBC-9B09-8C36B6F39B0F-WPTjrydoUPgeaOpM6FAJmQkbCANdLtlA@public.gmane.org>

I think that's just an
oversight. I just pushed a change to the commonmark writer so it will use the shortcut forms.

> On Apr 28, 2023, at 3:14 PM, Oliver wrote:
>
> Cool. But CommonMark uses these strange empty link reference labels (when the link title itself is the link label):
>
> [Pandoc Manual][]
>
> [Pandoc Manual]: https://pandoc.org/MANUAL.html
>
> Is there a way to switch this off? I.e., just:
>
> [Pandoc Manual]
>
>
> On 29 Apr 2023, at 2:49, John MacFarlane wrote:
>
>> You would get different indentation with `-t commonmark`. markdown_strict follows the '4-space rule'.
>>
>>
>>> On Apr 26, 2023, at 4:59 PM, Oliver wrote:
>>>
>>> Thanks John
>>>
>>> `-t markdown_strict-raw_html` does the trick for me!
>>>
>>> One thing about markdown_strict is odd, though: the text of list items is indented to the next 4-space column:
>>>
>>> `-   list text`
>>>
>>> Can I somehow tell the markdown_strict writer to use only _one_ space here:
>>>
>>> `- list text`
>>>
>>> Anyway, a thousand thanks for Pandoc!
>>>
>>>
>>> On 26 Apr 2023, at 15:39, John MacFarlane wrote:
>>>
>>>> Turning off -link_attributes should do it, but looks like you tried that.
>>>>
>>>> I'd have to look at an example of the input that produces this with these settings.
>>>>
>>>> If you don't need fancy features, you could also try `-t commonmark` or `-t markdown_strict`.
>>>>
>>>>> On Apr 25, 2023, at 5:32 PM, Oliver wrote:
>>>>>
>>>>> Hi all
>>>>>
>>>>> I am trying to use Pandoc to convert web pages to markdown without all the class clutter like `{.underline}`, etc.
>>>>>
>>>>> So I tried
>>>>>
>>>>> ```
>>>>> pandoc -f html -t markdown-raw_html-native_divs-native_spans-fenced_divs-header_attributes-auto_identifiers-inline_code_attributes-link_attributes-raw_attribute-simple_tables-multiline_tables-grid_tables page.html
>>>>> ```
>>>>>
>>>>> and it works reasonably well, but I still get a bit of class clutter like
>>>>>
>>>>> ```
>>>>> {.v-visible-sr .js-screen-reader-info}
>>>>> ```
>>>>>
>>>>> or attributes like
>>>>>
>>>>> ```
>>>>> {title="sometext“}
>>>>> ```
>>>>>
>>>>> both after links.
>>>>>
>>>>> How can I suppress these?
>>>>>
>>>>> I really want only the text and (image) links.
>>>>>
>>>>> Any help much appreciated!
>>>>>
>>>>>
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/8AD0B607-B556-48CC-83AA-7D0BACD3B8BE%40halloleo.hailmail.net.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/A3814551-04F1-4FD2-B6CF-7B89E39D840A%40gmail.com.
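
For readers comparing the two link forms discussed in this thread, here is a minimal sketch of how they differ on the page. It is not pandoc's writer code. The helper names `reference_link` and `link_definition` are hypothetical and chosen only for illustration. It assumes the link text and the reference label are identical, which is the case raised in the thread.

```python
def reference_link(text: str, style: str = "shortcut") -> str:
    """Render the in-text part of a reference link whose label equals its text.

    Hypothetical helper for illustration only; not part of pandoc.
    """
    if style == "collapsed":
        # Collapsed form: the label is followed by an empty pair of brackets.
        return f"[{text}][]"
    if style == "shortcut":
        # Shortcut form: the label alone, with no second pair of brackets.
        return f"[{text}]"
    raise ValueError(f"unknown style: {style!r}")


def link_definition(label: str, url: str) -> str:
    """Render the link reference definition that both forms resolve against."""
    return f"[{label}]: {url}"


if __name__ == "__main__":
    for style in ("collapsed", "shortcut"):
        print(reference_link("Pandoc Manual", style))
    print(link_definition("Pandoc Manual", "https://pandoc.org/MANUAL.html"))
```

Both forms rely on the same `[Pandoc Manual]: ...` definition appearing elsewhere in the document; only the in-text part changes.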