From: Gwern Branwen
Newsgroups: gmane.text.pandoc
Subject: Re: Removing parts of the document with [walk] in Haskell
Date: Sun, 5 Sep 2021 11:42:23 -0400
To: pandoc-discuss

On Sun, Sep 5, 2021 at 11:09 AM Ilia Zaihcuk wrote:
> The trick with using Span is neat, although it does seem more like a
> workaround than a true solution - it is making additional changes to
> the AST after all. Can I be sure that they won't influence my output
> in unexpected ways? (e.g. some obscure issue like auto-hyphenation in
> LaTeX, page breaks, etc.)

I only use HTML output, so I have no idea what other targets like
LaTeX do, or if they even output spans at all.

Using the span trick necessarily introduces spans around the rewritten
text, but I think this is generally desirable. (The whole point of the
smallcaps pass is to add marked-up spans for the CSS to apply a
smallcaps variant to, and the spans allow me to suppress redundant
definitions of terms when doing link rewrites.) The spans might change
the necessary CSS for something or other, but I haven't run into any
edge-cases there.

If the span wrappers worry you, you can do an additional pass to
change them, but you are going to run into the basic 'expression
problem' https://en.wikipedia.org/wiki/Expression_problem regardless:
a strong static FP language like Haskell makes it extremely easy and
downright trivial to write a function like 'Inline -> Inline' to plug
into 'Pandoc -> Pandoc', but the cost of this is that the datatype
'Inline' or 'Pandoc' cannot be changed easily. (But at least doing a
cleanup pass scales well in terms of effort: define all the rewrites
you need with the Span trick easily, and then write a single painful
cleanup pass.)

> This is scary! Correctness is certainly more important than speed. Do
> you have a concrete example where this issue occurs? If @jgm (the
> original author of pandoc!) is using this method and there are cases
> in which it behaves unexpectedly, this sounds worthy of an issue
> and/or a pull request on GitHub.

I think the recursion patterns tend to work worst when you are adding
in new nodes so that it adds a new node, then descends into it,
applies the same rewrite adding new nodes, descends into them... Stuff
like that. Not necessarily wrong, but subtle and requiring a lot of
understanding to know which one to use.
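
As a rough illustration of the "Span trick plus one cleanup pass" idea
described above, here is a minimal sketch. It assumes pandoc-types 1.20
or later (where 'Str' holds 'Text'). The marker class
"rewrite-wrapper", the 'expandAcronym' rewrite and the sample document
are hypothetical names invented for this example and are not taken from
any real filter. As far as I understand, 'walk' traverses bottom-up, so
the 'Str "NASA"' placed inside the freshly created Span should not be
visited a second time; a top-down traversal applying the same function
could keep descending into the nodes it has just created.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Text.Pandoc.Definition
import Text.Pandoc.Walk (walk)

-- Hypothetical marker attribute: tags only the wrappers that our own
-- rewrites create, so the cleanup pass leaves other Spans untouched.
marker :: Attr
marker = ("", ["rewrite-wrapper"], [])

-- An 'Inline -> Inline' rewrite can only return a single node, so the
-- several replacement inlines are bundled into one marked Span.
expandAcronym :: Inline -> Inline
expandAcronym (Str "NASA") = Span marker [Str "NASA", Space, Str "(agency)"]
expandAcronym x            = x

-- Cleanup pass: it works on lists of inlines so that the contents of a
-- marked wrapper can be spliced back into the parent list.
unwrap :: [Inline] -> [Inline]
unwrap = concatMap go
  where
    go (Span attr xs) | attr == marker = xs
    go x                               = [x]

-- Made-up sample document.
doc :: Pandoc
doc = Pandoc nullMeta [Para [Str "NASA", Space, Str "launched."]]

main :: IO ()
main = do
  let wrapped = walk expandAcronym doc  -- pass 1: rewrite with wrappers
      cleaned = walk unwrap wrapped     -- pass 2: strip the wrappers
  print wrapped
  print cleaned
```

With pandoc-types installed this should run under 'runghc'. Keeping the
wrappers marked with a dedicated class is a design choice of this
sketch: it lets the single cleanup pass tell them apart from Spans that
are meant to stay, such as the smallcaps ones.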

-- 
gwern
https://www.gwern.net