From mboxrd@z Thu Jan 1 00:00:00 1970
From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: Removing parts of the document with [walk] in Haskell
Date: Sat, 04 Sep 2021 22:23:03 -0700
References: <915213fc-e4c4-480b-a6a2-fd3420777ddan@googlegroups.com>
To: Ilia Zaihcuk, pandoc-discuss
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

I have often used the method

> allIsFine = walk $ concatMap theFine

But I don't think this will be any more efficient than your
first version with `walk (filter $ not . isThe)`.  Actually, I'd be
interested in seeing benchmarks with these approaches, if efficiency
matters enough to you to research this more.

Another possibility is to walk with a function like

killThe :: Inline -> Inline
killThe (Str "the") = Str ""  -- or: Span ("",[],[]) []
killThe x = x

This should be more efficient than the list versions, though again
it would be interesting to see benchmarks.

Ilia Zaihcuk writes:

> Hi all,
>
> Using pandoc-types' Walkable class, what is the
> best-performing/most idiomatic way to filter out certain
> elements from a document?
>
> Say I wanted to remove all occurrences of the word "the" from a Pandoc.
> The best implementation I've been able to write for this is
>
> isThe :: Inline -> Bool
> isThe (Str "the") = True
> isThe _ = False
>
> removeThe :: Pandoc -> Pandoc
> removeThe = walk (filter $ not . isThe)
>
> Is this right? Does the use of filter here not mean an additional O(n)
> traversal happens on every list of Inlines?
>
> I feel like a better solution for this would be using
>
> filterThe :: Inline -> [Inline]
> filterThe (Str "the") = []
> filterThe x = [x]
>
> or something similar, but of course
>
> removeThe = walk filterThe
>
> does not typecheck. We need a -> a, meaning [Inline] -> [Inline] here.
>
> The same question actually applies to any transformation which "changes the
> number of elements":
>
> theFine :: Inline -> [Inline]
> theFine (Str "the") = [Str "the", Space, Str "fine"]
> theFine i = [i]
>
> allIsFine :: Pandoc -> Pandoc
> allIsFine = walk $ concatMap theFine
>
> Best,
>
> Ilia
>
> P.S. Sorry if this has been asked before. I feel it must be a common issue,
> but all I've been able to find is this thread with its links, which are
> 7 years old now and seem to be calling for changes in pandoc. Everything
> else suggests stepping outside Haskell.
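
For anyone who wants to try the three variants from this thread side by
side, here is a minimal sketch. It assumes a pandoc-types version in which
Str carries Text (hence OverloadedStrings). The names removeTheFilter,
removeTheConcat, blankThe and sample are hypothetical, introduced only for
illustration; isThe, filterThe and killThe are as given in the thread. It
makes no claim about relative speed; that would still need benchmarks.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Text.Pandoc.Definition
import Text.Pandoc.Walk (walk)

isThe :: Inline -> Bool
isThe (Str "the") = True
isThe _ = False

-- Variant 1: walk over every [Inline] and filter it.
removeTheFilter :: Pandoc -> Pandoc
removeTheFilter = walk (filter (not . isThe))

filterThe :: Inline -> [Inline]
filterThe (Str "the") = []
filterThe x = [x]

-- Variant 2: walk over every [Inline] and concatMap an
-- Inline -> [Inline] function, so the walked type is still
-- [Inline] -> [Inline].
removeTheConcat :: Pandoc -> Pandoc
removeTheConcat = walk (concatMap filterThe)

killThe :: Inline -> Inline
killThe (Str "the") = Str ""
killThe x = x

-- Variant 3: walk over single Inlines; the element count is unchanged,
-- matching elements are replaced by an empty Str.
blankThe :: Pandoc -> Pandoc
blankThe = walk killThe

-- Hypothetical sample document with a nested inline list.
sample :: Pandoc
sample = Pandoc nullMeta
  [Para [Str "the", Space, Emph [Str "the", Space, Str "cat"]]]

main :: IO ()
main = do
  print (removeTheFilter sample)
  print (removeTheConcat sample)
  print (blankThe sample)
  -- The two list-based variants should produce the same document.
  print (removeTheFilter sample == removeTheConcat sample)
```

Note that none of these variants touches the Space elements next to a
removed "the", so a real filter would probably want to clean those up as
well.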