From: Albert Krewinkel
Newsgroups: gmane.text.pandoc
Subject: Re: Filtering entire files based on markdown list item
Date: Fri, 11 Sep 2020 18:19:29 +0200
Message-ID: <87h7s4l20e.fsf@zeitkraut.de>
References: <2dcd5988-9b74-47a9-80bd-ce4a64fee169n@googlegroups.com>

I believe this is possible, but not straightforward. Pandoc doesn't
preserve information about which file an element originated in, so one
would have to iterate over all files manually. E.g., pass the file
names via a metadata value as a comma-separated list, then step
through the names, read each file's contents, and decide whether to
keep or discard the result.

Rough, untested sketch:

    function Pandoc (doc)
      -- categories that cause a file to be dropped
      local categories = find_categories(doc.blocks)
      -- metadata values are not always plain strings; convert first
      local files = pandoc.utils.stringify(doc.meta.input_files)
      for f in files:gmatch '[^,]+' do
        local fh = assert(io.open(f, 'r'))
        local contents = fh:read '*a'
        fh:close()
        local file_doc = pandoc.read(contents, 'markdown')
        if not contains_category(file_doc, categories) then
          doc = append_doc(doc, file_doc)
        end
      end
      return doc
    end

The functions `append_doc`, `find_categories`, and `contains_category`
would have to be written.

Cheers,
Albert

Henrik Klang writes:

> Hi,
>
> I have a number of markdown files that I concatenate with Pandoc. Each
> markdown file has a second level top header (##). The markdown files are
> specified in the variable $document_list.
>
> pandoc --from=markdown --wrap=auto --to=markdown $documents_list > output.md
>
> In the next step I create a PDF:
>
> pandoc -f markdown+table_captions --pdf-engine=xelatex --listings -o output.pdf output.md
>
> What I want to do is to filter out certain markdown files in the first
> step. The filter shall be based on a list under a specific category in the
> markdown file:
>
> ### Group(s)
> Category1, Category3
>
> E.g. if *Category1* exists in the markdown file I want to filter the entire
> markdown file out.
>
> Do you think I could achieve this with Lua filters?
>
> Thank you.
>
> / Henrik

--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124
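
For illustration, here is one way the helper functions named above could look. This is only a sketch under assumptions, not a tested implementation: it assumes the category list is a plain paragraph directly below a heading whose text is exactly `Group(s)`, and that the main input document lists the categories to exclude in the same way. `parse_categories` and `GROUP_HEADING` are hypothetical names introduced here.

```lua
-- Assumed heading text that introduces the category list.
local GROUP_HEADING = 'Group(s)'

-- Split "Category1, Category3" into a set: {Category1 = true, ...}.
function parse_categories (text)
  local set = {}
  for name in text:gmatch '[^,]+' do
    local trimmed = name:match '^%s*(.-)%s*$'
    if trimmed ~= '' then set[trimmed] = true end
  end
  return set
end

-- Collect the categories from paragraphs that follow the group heading,
-- up to the next heading.
function find_categories (blocks)
  local found, in_group = {}, false
  for _, blk in ipairs(blocks) do
    if blk.t == 'Header' then
      in_group = pandoc.utils.stringify(blk) == GROUP_HEADING
    elseif in_group and (blk.t == 'Para' or blk.t == 'Plain') then
      for name in pairs(parse_categories(pandoc.utils.stringify(blk))) do
        found[name] = true
      end
    end
  end
  return found
end

-- True if the document lists at least one of the given categories.
function contains_category (file_doc, categories)
  for name in pairs(find_categories(file_doc.blocks)) do
    if categories[name] then return true end
  end
  return false
end

-- Append the blocks of `file_doc` to `doc` and return `doc`.
function append_doc (doc, file_doc)
  doc.blocks:extend(file_doc.blocks)
  return doc
end
```

Categories are kept as a set (a table keyed by name) so the membership check in `contains_category` is a single lookup. The file names would then be supplied through metadata, e.g. with `-M input_files=a.md,b.md` on the command line.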