From: BPJ
Newsgroups: gmane.text.pandoc
Subject: Re: Lua-Filter, Span-text to metadata-text: How to get rid of linebreaks in metadata
Date: Sat, 12 Oct 2019 09:27:35 +0200
References: <7538f86a-e1ac-d7b5-cee7-f67eb34f5127@tu-dortmund.de>

It's because metadata values are parsed by the ordinary Markdown parser,
which keeps existing line breaks in the input as SoftBreak elements so
that they can be reproduced in the output. That is useful for ordinary
text: many people like to have one sentence per physical line, for
example, and will want that preserved, at least when converting Markdown
to Markdown for cleanup purposes.
Apparently the stringify function also restores them, which is not so
useful in your case. You can also try --wrap=auto, which may remove most
breaks in the metadata values but still insert line breaks in appropriate
places in the document body.

On Thu, 10 Oct 2019 at 12:00, Jonas Zohren wrote:

> Yes, this helps to solve the problem, thanks for that.
>
> But why do those metavalues get wrapped in the JSON-output in the first
> place? Is there a specific rationale behind it?
>
> On 10.10.19 06:20, John MacFarlane wrote:
> >
> > If you don't want any line wrapping behavior, just use
> > --wrap=none on the command line.
> >
> > Alternatively, --wrap=preserve will preserve newlines
> > in your source file.
> >
> > Does that help or have I misunderstood the problem?
> >
> > Jonas Zohren writes:
> >
> >> Dear list!
> >>
> >> Setup:
> >> PandocMarkdown transcript of meeting with specially tagged spans. E.g.
> >>
> >> ```md
> >> [Let's declare war on those other guys over there which we don't want to
> >> live.]{.resolution}
> >>
> >> [Let's buy a tank.]{.resolution}
> >> ```
> >>
> >> I want to extract those resolutions out of the document and store them
> >> as metadata for further processing. I managed to do so with a lua filter
> >> using `pandoc.utils.stringify(span)`. As a result the strings get stored
> >> in metadata:
> >> ```yaml
> >> date: 0000-00-00
> >> resolutions:
> >> - text: |
> >>     Let's declare war on those other guys over there which
> >>     we don't want to live on.
> >>
> >> ```
> >>
> >> And here is my problem: It gets split up into multiple lines, even
> >> though the original text did not have line breaks.
> >> In markdown this wouldn't be a huge problem, as it ignores this, but
> >> when I now output the metadata as json with the template
> >>
> >> ```md
> >> $meta-json$
> >> ```
> >>
> >> and `pandoc -t markdown -s` the resulting json contains those line
> >> breaks, which originally weren't there:
> >>
> >> ```json
> >> {"text": "Let's declare war on those other guys over there which\nwe don't want to live on."}
> >> ```
> >>
> >> AFAIK PandocMarkdown treats the yaml metadata strings as regular
> >> markdown strings and might auto line break them as a result, but why
> >> does this leak into the JSON output? The raw JSON AST (`pandoc -t json`)
> >> does not contain those line breaks.
> >>
> >> How can I avoid that and export my metadata as JSON with _clean_ strings?
> >>
> >>
> >> Kind Regards
> >>
> >> Jonas
>
> --
> You received this message because you are subscribed to the Google Groups
> "pandoc-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to pandoc-discuss+unsubscribe@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/pandoc-discuss/7538f86a-e1ac-d7b5-cee7-f67eb34f5127%40tu-dortmund.de
> .
>

--
To view this discussion on the web visit
https://groups.google.com/d/msgid/pandoc-discuss/CADAJKhDPfHKWCRrPEj6BPi-yRGCCn%3DvRM7txXZVpctHPd-KzTg%40mail.gmail.com.
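
The SoftBreak point made in the reply can be illustrated outside pandoc.
What follows is only a sketch, not pandoc's own code and not the poster's
filter. It assumes the element shapes used in pandoc's JSON AST
(`{"t": "Str", "c": ...}`, `{"t": "Space"}`, `{"t": "SoftBreak"}`); the
helpers `soften` and `plain_text` are hypothetical names made up for this
illustration. `plain_text` deliberately keeps soft breaks as newlines so
that the difference is visible; it is not a reimplementation of
`pandoc.utils.stringify`.

```python
import json


def soften(node):
    """Return a copy of a pandoc JSON AST fragment with SoftBreak -> Space.

    Hypothetical helper, not part of pandoc or any pandoc library.
    """
    if isinstance(node, list):
        return [soften(child) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "SoftBreak":
            return {"t": "Space"}
        return {key: soften(value) for key, value in node.items()}
    return node  # strings, numbers and booleans pass through unchanged


def plain_text(inlines):
    """Flatten Str/Space/SoftBreak inlines, keeping soft breaks as newlines.

    Hypothetical stand-in for a writer that preserves soft breaks; every
    other inline type is ignored.
    """
    pieces = {"Space": " ", "SoftBreak": "\n"}
    return "".join(
        inline["c"] if inline["t"] == "Str" else pieces.get(inline["t"], "")
        for inline in inlines
    )


# Hand-written inlines for a span whose source had a line break after "buy".
span_inlines = [
    {"t": "Str", "c": "Let's"}, {"t": "Space"}, {"t": "Str", "c": "buy"},
    {"t": "SoftBreak"},
    {"t": "Str", "c": "a"}, {"t": "Space"}, {"t": "Str", "c": "tank."},
]

print(json.dumps({"text": plain_text(span_inlines)}))
print(json.dumps({"text": plain_text(soften(span_inlines))}))
```

The first line printed contains `\n` inside the string and the second does
not. Replacing the soft breaks before the text is flattened is one way to
get clean strings regardless of the wrapping option in effect.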