From: Abhishek Ulayil
Newsgroups: gmane.text.pandoc
Subject: How to read/parse metadata in a custom reader
Date: Fri, 17 Feb 2023 07:18:50 -0800 (PST)
Message-ID: <33b147cf-936b-419d-9ca0-417e2f9e36edn@googlegroups.com>
To: pandoc-discuss

Hi everyone, hope you are all doing well.

I am currently exploring lpeg and custom readers in pandoc. With the documentation available online, I was able to get a basic understanding of the mechanics of custom readers, grammars, and capture groups.

However, I am confused about how to read data so that it ends up in the metadata block of the pandoc AST. Here is some example code in which I try to read an abstract written with a LaTeX-style command.
```
local P, S, R, Cf, Cc, Ct, V, Cs, Cg, Cb, B, C, Cmt =
  lpeg.P, lpeg.S, lpeg.R, lpeg.Cf, lpeg.Cc, lpeg.Ct, lpeg.V,
  lpeg.Cs, lpeg.Cg, lpeg.Cb, lpeg.B, lpeg.C, lpeg.Cmt

local whitespacechar = S(" \t\r\n")
local specialchar = S("/*~[]\\{}|")
local wordchar = (1 - (whitespacechar + specialchar))
local spacechar = S(" \t")
local newline = P"\r"^-1 * P"\n"
local blankline = spacechar^0 * newline
local blanklines = newline * (spacechar^0 * newline)^1
local endline = newline - blanklines

local function trim(s)
  return (s:gsub("^%s*(.-)%s*$", "%1"))
end

-- Grammar
G = P{ "Pandoc",
  Pandoc = Ct(V"Block"^0) / pandoc.Pandoc;
  Block = blanklines^0 * V"Para" ;
  Para = Ct(V"Inline"^1) / pandoc.Para;
  Meta = V"MetaList" / pandoc.MetaBlocks;
  MetaList = Ct(V"Abstract"^0) / pandoc.MetaList;
  Inline = V"Str" + V"Space" + V"SoftBreak" ;
  Abstract = P"\\abstract{"
             * C((1-S"}")^1)
             * P"}"
             / pandoc.Str;
  Str = wordchar^1 / pandoc.Str;
  Space = spacechar^1 / pandoc.Space;
  SoftBreak = endline / pandoc.SoftBreak;
}

function Reader(input)
  return lpeg.match(G, tostring(input))
end
```

pandoc command:

```
pandoc -f reader.lua -t native -s
\abstract{
hello
}
^d
```

But I am unable to add the abstract group to the metadata. I would also like to know how to add custom labels to the MetaInlines.

It might sound a bit like reinventing the wheel, but I am curious to explore implementing custom readers.

Thanks in advance,
Best wishes.
Abhishek U.
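One possible direction for the metadata question, offered as a minimal sketch rather than a definitive answer. In the grammar above, the `Meta` rule is never reached from the start rule, and `pandoc.Pandoc` only receives a list of blocks. The sketch below instead stores the abstract in a named capture group (`Cg`) inside the top-level `Ct`, then builds the metadata table by hand; the key used in that table (`abstract` here) is the label that shows up in the output. The names `Doc`, `Item`, and `make_doc` are hypothetical, and the sketch assumes pandoc 2.17 or later, where `pandoc.Inlines` splits a plain string into `Str` and `Space` elements.

```lua
-- Sketch of a custom reader that routes \abstract{...} into metadata.
-- Assumes it runs inside pandoc, which provides the `lpeg` and `pandoc` globals.
local P, S, C, Cg, Ct, V = lpeg.P, lpeg.S, lpeg.C, lpeg.Cg, lpeg.Ct, lpeg.V

local whitespacechar = S(" \t\r\n")
local specialchar = S("/*~[]\\{}|")
local wordchar = 1 - (whitespacechar + specialchar)
local spacechar = S(" \t")
local newline = P"\r"^-1 * P"\n"
local blanklines = newline * (spacechar^0 * newline)^1
local endline = newline - blanklines

local function trim(s)
  return (s:gsub("^%s*(.-)%s*$", "%1"))
end

-- Hypothetical helper: `t` holds the parsed blocks at integer keys and,
-- if an abstract was seen, its trimmed text under the key "abstract".
local function make_doc(t)
  local meta = {}
  if t.abstract then
    -- The table key is the metadata label; any other name works the same way.
    meta.abstract = pandoc.MetaInlines(pandoc.Inlines(t.abstract))
  end
  local blocks = {}
  for i, block in ipairs(t) do
    blocks[i] = block
  end
  return pandoc.Pandoc(blocks, meta)
end

local G = P{ "Doc",
  -- Require the whole input to be consumed, so parse failures are visible.
  Doc       = Ct(V"Item"^0) * whitespacechar^0 * -1 / make_doc;
  Item      = whitespacechar^0 * (V"Abstract" + V"Para");
  -- The named group lands in the enclosing Ct table as t.abstract.
  Abstract  = P"\\abstract{" * Cg(C((1 - P"}")^0) / trim, "abstract") * P"}";
  Para      = Ct(V"Inline"^1) / pandoc.Para;
  Inline    = V"Str" + V"Space" + V"SoftBreak";
  Str       = wordchar^1 / pandoc.Str;
  Space     = spacechar^1 / pandoc.Space;
  SoftBreak = endline / pandoc.SoftBreak;
}

function Reader(input)
  local doc = lpeg.match(G, tostring(input))
  if not doc then
    error("input could not be parsed by this sketch grammar")
  end
  return doc
end
```

With this in `reader.lua`, the same `pandoc -f reader.lua -t native -s` invocation should report the abstract under the document's metadata rather than in the body. Further commands could be handled the same way, each with its own `Cg` name and its own key in `meta`.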