From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.io/gmane.text.pandoc/29989 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: =?UTF-8?B?0JjQs9C+0YDRjCDQn9Cw0YjQtdCy?= Newsgroups: gmane.text.pandoc Subject: Re: LaTeX: parse thebibliography (patch) Date: Sun, 16 Jan 2022 10:59:39 -0800 (PST) Message-ID: <8296a3c5-bd71-4b2c-8498-11903d7a0194n@googlegroups.com> References: <415779ca-0946-47b5-b15e-a82c2d99d168n@googlegroups.com> Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_750_2053146350.1642359579870" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="5473"; mail-complaints-to="usenet@ciao.gmane.io" To: pandoc-discuss Original-X-From: pandoc-discuss+bncBCEZVDHV5ILBBHGWSGHQMGQEWYGDRLI-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Sun Jan 16 19:59:43 2022 Return-path: Envelope-to: gtp-pandoc-discuss@m.gmane-mx.org Original-Received: from mail-oi1-f184.google.com ([209.85.167.184]) by ciao.gmane.io with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1n9Akh-0001GE-6i for gtp-pandoc-discuss@m.gmane-mx.org; Sun, 16 Jan 2022 19:59:43 +0100 Original-Received: by mail-oi1-f184.google.com with SMTP id v204-20020acaded5000000b002c896f409c4sf9925472oig.6 for ; Sun, 16 Jan 2022 10:59:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlegroups.com; s=20210112; h=sender:date:from:to:message-id:in-reply-to:references:subject :mime-version:x-original-sender:reply-to:precedence:mailing-list :list-id:list-post:list-help:list-archive:list-subscribe :list-unsubscribe; bh=VZJVrzok3pZKIsjG2wsxHPPCgHjJyEE8+kuOm6RzRuA=; b=UcWi7FqYYQCLKxee710hZ4n6/q6l5aykrE+cHV6Hb3n7De+Ayta66s7qlHzujCErku ZrCHgReV52lPTdtTmakx0U4TWFCv6eO4ZlrgUfYDqIUoC67bHFHiwd60n8GobrpwCRIN in23UMMyruYqRFb5l3T0u8fT5w0ZLEUkgdzqDbqQiPqrkPl9DU9P/CnEaGf6mqlajXHO 
P/yM/pymw2waV04SuXBDOMUNZSgdhbCy2RXYm/RDyC9gj08o9aFp/1gSGqGAUoXJjXUZ ndNIar19GstLiKKP0wpUqMTMA+mPHUchKq4o6T2bSfEyetYrhNwLE4H+xGWRd3HhpI33 xdKw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=date:from:to:message-id:in-reply-to:references:subject:mime-version :x-original-sender:reply-to:precedence:mailing-list:list-id :list-post:list-help:list-archive:list-subscribe:list-unsubscribe; bh=VZJVrzok3pZKIsjG2wsxHPPCgHjJyEE8+kuOm6RzRuA=; b=T/dPjZLiX+eI47lAsSX8NVn6XXwH5KkrgUXP1Al0xa08p8ZMFTA+9zYoeIkYBtJlmB +8QsM5Zcrzuj53juafI92ETip8X+SNK8BXzTYArAk9Fgo0DKY3W1JJcwnrDRIFkAOsfd W5oiGwz/mfTuZeplCK7YpGk9WUSKdacOnEeXfHmJjiECKTpoWWMjlEJVZboqoeImWI+D STHld5QsaZ9VXd5qccebAhBfS5pLvi6gKxo5EUg470YBlBORtezV8Q2md2MMvDA1+1Vm al9iUJaJLOaer7PDNP7mrncF8iqAMrRL47Bo1OqbfMh3OubUOAKBGfwn97YQiuRH8KKZ 4Bkw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=sender:x-gm-message-state:date:from:to:message-id:in-reply-to :references:subject:mime-version:x-original-sender:reply-to :precedence:mailing-list:list-id:x-spam-checked-in-group:list-post :list-help:list-archive:list-subscribe:list-unsubscribe; bh=VZJVrzok3pZKIsjG2wsxHPPCgHjJyEE8+kuOm6RzRuA=; b=Z/XnPIR58ieqOO3FpP0JNzUCbLDTP/LDDbR9iv++3iY869Ea1/dljB+MHRQ94w0qvr 4Gax+hh3c5tqjTWK++tOjzb05v4XfnMe+WXHe9i5D/udGgMQ8lIumgj10NWmaepaBgZP d5pV6PL9r9QsaD/dBfmhZiTi+pKdT87VZEdsvBK2Q4Owbcfm3fnzYQwDWSH1p4ljaC8m bExKb3D0S3FjOTJ+R2L9zFq5YYQuf4PaaS0QHtxnVex8Op4ZKxjc4Yha94LQ2vBH1GJc 3qvNFD7V0ba+a0sKOO0x0kqluQanJ9IphjJ/mfxcMT8WKBGp9f0RTrSHIv7f4ZI43d/E RLSQ== Original-Sender: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org X-Gm-Message-State: AOAM533Ky7WVtgfNawvpQ6bYoraGT39RuSQVSpaVf5jguUkLwUqBQc3F 8EWW0GIYsioG2OeQFY4t3wU= X-Google-Smtp-Source: ABdhPJw6N1JIpANJpI+43A6uFtR1VAvxJelINayARJ8KY7ySgTNahAFQeqmP8ZkZ49t2j+aPk4v+DA== X-Received: by 2002:a4a:dfcc:: with SMTP id p12mr12653780ood.4.1642359582060; Sun, 16 Jan 2022 10:59:42 -0800 (PST) X-BeenThere: 
pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Original-Received: by 2002:a05:6830:43a0:: with SMTP id s32ls901355otv.6.gmail; Sun, 16 Jan 2022 10:59:40 -0800 (PST)
X-Received: by 2002:a05:6830:2424:: with SMTP id k4mr11899922ots.241.1642359580527; Sun, 16 Jan 2022 10:59:40 -0800 (PST)
In-Reply-To: <415779ca-0946-47b5-b15e-a82c2d99d168n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
X-Original-Sender: pashev.igor-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
Precedence: list
Mailing-list: list pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org; contact pandoc-discuss+owners-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
List-ID:
X-Google-Group-Id: 1007024079513
List-Post: ,
List-Help: ,
List-Archive: ,
List-Unsubscribe: ,
Xref: news.gmane.io gmane.text.pandoc:29989
Archived-At:

------=_Part_750_2053146350.1642359579870
Content-Type: multipart/alternative; boundary="----=_Part_751_1685195138.1642359579870"

------=_Part_751_1685195138.1642359579870
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Here is an updated version which produces an ordered list instead of a definition list when possible. Ordered lists are easier to render in other formats.

On Thursday, 13 January 2022 at 19:58:07 UTC+2, Игорь Пашев wrote:

> Here is a patch which makes Pandoc parse the bibliography environment into a definition list. The patch includes a test showing the result.
>
> I needed it for myself for self-contained LaTeX documents and hope somebody may find it useful too.
>

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/8296a3c5-bd71-4b2c-8498-11903d7a0194n%40googlegroups.com.

------=_Part_751_1685195138.1642359579870
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Here is an updated version which produces an ordered list instead of a definition list when possible.
Ordered lists are easier to render in other formats.
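
To illustrate what "when possible" means here, below is a minimal standalone sketch of the rule. It is not the code from the patch: `BibItem`, `BibList`, `labelItems` and `toBibList` are hypothetical names made up for this illustration, and plain strings stand in for Pandoc's `Inlines` and `Blocks`. The assumption is that an ordered list is used only when no `\bibitem` carries an explicit `[label]`; otherwise every entry needs a visible label, so a definition list is used and unlabelled entries get consecutive numbers.

```haskell
module Main where

import Data.Maybe (isNothing)

-- Hypothetical stand-in for one \bibitem[label]{key} entry.
data BibItem = BibItem
  { itemLabel :: Maybe String -- optional label given in square brackets
  , itemKey   :: String       -- citation key given in braces
  , itemBody  :: String       -- text of the entry
  } deriving (Eq, Show)

-- Hypothetical stand-in for the two possible results.
data BibList
  = Ordered [(String, String)]               -- (key, body)
  | Definitions [(String, (String, String))] -- (label, (key, body))
  deriving (Eq, Show)

-- Keep explicit labels and number the unlabelled entries 1, 2, 3, ...
-- Only unlabelled entries advance the counter.
labelItems :: [BibItem] -> [(String, (String, String))]
labelItems = go (1 :: Int)
  where
    go _ [] = []
    go n (BibItem (Just l) k b : rest) = (l, (k, b)) : go n rest
    go n (BibItem Nothing k b : rest) = (show n, (k, b)) : go (n + 1) rest

-- Ordered list if every entry is unlabelled, definition list otherwise.
toBibList :: [BibItem] -> BibList
toBibList items
  | all (isNothing . itemLabel) items =
      Ordered [(itemKey i, itemBody i) | i <- items]
  | otherwise = Definitions (labelItems items)

main :: IO ()
main = do
  -- All entries unlabelled: ordered list.
  print (toBibList [ BibItem Nothing "two" "The Second."
                   , BibItem Nothing "four" "The Fourth." ])
  -- One explicit label: definition list, the unlabelled entry becomes "1".
  print (toBibList [ BibItem (Just "One1990") "one" "The First."
                   , BibItem Nothing "two" "The Second." ])
```

The two calls in `main` mirror the two `thebibliography` environments in the test attached to the patch, where the second entry of the mixed list ends up with the label `1`.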

On Thursday, 13 January 2022 at 19:58:07 UTC+2, Игорь Пашев wrote:
> Here is a patch which makes Pandoc parse the bibliography environment into a definition list. The patch includes a test showing the result.
>
> I needed it for myself for self-contained LaTeX documents and hope somebody may find it useful too.

------=_Part_751_1685195138.1642359579870--

------=_Part_750_2053146350.1642359579870
Content-Type: text/x-patch; charset=US-ASCII; name=thebibliography-dl-ol.patch
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename=thebibliography-dl-ol.patch
X-Attachment-Id: 7ce24ecb-fffb-4f3d-8b71-a57b95cc2378
Content-ID: <7ce24ecb-fffb-4f3d-8b71-a57b95cc2378>

>From f62f8b7ef2f1c2357fbd41f5226fe433e632e042 Mon Sep 17 00:00:00 2001
From: Igor Pashev
Date: Thu, 13 Jan 2022 19:11:50 +0200
Subject: LaTeX: parse thebibliography

---
 src/Text/Pandoc/Readers/LaTeX.hs         | 38 +++++++++++++++++++++++++
 src/Text/Pandoc/Readers/LaTeX/Parsing.hs |  2 ++
 test/command/latex-thebibliography.md    | 49 ++++++++++++++++++++++++++++++++
 3 files changed, 89 insertions(+)
 create mode 100644 test/command/latex-thebibliography.md

diff --git a/src/Text/Pandoc/Readers/LaTeX.hs b/src/Text/Pandoc/Readers/LaTeX.hs
index 20a2db76b..37fa4adf0 100644
--- a/src/Text/Pandoc/Readers/LaTeX.hs
+++ b/src/Text/Pandoc/Readers/LaTeX.hs
@@ -741,6 +741,14 @@ looseItem = do
   skipopts
   return mempty
 
+looseBibItem :: PandocMonad m => LP m Blocks
+looseBibItem = do
+  inListItem <- sInListItem <$> getState
+  guard $ not inListItem
+  skipopts
+  void braced
+  return mempty
+
 epigraph :: PandocMonad m => LP m Blocks
 epigraph = do
   p1 <- grouped block
@@ -886,6 +894,7 @@ blockCommands = M.fromList
    , ("strut", pure mempty)
    , ("rule", rule)
    , ("item", looseItem)
+   , ("bibitem", looseBibItem)
    , ("documentclass", skipopts *> braced *> preamble)
    , ("centerline", para . trimInlines <$> (skipopts *> tok))
    , ("caption", mempty <$ setCaption inline)
@@ -975,6 +984,7 @@ environments = M.union (tableEnvironments blocks inline) $
    , ("togglefalse", braced >>= setToggle False)
    , ("iftoggle", try $ ifToggle >> block)
    , ("CSLReferences", braced >> braced >> env "CSLReferences" blocks)
+   , ("thebibliography", theBibliography)
    ]
 
 filecontents :: PandocMonad m => LP m Blocks
@@ -1211,6 +1221,34 @@ descItem = do
   bs <- blocks
   return (ils, [bs])
 
+bibItem :: PandocMonad m => LP m (Inlines, [Blocks])
+bibItem = do
+  blocks
+  controlSeq "bibitem"
+  sp
+  lbl <- opt <|> nextNum
+  cite_key <- untokenize <$> braced
+  bs <- blocks
+  return (lbl, [divWith (cite_key, [], []) bs])
+  where
+    nextNum = do
+      st <- getState
+      let n = sTheBibItemNum st + 1
+      setState st {sTheBibItemNum = n}
+      return . str . T.pack . show $ n
+
+theBibliography :: PandocMonad m => LP m Blocks
+theBibliography = do
+  updateState $ \st -> st {sTheBibItemNum = 0}
+  items <- listenv "thebibliography" (many bibItem)
+  is_ol <- (== length items) . sTheBibItemNum <$> getState
+  return $
+    divWith
+      ("", ["thebibliography"], [])
+      (if is_ol
+         then orderedListWith (1, Decimal, Period) $ map (head . snd) items
+         else definitionList items)
+
 listenv :: PandocMonad m => Text -> LP m a -> LP m a
 listenv name p = try $ do
   oldInListItem <- sInListItem `fmap` getState
diff --git a/src/Text/Pandoc/Readers/LaTeX/Parsing.hs b/src/Text/Pandoc/Readers/LaTeX/Parsing.hs
index 9eb4a0cbc..8fb6bd5bc 100644
--- a/src/Text/Pandoc/Readers/LaTeX/Parsing.hs
+++ b/src/Text/Pandoc/Readers/LaTeX/Parsing.hs
@@ -172,6 +172,7 @@ data LaTeXState = LaTeXState{ sOptions :: ReaderOptions
                             , sFileContents :: M.Map Text Text
                             , sEnableWithRaw :: Bool
                             , sRawTokens :: IntMap.IntMap [Tok]
+                            , sTheBibItemNum :: Int
                             }
      deriving Show
 
@@ -199,6 +200,7 @@ defaultLaTeXState = LaTeXState{ sOptions = def
                               , sFileContents = M.empty
                               , sEnableWithRaw = True
                               , sRawTokens = IntMap.empty
+                              , sTheBibItemNum = 0
                               }
 
 instance PandocMonad m => HasQuoteContext LaTeXState m where
diff --git a/test/command/latex-thebibliography.md b/test/command/latex-thebibliography.md
new file mode 100644
index 000000000..54b257c61
--- /dev/null
+++ b/test/command/latex-thebibliography.md
@@ -0,0 +1,49 @@
+# The bibliography
+
+```
+% pandoc -f latex -t native
+\begin{thebibliography}{10}
+  \bibitem{two} The Second.
+  \bibitem{four} The Fourth.
+\end{thebibliography}
+\begin{thebibliography}{100}
+  \bibitem[One1990]{one} The First.
+  \bibitem{two} The Second.
+\end{thebibliography}
+^D
+[ Div
+    ( "" , [ "thebibliography" ] , [] )
+    [ OrderedList
+        ( 1 , Decimal , Period )
+        [ [ Div
+              ( "two" , [] , [] )
+              [ Para [ Str "The" , Space , Str "Second." ] ]
+          ]
+        , [ Div
+              ( "four" , [] , [] )
+              [ Para [ Str "The" , Space , Str "Fourth." ] ]
+          ]
+        ]
+    ]
+, Div
+    ( "" , [ "thebibliography" ] , [] )
+    [ DefinitionList
+        [ ( [ Str "One1990" ]
+          , [ [ Div
+                  ( "one" , [] , [] )
+                  [ Para [ Str "The" , Space , Str "First." ] ]
+              ]
+            ]
+          )
+        , ( [ Str "1" ]
+          , [ [ Div
+                  ( "two" , [] , [] )
+                  [ Para [ Str "The" , Space , Str "Second." ] ]
+              ]
+            ]
+          )
+        ]
+    ]
+]
+```
+