From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.text.pandoc/29965
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Igor Pashev
Newsgroups: gmane.text.pandoc
Subject: LaTeX: parse thebibliography (patch)
Date: Thu, 13 Jan 2022 09:58:06 -0800 (PST)
Message-ID: <415779ca-0946-47b5-b15e-a82c2d99d168n@googlegroups.com>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_Part_726_547548386.1642096686988"
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="1938"; mail-complaints-to="usenet@ciao.gmane.io"
To: pandoc-discuss
Original-X-From: pandoc-discuss+bncBCEZVDHV5ILBBMOQQGHQMGQEPPD2TDQ-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Thu Jan 13 18:58:13 2022
Envelope-to: gtp-pandoc-discuss@m.gmane-mx.org
Original-Received: from mail-ot1-f60.google.com ([209.85.210.60]) by ciao.gmane.io with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1n84MV-0000HB-KI for gtp-pandoc-discuss@m.gmane-mx.org; Thu, 13 Jan 2022 18:58:11 +0100
Original-Received: by mail-ot1-f60.google.com with SMTP id e23-20020a9d6e17000000b0059098af9184sf1011465otr.20 for ; Thu, 13 Jan 2022 09:58:11 -0800 (PST)
Original-Sender: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
X-Received: by 2002:a05:6830:2366:: with SMTP id r6mr3920443oth.376.1642096690424; Thu, 13 Jan 2022 09:58:10 -0800 (PST)
X-BeenThere: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Original-Received: by 2002:a05:6808:171b:: with SMTP id bc27ls997325oib.8.gmail; Thu, 13 Jan 2022 09:58:09 -0800 (PST)
X-Received: by 2002:a54:440b:: with SMTP id k11mr7092792oiw.47.1642096687596; Thu, 13 Jan 2022 09:58:07 -0800 (PST)
X-Original-Sender: pashev.igor-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
Precedence: list
Mailing-list: list pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org; contact pandoc-discuss+owners-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
X-Google-Group-Id: 1007024079513
Xref: news.gmane.io gmane.text.pandoc:29965

------=_Part_726_547548386.1642096686988
Content-Type: multipart/alternative; boundary="----=_Part_727_10815124.1642096686988"

------=_Part_727_10815124.1642096686988
Content-Type: text/plain; charset="UTF-8"

Here is a patch that makes Pandoc parse the LaTeX thebibliography environment into a definition list. The patch includes a test showing the result.

I needed it for my own self-contained LaTeX documents and hope somebody else may find it useful too.

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/415779ca-0946-47b5-b15e-a82c2d99d168n%40googlegroups.com.

------=_Part_727_10815124.1642096686988
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Here is a patch that makes Pandoc parse the LaTeX thebibliography environment into a definition list. The patch includes a test showing the result.

I needed it for my own self-contained LaTeX documents and hope somebody else may find it useful too.
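
The labelling rule that the test in the patch exhibits can be sketched in plain Haskell, without Pandoc's parser types. This is only an illustration: `BibItem`, `labelItems` and `sample` are hypothetical names that do not appear in the patch, and the real code keeps its counter in the parser state rather than threading it through `mapAccumL`.

```haskell
import Data.List (mapAccumL)

-- Hypothetical stand-in for one \bibitem: optional label, cite key, body.
data BibItem = BibItem
  { itemLabel :: Maybe String
  , itemKey   :: String
  , itemBody  :: String
  } deriving (Show, Eq)

-- Pair each item with the term it would get in the definition list.
-- An explicit [label] is used as is; an unlabelled item takes the next
-- number from a counter that advances only on unlabelled items.
labelItems :: [BibItem] -> [(String, BibItem)]
labelItems = snd . mapAccumL step 0
  where
    step :: Int -> BibItem -> (Int, (String, BibItem))
    step n item = case itemLabel item of
      Just lbl -> (n, (lbl, item))
      Nothing  -> let n' = n + 1 in (n', (show n', item))

-- The four entries used in the patch's test file.
sample :: [BibItem]
sample =
  [ BibItem (Just "One1990")    "one"   "The First."
  , BibItem Nothing             "two"   "The Second."
  , BibItem (Just "Three 1998") "three" "The Third."
  , BibItem Nothing             "four"  "The Fourth."
  ]
```

Here `map fst (labelItems sample)` gives `["One1990","1","Three 1998","2"]`, the same terms as in the expected native output: the fourth entry is numbered 2, not 4, because labelled entries do not consume a number.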

------=_Part_727_10815124.1642096686988--

------=_Part_726_547548386.1642096686988
Content-Type: text/x-patch; charset=US-ASCII; name=thebibliography.patch
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename=thebibliography.patch
X-Attachment-Id: 86d9d76c-a7c7-472f-a497-b84b3bf03d5d
Content-ID: <86d9d76c-a7c7-472f-a497-b84b3bf03d5d>

>From 12cce1758ff82cced76fa1961076f99176ea2689 Mon Sep 17 00:00:00 2001
From: Igor Pashev
Date: Thu, 13 Jan 2022 19:11:50 +0200
Subject: LaTeX: parse thebibliography

---
 src/Text/Pandoc/Readers/LaTeX.hs         | 32 +++++++++++++++++++++
 src/Text/Pandoc/Readers/LaTeX/Parsing.hs |  2 ++
 test/command/latex-thebibliography.md    | 48 ++++++++++++++++++++++++++++++++
 3 files changed, 82 insertions(+)
 create mode 100644 test/command/latex-thebibliography.md

diff --git a/src/Text/Pandoc/Readers/LaTeX.hs b/src/Text/Pandoc/Readers/LaTeX.hs
index 20a2db76b..e4a3aaa58 100644
--- a/src/Text/Pandoc/Readers/LaTeX.hs
+++ b/src/Text/Pandoc/Readers/LaTeX.hs
@@ -741,6 +741,14 @@ looseItem = do
   skipopts
   return mempty
 
+looseBibItem :: PandocMonad m => LP m Blocks
+looseBibItem = do
+  inListItem <- sInListItem <$> getState
+  guard $ not inListItem
+  skipopts
+  void braced
+  return mempty
+
 epigraph :: PandocMonad m => LP m Blocks
 epigraph = do
   p1 <- grouped block
@@ -886,6 +894,7 @@ blockCommands = M.fromList
    , ("strut", pure mempty)
    , ("rule", rule)
    , ("item", looseItem)
+   , ("bibitem", looseBibItem)
    , ("documentclass", skipopts *> braced *> preamble)
    , ("centerline", para . trimInlines <$> (skipopts *> tok))
    , ("caption", mempty <$ setCaption inline)
@@ -975,6 +984,7 @@ environments = M.union (tableEnvironments blocks inline) $
    , ("togglefalse", braced >>= setToggle False)
    , ("iftoggle", try $ ifToggle >> block)
    , ("CSLReferences", braced >> braced >> env "CSLReferences" blocks)
+   , ("thebibliography", theBibliography)
    ]
 
 filecontents :: PandocMonad m => LP m Blocks
@@ -1211,6 +1221,28 @@ descItem = do
   bs <- blocks
   return (ils, [bs])
 
+bibItem :: PandocMonad m => LP m (Inlines, [Blocks])
+bibItem = do
+  blocks
+  controlSeq "bibitem"
+  sp
+  lbl <- opt <|> nextCite
+  cite_key <- untokenize <$> braced
+  bs <- blocks
+  return (lbl, [divWith (cite_key, [], []) bs])
+  where
+    nextCite = do
+      st <- getState
+      let n = sTheBibItemNum st + 1
+      setState st {sTheBibItemNum = n}
+      return . singleton . Str . T.pack . show $ n
+
+theBibliography :: PandocMonad m => LP m Blocks
+theBibliography =
+  divWith ("", ["thebibliography"], []) . definitionList <$>
+  listenv "thebibliography" (many bibItem)
+
+
 listenv :: PandocMonad m => Text -> LP m a -> LP m a
 listenv name p = try $ do
   oldInListItem <- sInListItem `fmap` getState
diff --git a/src/Text/Pandoc/Readers/LaTeX/Parsing.hs b/src/Text/Pandoc/Readers/LaTeX/Parsing.hs
index 9eb4a0cbc..8fb6bd5bc 100644
--- a/src/Text/Pandoc/Readers/LaTeX/Parsing.hs
+++ b/src/Text/Pandoc/Readers/LaTeX/Parsing.hs
@@ -172,6 +172,7 @@ data LaTeXState = LaTeXState{ sOptions       :: ReaderOptions
                             , sFileContents  :: M.Map Text Text
                             , sEnableWithRaw :: Bool
                             , sRawTokens     :: IntMap.IntMap [Tok]
+                            , sTheBibItemNum :: Int
                             }
      deriving Show
 
@@ -199,6 +200,7 @@ defaultLaTeXState = LaTeXState{ sOptions       = def
                               , sFileContents  = M.empty
                               , sEnableWithRaw = True
                               , sRawTokens     = IntMap.empty
+                              , sTheBibItemNum = 0
                               }
 
 instance PandocMonad m => HasQuoteContext LaTeXState m where
diff --git a/test/command/latex-thebibliography.md b/test/command/latex-thebibliography.md
new file mode 100644
index 000000000..153fbfc13
--- /dev/null
+++ b/test/command/latex-thebibliography.md
@@ -0,0 +1,48 @@
+# The bibliography
+
+```
+% pandoc -f latex -t native
+\begin{thebibliography}{100}
+  \bibitem[One1990]{one} The First.
+  \bibitem{two} The Second.
+  \bibitem[Three 1998]{three} The Third.
+  \bibitem{four} The Fourth.
+\end{thebibliography}
+^D
+[ Div
+    ( "" , [ "thebibliography" ] , [] )
+    [ DefinitionList
+        [ ( [ Str "One1990" ]
+          , [ [ Div
+                  ( "one" , [] , [] )
+                  [ Para [ Str "The" , Space , Str "First." ] ]
+              ]
+            ]
+          )
+        , ( [ Str "1" ]
+          , [ [ Div
+                  ( "two" , [] , [] )
+                  [ Para [ Str "The" , Space , Str "Second." ] ]
+              ]
+            ]
+          )
+        , ( [ Str "Three" , Space , Str "1998" ]
+          , [ [ Div
+                  ( "three" , [] , [] )
+                  [ Para [ Str "The" , Space , Str "Third." ] ]
+              ]
+            ]
+          )
+        , ( [ Str "2" ]
+          , [ [ Div
+                  ( "four" , [] , [] )
+                  [ Para [ Str "The" , Space , Str "Fourth." ] ]
+              ]
+            ]
+          )
+        ]
+    ]
+]
+
+```
+