Scripting with Haskell to achieve a filter for acronyms in pandoc

public inbox archive for pandoc-discuss@googlegroups.com

* Scripting with Haskell to achieve a filter for acronyms in pandoc
@ 2016-09-09  0:43 Luis Fernado Silva Castro de Araújo
       [not found] ` <6c9b4eec-0ea9-46bf-8da3-51998b63c902-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Luis Fernado Silva Castro de Araújo @ 2016-09-09  0:43 UTC (permalink / raw)
  To: pandoc-discuss



I found this topic 
<https://groups.google.com/d/topic/pandoc-discuss/yNpCZes5MZY/discussion> 
in which the user @Chris Lewis tries to write a script that implements a 
syntax for the LaTeX acronym package.


The user @fiddlosopher suggested an example of code:

#!/usr/bin/env runhaskell
-- acronym.hs
import Text.Pandoc
import Data.IORef

main :: IO ()
main = do
  usedRef <- newIORef False
  toJsonFilter $ acronym usedRef

acronym :: IORef Bool -> Inline -> IO Inline
acronym usedRef (Link [Str abbrev] ("acro:", expansion)) = do
  used <- readIORef usedRef
  if used
     then return $ Str abbrev
     else do
       writeIORef usedRef True
       return $ Str $ abbrev ++ " (" ++ expansion ++ ")"
acronym _ x = return x

But how can I test it? I added --filter acronym.hs \ to the pandoc command 
in my Makefile, but I got this error:

pandoc: Error running filter acronym.hs
fd:4: hPutBuf: resource vanished (Broken pipe)
make: *** [Makefile:28: pdf] Error 83

How should I go about creating a filter for pandoc?

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/6c9b4eec-0ea9-46bf-8da3-51998b63c902%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found] ` <6c9b4eec-0ea9-46bf-8da3-51998b63c902-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-09-19 18:25   ` John MacFarlane
       [not found]     ` <20160919182539.GA9066-l/d5Ua9yGnxXsXJlQylH7w@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: John MacFarlane @ 2016-09-19 18:25 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

This won't work with recent versions of pandoc because
of changes in the Link type. Try

   acronym usedRef (Link [Str abbrev] ("acro:", expansion) _) = do

To debug problems with filters, try loading with ghci

    ghci acronym.hs


* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]     ` <20160919182539.GA9066-l/d5Ua9yGnxXsXJlQylH7w@public.gmane.org>
@ 2016-09-22  5:11       ` Luis Fernado Silva Castro de Araújo
       [not found]         ` <be83b525-f6d9-4536-8271-53d80b3ffd5b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Luis Fernado Silva Castro de Araújo @ 2016-09-22  5:11 UTC (permalink / raw)
  To: pandoc-discuss



Thank you John,


To recap, the aim is to create an acronym filter (in Haskell) to run with 
pandoc. The filter would reuse the existing link syntax so that 
[LRU](acro: "Least Recently Used") represents the definition of an acronym.

On the LaTeX side, supporting an acronym would involve adding 
\usepackage[acronym,smallcaps]{glossaries} to the preamble, 
\newacronym{LRU}{LRU}{Least Recently Used} to define LRU, and finally 
\gls{LRU} each time the term is used in the text.


For now I am trying to define the syntax in Haskell:

#!/usr/bin/env runhaskell
-- acronym.hs
import Text.Pandoc
import Data.IORef

main :: IO ()
main = do
  usedRef <- newIORef False
  toJsonFilter $ acronym usedRef

acronym :: IORef Bool -> Inline -> IO Inline
acronym usedRef (Link [Str abbrev] ("acro:", expansion) _) = do
  used <- readIORef usedRef
  if used
     then return $ Str abbrev
     else do
       writeIORef usedRef True
       return $ Str $ abbrev ++ " (" ++ expansion ++ ")"
acronym _ x = return x


But I am getting:


acronym.hs:12:23: error:
    • Couldn't match type ‘[Inline]’
                     with ‘(String, [String], [(String, String)])’
      Expected type: Attr
        Actual type: [Inline]
    • In the pattern: [Str abbrev]
      In the pattern: Link [Str abbrev] ("acro:", expansion) _
      In an equation for ‘acronym’:
          acronym usedRef (Link [Str abbrev] ("acro:", expansion) _)
            = do { used <- readIORef usedRef;
                   if used then return $ Str abbrev else do { ... } }

acronym.hs:12:36: error:
    • Couldn't match expected type ‘[Inline]’
                  with actual type ‘([Char], [Char])’
    • In the pattern: ("acro:", expansion)
      In the pattern: Link [Str abbrev] ("acro:", expansion) _
      In an equation for ‘acronym’:
          acronym usedRef (Link [Str abbrev] ("acro:", expansion) _)
            = do { used <- readIORef usedRef;
                   if used then return $ Str abbrev else do { ... } }
Failed, modules loaded: none.


Can you point me to a reference that I can read in order to solve this problem? Many thanks.


* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]         ` <be83b525-f6d9-4536-8271-53d80b3ffd5b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-09-23  8:35           ` John MacFarlane
       [not found]             ` <20160923083514.GI86115-BKjuZOBx5Kn2N3qrpRCZGbhGAdq7xJNKhPhL2mjWHbk@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: John MacFarlane @ 2016-09-23  8:35 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

I think I just gave you bad guidance.  The attributes
field is first, not last. Instead of

acronym usedRef (Link [Str abbrev] ("acro:", expansion) _) =

try

acronym usedRef (Link _ [Str abbrev] ("acro:", expansion)) =
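
For anyone following along in Python rather than Haskell, the same field
order shows up in pandoc's JSON form of a link once the Link type carries
attributes: attributes first, then the link text, then the target. A minimal
sketch, assuming that encoding; `link_parts` is a hypothetical helper name,
and the sample node is written by hand rather than produced by pandoc:

```python
# Sketch only: assumes the JSON encoding used once Link gained an attribute
# field, i.e. {"t": "Link", "c": [attr, inlines, [url, title]]}.

def link_parts(node):
    """Split a Link node into (attr, text, url, title).

    `link_parts` is a hypothetical helper, not part of any pandoc library.
    """
    if node.get("t") != "Link":
        raise ValueError("not a Link node")
    attr, inlines, (url, title) = node["c"]
    # Join the plain-string pieces of the link text; Space nodes become " ".
    text = "".join(
        " " if inline["t"] == "Space" else inline.get("c", "")
        for inline in inlines
        if inline["t"] in ("Str", "Space")
    )
    return attr, text, url, title


# Hand-written stand-in for [LRU](acro: "Least Recently Used").
sample = {
    "t": "Link",
    "c": [
        ["", [], []],                      # attr: identifier, classes, key-value pairs
        [{"t": "Str", "c": "LRU"}],        # link text
        ["acro:", "Least Recently Used"],  # target: url, title
    ],
}

attr, text, url, title = link_parts(sample)
print(text, url, title)
```

With this layout the abbreviation sits in the second field and the expansion
in the title part of the third, which matches the corrected Haskell pattern.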



* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]             ` <20160923083514.GI86115-BKjuZOBx5Kn2N3qrpRCZGbhGAdq7xJNKhPhL2mjWHbk@public.gmane.org>
@ 2016-12-27 21:34               ` Luis Fernado Silva Castro de Araújo
       [not found]                 ` <92666292-e593-4249-8ec2-ad37ceba79d2-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Luis Fernado Silva Castro de Araújo @ 2016-12-27 21:34 UTC (permalink / raw)
  To: pandoc-discuss



Sorry John,

I tried, but Haskell is too foreign to me. It would be simpler to go with a 
more familiar language.


   - The aim is to create an acronym filter (in Haskell) to run with 
   pandoc. The filter would reuse the existing link syntax so that 
   [LRU](acro: "Least Recently Used") represents the definition of an acronym.


   - On the LaTeX side, supporting an acronym would involve adding 
   \usepackage[acronym,smallcaps]{glossaries} to the preamble, 
   \newacronym{LRU}{LRU}{Least Recently Used} to define LRU, and finally 
   \gls{LRU} each time the term is used in the text.



 I have been adapting Pacrodoc, but I am stuck on the following code:


# Based on https://github.com/cflewis/Pacrodoc/blob/master/pacrodoc.py

import sys
import json
import re
import urllib

acronyms = {}

def processAcronym(linkData):
    # Links look like this:
    # [[{u'Str': u'Link Name'}], [u'Link URL', 'Link Title']]
    acronym = linkData[0][0]['Str']
    acronymText = linkData[1][0]

    # First we check if there is an acronym being defined
    if re.search('^acro:', linkData[1][0]):
        # An acronym is being defined, so strip off the acro:
        # prefix and unencode the text
        acronyms[acronym] = {'text': urllib.unquote(acronymText[5:]), 
'used': False}

        # Strip out this link
        return {'Str': ''}

    # Now we check if its referring to an acronym instead
    if not acronymText and acronym in acronyms:
        if not acronyms[acronym]['used']:
            acronyms[acronym]['used'] = True
            return {'Str': '%s (%s)' % (acronyms[acronym]['text'], acronym)}
        else:
            return {'Str': acronym}

    # It was just a normal link, so return it unchanged
    return {'Link': linkData}

def lookForAcronyms(jsonData):
    if isinstance(jsonData, list):
        return [lookForAcronyms(value) for value in jsonData]

    if isinstance(jsonData, dict):
        if 'Link' in jsonData:
            return processAcronym(jsonData['Link'])
        else:
            return {k: lookForAcronyms(v) for k, v in jsonData.items()}

    return jsonData

if __name__ == "__main__":
  toJSONFilter(lookForAcronyms)


And I get the error:

fd:4: hClose: resource vanished (Broken pipe)



I hope someone can shed some light on where I am going wrong, or perhaps 
give some pointers on how to do it with the more recent panflute framework.

Thanks 


* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                 ` <92666292-e593-4249-8ec2-ad37ceba79d2-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-12-28  2:37                   ` Sergio Correia
       [not found]                     ` <719e577c-e900-4951-afb2-0423e71ff601-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2016-12-28 16:01                   ` BP Jonsson
  1 sibling, 1 reply; 19+ messages in thread
From: Sergio Correia @ 2016-12-28  2:37 UTC (permalink / raw)
  To: pandoc-discuss



Chiming in:


   - That error happened to me when the code I was running had errors (so 
   the problem might be on the Python side).
   - A quick way to test the Python side is to run the Python program 
   without any arguments. That would at least ensure there are no 
   "compile-time" errors.
   - This also looks quite similar to what I did in one of the panflute 
   examples 
   <http://scorreia.com/software/panflute/guide.html#calling-external-programs>, 
   which might be useful in case you want a more complex filter.
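
One way to act on the second point is to run the filter script outside
pandoc and look at what it writes to stderr. The sketch below does that with
a deliberately broken script that, like the one posted earlier in this
thread, calls toJSONFilter without importing it. This is only one possible
cause of the broken pipe, not a confirmed diagnosis. `run_filter_script` is a
hypothetical helper, and the sample document is a hand-written stand-in for
the JSON pandoc would send.

```python
import json
import os
import subprocess
import sys
import tempfile

# A deliberately broken filter body: toJSONFilter is called but never
# imported or defined.
BROKEN_FILTER = '''
import sys, json

def lookForAcronyms(jsonData):
    return jsonData

toJSONFilter(lookForAcronyms)
'''

# Hand-written stand-in for the JSON pandoc would send on stdin. The exact
# top-level layout varies between pandoc versions; it does not matter here
# because the script fails before reading its input.
SAMPLE_DOC = json.dumps({"blocks": [], "meta": {}})


def run_filter_script(source, doc_json):
    """Run filter source as a script, feeding doc_json on stdin.

    Hypothetical helper: returns (exit code, stderr text).
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(source)
        path = handle.name
    try:
        result = subprocess.run(
            [sys.executable, path],
            input=doc_json,
            capture_output=True,
            text=True,
        )
    finally:
        os.remove(path)
    return result.returncode, result.stderr


code, errors = run_filter_script(BROKEN_FILTER, SAMPLE_DOC)
print("exit code:", code)
print(errors)
```

A non-zero exit code together with a traceback in the captured stderr points
at the Python side rather than at pandoc.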


Best,
Sergio


* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                     ` <719e577c-e900-4951-afb2-0423e71ff601-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-12-28  3:14                       ` Luis Fernado Silva Castro de Araújo
       [not found]                         ` <f979c153-7a62-41f9-a782-64532d1cee6b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Luis Fernado Silva Castro de Araújo @ 2016-12-28  3:14 UTC (permalink / raw)
  To: pandoc-discuss



Thanks for the comment, Sergio. Here is a first attempt at adapting it to 
panflute. Note that I am a total beginner in Python, but I am not looking 
for finished code; I just need pointers. 

Could you please point out where the problem arises? It gets stuck in a loop 
in the Python interactive shell, and stops with the error 'TypeError: 
lookForAcronyms() takes 1 positional argument but 2 were given' when I 
attempt to run it as a filter.


Here is the code:

# Based on https://github.com/cflewis/Pacrodoc/blob/master/pacrodoc.py

import panflute as pf
import sys
import json
import re
import urllib

acronyms = {}

def processAcronym(linkData):
    # Links look like this:
    # [[{u'Str': u'Link Name'}], [u'Link URL', 'Link Title']]
    acronym = linkData[0][0]['Str']
    acronymText = linkData[1][0]

    # First we check if there is an acronym being defined
    if re.search('^acro:', linkData[1][0]):
        # An acronym is being defined, so strip off the acro:
        # prefix and unencode the text
        acronyms[acronym] = {'text': urllib.unquote(acronymText[5:]), 
'used': False}

        # Strip out this link
        return {'Str': ''}

    # Now we check if its referring to an acronym instead
    if not acronymText and acronym in acronyms:
        if not acronyms[acronym]['used']:
            acronyms[acronym]['used'] = True
            return {'Str': '%s (%s)' % (acronyms[acronym]['text'], acronym)}
        else:
            return {'Str': acronym}

    # It was just a normal link, so return it unchanged
    return {'Link': linkData}

def lookForAcronyms(jsonData):
    if isinstance(jsonData, list):
        return [lookForAcronyms(value) for value in jsonData]

    if isinstance(jsonData, dict):
        if 'Link' in jsonData:
            return processAcronym(jsonData['Link'])
        else:
            return {k: lookForAcronyms(v) for k, v in jsonData.items()}

    return jsonData

def main(doc=None):
    return pf.run_filter(lookForAcronyms, doc=doc) 

if __name__ == '__main__':
    main()
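
The TypeError mentioned above is consistent with panflute calling the filter
function with two arguments, the current element and the document, whereas
lookForAcronyms accepts only one. The sketch below illustrates that calling
convention without depending on panflute. `walk` is a hypothetical stand-in
written for this example, not panflute's implementation, and it works on
hand-written pandoc-style JSON nodes rather than panflute objects. It also
assumes one "already used" flag per acronym.

```python
# Sketch only: `walk` is a hypothetical stand-in that mimics the
# action(elem, doc) calling convention; it is not panflute's own code.

def walk(node, action, doc):
    """Apply action(elem, doc) to every dict node, children first."""
    if isinstance(node, list):
        return [walk(item, action, doc) for item in node]
    if isinstance(node, dict):
        node = {key: walk(value, action, doc) for key, value in node.items()}
        replacement = action(node, doc)
        # Convention for this sketch: None means "leave the node unchanged".
        return node if replacement is None else replacement
    return node


def expand_acronym(elem, doc):
    """Two-argument action: expand an acro: link on first use only."""
    if elem.get("t") != "Link":
        return None
    _attr, inlines, (url, title) = elem["c"]
    if url != "acro:" or len(inlines) != 1 or inlines[0].get("t") != "Str":
        return None
    abbrev = inlines[0]["c"]
    if abbrev in doc["seen"]:
        return {"t": "Str", "c": abbrev}
    doc["seen"].add(abbrev)
    return {"t": "Str", "c": "%s (%s)" % (abbrev, title)}


def acro_link(abbrev, expansion):
    """Hand-written stand-in for [abbrev](acro: "expansion")."""
    return {"t": "Link", "c": [["", [], []],
                               [{"t": "Str", "c": abbrev}],
                               ["acro:", expansion]]}


inlines = [acro_link("LRU", "Least Recently Used"),
           {"t": "Space"},
           acro_link("LRU", "Least Recently Used")]
doc = {"seen": set()}  # stand-in for the object passed as `doc`
result = walk(inlines, expand_acronym, doc)
print(result)
```

In a panflute filter the traversal is done by the library, so only a
function with the (elem, doc) signature would be needed, and it would inspect
panflute element objects instead of raw dictionaries.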


* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                         ` <f979c153-7a62-41f9-a782-64532d1cee6b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-12-28  4:22                           ` Sergio Correia
       [not found]                             ` <13631c6c-5a3a-441a-8076-928182fc69b3-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Sergio Correia @ 2016-12-28  4:22 UTC (permalink / raw)
  To: pandoc-discuss



Can you tell me a bit about how you intend it to work?

E.g., where do you define an acronym, and where do you use it?




* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                             ` <13631c6c-5a3a-441a-8076-928182fc69b3-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-12-28  6:06                               ` Luis Fernado Silva Castro de Araújo
  0 siblings, 0 replies; 19+ messages in thread
From: Luis Fernado Silva Castro de Araújo @ 2016-12-28  6:06 UTC (permalink / raw)
  To: pandoc-discuss


Sure, 

It is amazing how pandoc simplifies the production of complex documents like theses, dissertations and final essays. I wanted to go a little further and also automate the production of the list of acronyms in these. 

The main target format I have in mind is LaTeX/PDF, but other formats could certainly benefit as well. 

There are two main LaTeX packages for automating the production of a list of acronyms: glossaries and acronym. The latter is simpler, covers most cases, and has the additional advantage of not needing any preprocessing. It can be run directly over the LaTeX code. 

So here enters pandoc. I wanted to create a filter that locates the pandoc-specific syntax I mentioned in my last email and converts it to the acronym package syntax. 

The end result would be that the pandoc user defines as many acronyms as she wants and uses them so that subsequent links point to a list of acronyms somewhere in the document, which in turn can be added using the proper acronym package code for that. Something like \listofacronyms. 


* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                 ` <92666292-e593-4249-8ec2-ad37ceba79d2-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2016-12-28  2:37                   ` Sergio Correia
@ 2016-12-28 16:01                   ` BP Jonsson
       [not found]                     ` <d7c214b7-ce07-cb9f-2c10-6714bab0c313-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 19+ messages in thread
From: BP Jonsson @ 2016-12-28 16:01 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 2016-12-27 at 22:34, Luis Fernado Silva Castro de Araújo wrote:
> Sorry John,
>
> I tried, but Haskell is too foreign to me. It would be simpler to go with a
> more familiar language.
>
>
>    - The aim is to create an acronym filter (in Haskell) to run with
>    pandoc. The filter would modify the current link syntax and make
>    [LRU](acro: "Least Recently Used") represent a definition for an acronym.
>
>
>    - In the LATEX side, an acronym addition would have to involve adding \usepackage[acronym,smallcaps]{glossaries}
>    to the preamble and \newacronym{LRU}{LRU}{Least Recently Used} for the
>    definition of LRU and finally \gls{LRU} to every time the term is used
>    in the text.

The first part, collecting the acronyms into a dict, is easy. I 
have a filter, written in Perl, which does something very similar 
to let notes refer to each other.
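
As a rough illustration of the collecting step, here is a minimal Python 
sketch that works directly on pandoc's JSON AST with the standard library 
only. It assumes the JSON layout of pandoc 1.16 or later, where a Link is 
`[attr, inlines, [url, title]]`, and it assumes the definition link has 
the bare target `acro:`. The helper names (`walk`, `stringify`, 
`collect_acronyms`) are made up for this example.

```python
def walk(node):
    """Yield every dict found in a pandoc JSON AST, depth first."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def stringify(inlines):
    """Flatten inline elements to plain text (handles Str and Space only)."""
    parts = []
    for el in inlines:
        if el.get("t") == "Str":
            parts.append(el["c"])
        elif el.get("t") == "Space":
            parts.append(" ")
    return "".join(parts)


def collect_acronyms(doc):
    """Return {acronym: expansion} for every link whose target is 'acro:'."""
    acronyms = {}
    for el in walk(doc):
        if el.get("t") != "Link":
            continue
        # Assumed layout: [attr, link text inlines, [url, title]]
        _attr, inlines, (url, title) = el["c"]
        if url == "acro:" and title:
            acronyms[stringify(inlines)] = title
    return acronyms


# Hand-written AST fragment for: [LRU](acro: "Least Recently Used") cache
doc = {"blocks": [{"t": "Para", "c": [
    {"t": "Link", "c": [["", [], []],
                        [{"t": "Str", "c": "LRU"}],
                        ["acro:", "Least Recently Used"]]},
    {"t": "Space"},
    {"t": "Str", "c": "cache"},
]}]}
print(collect_acronyms(doc))
```

A real filter would read the AST from stdin with `json.load` and write the 
modified document back to stdout.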

The insertion of `\usepackage` and `\newacronym` blocks into the 
preamble is also reasonably straightforward.
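
To make that concrete, here is a small Python sketch that turns a dict of 
collected acronyms into the preamble lines quoted above. The function name 
`latex_preamble` is hypothetical, and the sketch assumes the expansions 
contain no characters that need escaping in LaTeX.

```python
def latex_preamble(acronyms):
    """Build the glossaries preamble for a {acronym: expansion} dict."""
    lines = [r"\usepackage[acronym,smallcaps]{glossaries}"]
    for key in sorted(acronyms):
        # \newacronym{label}{short form}{long form}
        lines.append(r"\newacronym{%s}{%s}{%s}" % (key, key, acronyms[key]))
    return "\n".join(lines)


print(latex_preamble({"LRU": "Least Recently Used"}))
```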

Finally, if you are only targeting LaTeX you can simply insert 
`\gls{LRU}` directly into your text to use the acronym. If you 
want to be able to use it with e.g. HTML you would need a second 
pass, which either looks up those raw LaTeX `\gls{LRU}` elements 
and replaces them with appropriate HTML, or replaces e.g. inline 
code elements like `` `LRU`{.gls} `` with the appropriate 
LaTeX/HTML depending on the target format. You would also need 
code which inserts appropriate HTML for a list of acronyms, unless 
you want popups, which are also reasonably easy to achieve just by 
inserting the appropriate HTML/CSS code.
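
The format-dependent replacement could look roughly like the Python sketch 
below. pandoc passes the target format to a JSON filter as its first 
command-line argument, so `fmt` would normally come from `sys.argv[1]`. 
The function name `render_use` and the choice of an `<abbr>` element for 
HTML are assumptions made for the example, not something settled in this 
thread.

```python
import html


def render_use(key, acronyms, fmt):
    """Return a pandoc JSON inline element for one use of an acronym."""
    if fmt == "latex":
        return {"t": "RawInline", "c": ["latex", r"\gls{%s}" % key]}
    if fmt in ("html", "html5"):
        # Hypothetical HTML rendering: the expansion appears as a tooltip.
        title = html.escape(acronyms.get(key, ""), quote=True)
        text = '<abbr title="%s">%s</abbr>' % (title, html.escape(key))
        return {"t": "RawInline", "c": ["html", text]}
    # Any other format: fall back to the plain acronym.
    return {"t": "Str", "c": key}


acronyms = {"LRU": "Least Recently Used"}
print(render_use("LRU", acronyms, "latex"))
print(render_use("LRU", acronyms, "html"))
```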

BTW would you like the link with the definition to be replaced 
with a `\gls{LRU}` as well?

I could certainly do this in Perl, as it is something I would have 
use for myself.  I would stumble a bit more if I did it in Python.

/bpj




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                     ` <d7c214b7-ce07-cb9f-2c10-6714bab0c313-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2016-12-28 16:40                       ` Sergio Correia
       [not found]                         ` <45ddd0a9-2d80-496d-8706-8afa29803d26-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Sergio Correia @ 2016-12-28 16:40 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 3490 bytes --]

Agree with BPJ.

I wrote a sketch of the python code here:

https://github.com/sergiocorreia/panflute-filters/commit/fd669dd43fb075a708d283145bfb9ea83da6d9f9

There are two files, the py file and the md test file.

The filter is fairly simple, and was tested with python3:
https://github.com/sergiocorreia/panflute-filters/blob/master/filters/acronyms.py

One thing I struggled with is adding the preamble ("\newacronym", 
etc.). I didn't find a way to add to the preamble from a filter, so what I 
did is two runs of pandoc (and the filter): one to create an external 
preamble file, and another to use it.

Something like:

    pandoc acronyms.md -F acronyms.py --to=latex
    pandoc acronyms.md -F acronyms.py --to=latex -s -H acronyms_header.tex



Best,
S




[-- Attachment #1.2: Type: text/html, Size: 4427 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                         ` <45ddd0a9-2d80-496d-8706-8afa29803d26-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-12-29  0:08                           ` BP Jonsson
       [not found]                             ` <4fc0c0e2-3a27-2fed-3df7-e8ecb7a48a85-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: BP Jonsson @ 2016-12-29  0:08 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 2016-12-28 at 17:40, Sergio Correia wrote:
> One thing where I struggled is with adding the preamble ("\newacronym",
> etc.). I didn't found a way to add to the preamble from a filter, so what I
> did is two runs of pandoc (and the filter). One to create an external
> preamble file, and another to use it:

There are two ways, both involving putting a MetaBlocks containing 
a RawBlock of the appropriate format in the metadata and inserting 
that from the pandoc template:

1)  Add the MetaBlocks at the end of metadata->header-includes, 
making sure that the latter is a MetaList, creating it if needed.

     The drawback of this approach is that you clobber any actual 
data inserted with the -H option.

2)  Add something like this to the preamble of the pandoc template:

         $if(acronyms-code)$
         $acronyms-code$
         $endif$

     then in the filter set the value of metadata->acronyms-code 
to the MetaBlocks.
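
Both options can be sketched in a few lines of Python working on the 
metadata part of pandoc's JSON AST (the `meta` object of the document in 
recent pandoc versions). The function names are invented for this example, 
and the sketch assumes the metadata values use the usual 
`{"t": ..., "c": ...}` JSON encoding.

```python
def raw_latex_blocks(tex):
    """Wrap LaTeX source in a MetaBlocks holding a single RawBlock."""
    return {"t": "MetaBlocks", "c": [{"t": "RawBlock", "c": ["latex", tex]}]}


def add_header_include(meta, tex):
    """Option 1: append to header-includes, coercing it to a MetaList."""
    current = meta.get("header-includes")
    if current is None:
        items = []                      # no header-includes yet
    elif current.get("t") == "MetaList":
        items = current["c"]            # already a list, extend it
    else:
        items = [current]               # single value, wrap it in a list
    items.append(raw_latex_blocks(tex))
    meta["header-includes"] = {"t": "MetaList", "c": items}


def set_acronyms_code(meta, tex):
    """Option 2: store the code under the key the custom template reads."""
    meta["acronyms-code"] = raw_latex_blocks(tex)


# Example: a document that already has one header-includes entry.
meta = {"header-includes": {"t": "MetaInlines", "c": [
    {"t": "RawInline", "c": ["tex", r"\usepackage{xcolor}"]}]}}
add_header_include(meta, r"\newacronym{LRU}{LRU}{Least Recently Used}")
set_acronyms_code(meta, r"\newacronym{LRU}{LRU}{Least Recently Used}")
```

In practice a filter would use only one of the two functions, depending on 
whether a custom template is acceptable.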

FWIW I wrote some Perl code which, if the output format isn't 
latex, puts a DefinitionList containing the acronyms and their 
definitions at the end of a Div with id "acronyms".
The terms/acronyms are wrapped in spans with an appropriate id, 
and acronyms in the text are wrapped in internal links to those.
You define acronyms in the text by inserting a link like 
`[LRU](acro "Least Recently Used")` into the text;
elsewhere you insert a similar link without a title `[LRU](acro)`.
Both kinds are replaced by a link like `[LRU](#acronym-LRU "Least 
Recently Used")` pointing to the definition in the list.
If you put an attribute `data-acro-def="URL"` on the definition 
link the definition in the definition list will be linked to that URL,
e.g. `[PDF](acro "Portable Document 
Format"){data-acro-def="https://en.wikipedia.org/wiki/PDF"}`
and in the definition list you will get the equivalent of

     PDF

     :   [Portable Document Format](https://en.wikipedia.org/wiki/PDF)

I also inserted some code for optional locale-aware sorting of the 
entries in the list using Unicode::Collate -- mostly a function 
which I had lying around.
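
For readers who prefer Python, the list-building part might be sketched as 
follows. This is not the Perl code described above, only an approximation 
of the same idea on pandoc's JSON AST: a Div with id "acronyms" holding a 
DefinitionList whose terms are Spans with ids like `acronym-LRU`. The 
function names are hypothetical, the JSON shapes are those of recent pandoc 
versions, and `str.casefold` is a simple case-insensitive stand-in for real 
locale-aware collation.

```python
def text_inlines(text):
    """Split plain text into Str elements separated by Space elements."""
    inlines = []
    for i, word in enumerate(text.split()):
        if i:
            inlines.append({"t": "Space"})
        inlines.append({"t": "Str", "c": word})
    return inlines


def acronym_list(acronyms):
    """Build a Div#acronyms containing a DefinitionList of all acronyms."""
    items = []
    for key in sorted(acronyms, key=str.casefold):
        # Term: the acronym wrapped in a Span that internal links can target.
        term = [{"t": "Span",
                 "c": [["acronym-" + key, [], []], text_inlines(key)]}]
        # One definition per term, holding a single Plain block.
        definition = [{"t": "Plain", "c": text_inlines(acronyms[key])}]
        items.append([term, [definition]])
    return {"t": "Div",
            "c": [["acronyms", [], []],
                  [{"t": "DefinitionList", "c": items}]]}


div = acronym_list({"pdf": "Portable Document Format",
                    "LRU": "Least Recently Used"})
```

The resulting Div would be appended to the document's block list when the 
output format is not latex.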

And best of all it needs only one pass! :-)

I'll post it as soon as I have tested it and whipped together some 
documentation.

/bpj

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                             ` <4fc0c0e2-3a27-2fed-3df7-e8ecb7a48a85-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2016-12-29  4:15                               ` Sergio Correia
       [not found]                                 ` <1fdbe7b4-4b43-4989-af94-d50664ce6a7e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Sergio Correia @ 2016-12-29  4:15 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 3208 bytes --]

Interesting, thanks for the solution!

I thought that Pandoc extracted the contents of the metadata early on, but 
it makes sense to use it only when required (i.e. when filling out the 
template).


Cheers,
S



[-- Attachment #1.2: Type: text/html, Size: 5019 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                                 ` <1fdbe7b4-4b43-4989-af94-d50664ce6a7e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-12-29 22:44                                   ` Luis Fernado Silva Castro de Araújo
       [not found]                                     ` <6da29099-4a66-4489-8f6f-be8b2212b329-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Luis Fernado Silva Castro de Araújo @ 2016-12-29 22:44 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 3629 bytes --]

Guys,

Thank you so much for the help!

That is precisely what I wanted. I will adapt the code to the 
[acro](http://mirror.aarnet.edu.au/pub/CTAN/macros/latex/contrib/acro/acro_en.pdf) 
package in order to avoid running pandoc twice and will report back.


Happy new year!





[-- Attachment #1.2: Type: text/html, Size: 5619 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                                     ` <6da29099-4a66-4489-8f6f-be8b2212b329-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-12-29 23:50                                       ` Luis Fernado Silva Castro de Araújo
       [not found]                                         ` <f5347462-26d3-4b8a-bc06-fdedfbd7c4f3-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Luis Fernado Silva Castro de Araújo @ 2016-12-29 23:50 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 1529 bytes --]

We are almost there,


I made the changes so the filter uses the simpler package acro and it is 
working, as I just tested it. However, I did not manage to make it work in 
a single run. I tried to create a tex file and load it in the same run with 
--include-before-body but it does not work. Is there any way of appending 
the newly created acronym definitions to the header, instead of having 
to run `pandoc acronym.md -F acronyms.py -t latex` before actually running 
the command to generate the pdf?

Here are my changes: 

[acronym.md](http://pastebin.com/K1cBqRYt)
[acronyms.py](http://pastebin.com/GByfSdUv)

It must be run in two steps:

`pandoc acronym.md -F acronyms.py -t latex`

and

`pandoc acronym.md -F acronyms.py -o acronym.pdf -H acronyms_header.tex`


To my understanding, the -H acronyms_header.tex option will overwrite any 
commands from the yaml header within the document; I would like to avoid 
this.


Thank you again,

luis


[-- Attachment #1.2: Type: text/html, Size: 2254 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                                         ` <f5347462-26d3-4b8a-bc06-fdedfbd7c4f3-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2016-12-30  5:53                                           ` Sergio Correia
       [not found]                                             ` <659aafe6-54b9-496d-8c28-bd4384479960-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Sergio Correia @ 2016-12-30  5:53 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 2211 bytes --]

This version should work in one go:

https://github.com/sergiocorreia/panflute-filters/tree/master/filters/acronyms.py

It also required a minor update to panflute, so please update panflute 
with pip before testing.

Going deeper: I must admit I learned a bit about dealing with Meta objects, 
and feel things are perhaps a bit too complicated. E.g. the MetaBool and 
MetaList elements could be used only behind the scenes, so that the user 
just faces nice built-in bool and list types.



[-- Attachment #1.2: Type: text/html, Size: 5532 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                                             ` <659aafe6-54b9-496d-8c28-bd4384479960-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2017-01-01 21:09                                               ` BP Jonsson
       [not found]                                                 ` <CAFC_yuSAZ3Tom2hXdq0_KviEjUVSMvMRrvWPQAZXOA83EHDAFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: BP Jonsson @ 2017-01-01 21:09 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 3395 bytes --]

Aren't you clobbering the `tex` variable from line 34 on line 35?
Also why multiple MetaInlines elements? I think inserting a newline at the
end of each RawInline in a single MetaInlines or a MetaBlocks of RawBlock
elements would be more effective.

/bpj




[-- Attachment #2: Type: text/html, Size: 5375 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                                                 ` <CAFC_yuSAZ3Tom2hXdq0_KviEjUVSMvMRrvWPQAZXOA83EHDAFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-01-02  2:08                                                   ` Sergio Correia
       [not found]                                                     ` <59f7d3c3-aadb-403b-aac8-6c158ff50777-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Sergio Correia @ 2017-01-02  2:08 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 3926 bytes --]

Welp, yes, lines 34 and 35 do the same thing (but I prefer the template 
approach as it's more robust). Thus, line 34 should be deleted (although it 
has no effect at all).

About the MetaInlines, I chose that approach because that's how I usually 
see people write header-includes (as a list of lines). But it's only a 
stylistic difference AFAIK.





[-- Attachment #1.2: Type: text/html, Size: 9482 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scripting with Haskell to achieve a filter for acronyms in pandoc
       [not found]                                                     ` <59f7d3c3-aadb-403b-aac8-6c158ff50777-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2017-01-03  3:33                                                       ` Luis Fernado Silva Castro de Araújo
  0 siblings, 0 replies; 19+ messages in thread
From: Luis Fernado Silva Castro de Araújo @ 2017-01-03  3:33 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 836 bytes --]

Hi all,

just to mention that the code works, so the problem I posed has been 
solved with this code. Thank you very much for the help.

I made a version of it that uses the package acro instead of the glossaries 
package; if anyone is interested, let me know.

lf


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-01-03  3:33 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-09  0:43 Scripting with Haskell to achieve a filter for acronyms in pandoc Luis Fernado Silva Castro de Araújo
     [not found] ` <6c9b4eec-0ea9-46bf-8da3-51998b63c902-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-09-19 18:25   ` John MacFarlane
     [not found]     ` <20160919182539.GA9066-l/d5Ua9yGnxXsXJlQylH7w@public.gmane.org>
2016-09-22  5:11       ` Luis Fernado Silva Castro de Araújo
     [not found]         ` <be83b525-f6d9-4536-8271-53d80b3ffd5b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-09-23  8:35           ` John MacFarlane
     [not found]             ` <20160923083514.GI86115-BKjuZOBx5Kn2N3qrpRCZGbhGAdq7xJNKhPhL2mjWHbk@public.gmane.org>
2016-12-27 21:34               ` Luis Fernado Silva Castro de Araújo
     [not found]                 ` <92666292-e593-4249-8ec2-ad37ceba79d2-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-12-28  2:37                   ` Sergio Correia
     [not found]                     ` <719e577c-e900-4951-afb2-0423e71ff601-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-12-28  3:14                       ` Luis Fernado Silva Castro de Araújo
     [not found]                         ` <f979c153-7a62-41f9-a782-64532d1cee6b-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-12-28  4:22                           ` Sergio Correia
     [not found]                             ` <13631c6c-5a3a-441a-8076-928182fc69b3-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-12-28  6:06                               ` Luis Fernado Silva Castro de Araújo
2016-12-28 16:01                   ` BP Jonsson
     [not found]                     ` <d7c214b7-ce07-cb9f-2c10-6714bab0c313-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-12-28 16:40                       ` Sergio Correia
     [not found]                         ` <45ddd0a9-2d80-496d-8706-8afa29803d26-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-12-29  0:08                           ` BP Jonsson
     [not found]                             ` <4fc0c0e2-3a27-2fed-3df7-e8ecb7a48a85-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2016-12-29  4:15                               ` Sergio Correia
     [not found]                                 ` <1fdbe7b4-4b43-4989-af94-d50664ce6a7e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-12-29 22:44                                   ` Luis Fernado Silva Castro de Araújo
     [not found]                                     ` <6da29099-4a66-4489-8f6f-be8b2212b329-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-12-29 23:50                                       ` Luis Fernado Silva Castro de Araújo
     [not found]                                         ` <f5347462-26d3-4b8a-bc06-fdedfbd7c4f3-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2016-12-30  5:53                                           ` Sergio Correia
     [not found]                                             ` <659aafe6-54b9-496d-8c28-bd4384479960-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2017-01-01 21:09                                               ` BP Jonsson
     [not found]                                                 ` <CAFC_yuSAZ3Tom2hXdq0_KviEjUVSMvMRrvWPQAZXOA83EHDAFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-01-02  2:08                                                   ` Sergio Correia
     [not found]                                                     ` <59f7d3c3-aadb-403b-aac8-6c158ff50777-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2017-01-03  3:33                                                       ` Luis Fernado Silva Castro de Araújo
