public inbox archive for pandoc-discuss@googlegroups.com
* Example of python filter involving pandocfilters.Link
@ 2014-04-15 13:47 camille.huguenot-Re5JQEeQqe8AvxtiuMwx3w
       [not found] ` <2c193bee-4568-4388-8e0f-5bf485b0affb-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: camille.huguenot-Re5JQEeQqe8AvxtiuMwx3w @ 2014-04-15 13:47 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 993 bytes --]

Dear developers and users of pandoc,

  I could not find an example of a Python filter manipulating the 'Link'
constructor. Could someone point me to such an example? I'm not a strong
Python developer, and debugging filters isn't very convenient :-/ I'm trying
to write a filter that automatically adds the .html extension to link
targets when it is missing (so that our markdown sources do not depend on
the output format).

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/2c193bee-4568-4388-8e0f-5bf485b0affb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Example of python filter involving pandocfilters.Link
       [not found] ` <2c193bee-4568-4388-8e0f-5bf485b0affb-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2014-04-15 16:55   ` John MacFarlane
  2014-04-17  9:36   ` camille.huguenot-Re5JQEeQqe8AvxtiuMwx3w
  1 sibling, 0 replies; 6+ messages in thread
From: John MacFarlane @ 2014-04-15 16:55 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

value[0] will be the link label (a list of inline elements).
value[1] is the "target".  It is itself an array containing URL and
title.  So, value[1][0] will be the URL.  value[1][1] will be the title.

You'll need to modify value[1][0], then return

Link(value[0], value[1])

Something like this:

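from pandocfilters import toJSONFilter, Link

def add_html(key, value, format, meta):
    # Only Link elements are changed; returning nothing leaves
    # every other element untouched.
    if key == 'Link':
        # value[1][0] is the URL. Note that ".html" is appended
        # unconditionally here, even if it is already present.
        value[1][0] = value[1][0] + ".html"
        return Link(value[0], value[1])

if __name__ == "__main__":
    toJSONFilter(add_html)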

Hope that helps.  Apologies that the python library is not better
designed or better documented.  The translation from Haskell types
to python is automatic, and once you get the hang of it you can use
the definitions in Text.Pandoc.Definition as documentation.
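
To make the indexing above concrete, here is a small sketch that needs
neither pandoc nor pandocfilters. The sample value is hand-written to match
the layout described above (label first, then [URL, title]), and the helper
name with_html_extension is made up for this illustration. Unlike the
snippet above, it only appends ".html" when it is missing.

```python
# Hand-written sample of what a filter receives as `value` for a Link,
# following the layout described above: [label, [url, title]].
sample_value = [
    [{"t": "Str", "c": "Manual"}],   # value[0]: label, a list of inlines
    ["manual", "The manual"],        # value[1]: target, [URL, title]
]


def with_html_extension(value):
    """Return a copy of `value` whose URL ends in ".html" (hypothetical helper)."""
    label, (url, title) = value
    if not url.endswith(".html"):
        url = url + ".html"
    return [label, [url, title]]


new_value = with_html_extension(sample_value)
print(new_value[1][0])  # the URL
print(new_value[1][1])  # the title
```

Building a new list instead of assigning to value[1][0] keeps the input
untouched, which makes the helper easy to try out in an interpreter.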

+++ camille.huguenot-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org [Apr 15 14 06:47 ]:
>   Dear developers and users of pandoc,
>     I could not find an example of a python filter manipulating the
>   'Link' constructor. Could someone point me to such an example ? I'm a
>   poor python developer and debugging filters isn't very handy :-/ I'm
>   trying to write a filter that automatically adds the .html extension to
>   targets of links if missing (so as to avoid depending on the output
>   format in our markdown sources).


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Example of python filter involving pandocfilters.Link
       [not found] ` <2c193bee-4568-4388-8e0f-5bf485b0affb-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2014-04-15 16:55   ` John MacFarlane
@ 2014-04-17  9:36   ` camille.huguenot-Re5JQEeQqe8AvxtiuMwx3w
       [not found]     ` <1663e110-2801-42a1-8128-8d531339df67-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  1 sibling, 1 reply; 6+ messages in thread
From: camille.huguenot-Re5JQEeQqe8AvxtiuMwx3w @ 2014-04-17  9:36 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 2442 bytes --]

Thanks John. For those interested (I guess this script is a common
request), here's a script that handles anchors and remote links:

#!/usr/bin/env python

"""
Pandoc filter to add ".html" to links when outputting html (if missing)
"""

import pandocfilters  # provided by https://pypi.python.org/pypi/pandocfilters

def link(key, value, format, meta):
    # Add .html to links that do not have it
    if key == 'Link' and (format == "html" or format == "html5"):
        # From John MacFarlane (pandoc 1.12):
        # value[0] is the link label (a list of inline elements).
        # value[1] is the "target".  It is itself an array containing URL
        # and title. So, value[1][0] is the URL. value[1][1] is the title.
        url = value[1][0]
        if url.startswith("http") or url.startswith("www"):
            # A remote link, not touching it
            return

        # A local link (assuming the markdown files follow the convention
        # of prefixing remote links as tested above)
        idx = url.rfind('#')
        if 0 == idx:
            # The anchor is in the first position: it refers to the
            # current file, so adding ".html" before it would be useless
            # and would produce wrong html
            pass
        elif 0 < idx:
            # There's an anchor! If ".html" is missing before '#' we add it.
            head = url[0:idx]
            tail = url[idx+1:]
            if not head.endswith(".html"):
                # The extension is missing, let's add it
                value[1][0] = head + ".html#" + tail
                return pandocfilters.Link(value[0], value[1])
        elif not url.endswith(".html"):
            # The extension is missing, let's add it
            value[1][0] = url + ".html"
            return pandocfilters.Link(value[0], value[1])

if __name__ == "__main__":
    pandocfilters.toJSONFilter(link)
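
As a possible variation on the script above, the URL handling can be
separated from the filter and delegated to the standard library's
urllib.parse (Python 3). This is only a sketch: the helper name
add_html_extension is hypothetical, the action assumes the two-element Link
layout of pandoc 1.12, and it returns a plain dict which, as far as I can
tell, has the same JSON shape as the result of pandocfilters.Link(label,
target).

```python
from urllib.parse import urlsplit, urlunsplit


def add_html_extension(url):
    """Append ".html" to the path of a local link if it is missing.

    Hypothetical helper for illustration; not part of pandocfilters.
    """
    parts = urlsplit(url)
    # A scheme (http:, mailto:, ...) or a host (//host/...) means a
    # remote link: leave it alone.
    if parts.scheme or parts.netloc:
        return url
    # Same heuristic as the script above for scheme-less "www..." links.
    if url.startswith("www"):
        return url
    # An empty path means a pure "#anchor" link to the current file.
    if not parts.path or parts.path.endswith(".html"):
        return url
    return urlunsplit(parts._replace(path=parts.path + ".html"))


def link(key, value, format, meta):
    """Filter action for the two-element Link layout (pandoc 1.12)."""
    if key == "Link" and format in ("html", "html5"):
        label, (url, title) = value
        new_url = add_html_extension(url)
        if new_url != url:
            return {"t": "Link", "c": [label, [new_url, title]]}
    # Falling through returns None, which leaves the element unchanged.


for u in ["page", "page#sec", "#sec", "http://example.com/x", "httpd-notes"]:
    print(u, "->", add_html_extension(u))
```

One difference in behaviour: a local file whose name merely starts with
"http" (such as "httpd-notes") is treated as local here, because the check
looks for an actual URL scheme rather than a string prefix. The action can
be passed to pandocfilters.toJSONFilter in the same way as in the script
above.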


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Example of python filter involving pandocfilters.Link
       [not found]     ` <1663e110-2801-42a1-8128-8d531339df67-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2017-01-04  9:29       ` Ben White
       [not found]         ` <f592c253-d183-4e7f-8424-d2c4ea861b35-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Ben White @ 2017-01-04  9:29 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 794 bytes --]

I am using pandoc 1.16.0.2 with the python pandoc filters. I noticed that 
the Link constructor now takes 3 parameters, so the above is not valid for 
this version. Does anybody know what the parameters should be now? Thx


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Example of python filter involving pandocfilters.Link
       [not found]         ` <f592c253-d183-4e7f-8424-d2c4ea861b35-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2017-01-04 11:18           ` Sergio Correia
       [not found]             ` <dc036b87-28ca-42e8-8282-3952d7cd289f-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Sergio Correia @ 2017-01-04 11:18 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 1425 bytes --]

You can see the definitions here:
https://github.com/jgm/pandoc-types/blob/master/Text/Pandoc/Definition.hs#L264

So you would do Link(attributes, label, target), with attributes being the
new argument.

You can also see this with a quick call to pandoc:

    > echo [Link](url "title"){.class x=123 #id} | pandoc --to=native
    [Para [Link ("id",["class"],[("x","123")]) [Str "Link"] ("url","title")]]

I'm not 100% sure, but if you don't want anything in attributes, you could
just add

    attributes = ["",[],[]]

And then create Link(attributes, ...)
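
For the original ".html" use case, a filter action for the three-element
layout might look like the sketch below. It assumes the JSON value mirrors
the native output shown above, that is [attributes, label, [url, title]]
with attributes as [id, classes, key-value pairs]. The function name and the
sample data are made up for illustration, and the action returns a plain
dict rather than calling pandocfilters.Link so that it can be tried without
anything installed.

```python
def add_html_to_link(key, value, format, meta):
    """Filter action for the three-element Link layout (pandoc >= 1.16)."""
    if key != "Link":
        return None  # leave every other element unchanged
    attr, label, (url, title) = value
    if url.endswith(".html"):
        return None  # nothing to do
    # Keep the attributes and label as they are, replace only the URL.
    return {"t": "Link", "c": [attr, label, [url + ".html", title]]}


# Hand-written sample values: one with empty attributes, one with the
# attributes from the pandoc call above.
empty_attr = ["", [], []]
full_attr = ["id", ["class"], [["x", "123"]]]
label = [{"t": "Str", "c": "Link"}]

result = add_html_to_link("Link", [empty_attr, label, ["url", "title"]], "html", {})
print(result["c"][0])  # attributes, unchanged
print(result["c"][2])  # target with the new URL
```

Passing the existing attributes through, rather than always substituting the
empty ones, keeps any id or classes the author put on the link.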

On Wednesday, January 4, 2017 at 4:29:18 AM UTC-5, Ben White wrote:
>
> I am using pandoc 1.16.0.2 with the python pandoc filters. I noticed that 
> the Link constructor now takes 3 parameters, so the above is not valid for 
> this version. Does anybody know what the parameters should be now? Thx
>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Example of python filter involving pandocfilters.Link
       [not found]             ` <dc036b87-28ca-42e8-8282-3952d7cd289f-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2017-01-04 11:39               ` Ben White
  0 siblings, 0 replies; 6+ messages in thread
From: Ben White @ 2017-01-04 11:39 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 735 bytes --]

That's really useful, thanks a lot.

Spent some time digging about in the haddock docs to find the type defs, 
but the source in the link is easier to figure out.


^ permalink raw reply	[flat|nested] 6+ messages in thread
