* How to access Span elements with lua filter based on their content
From: krulis....-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org @ 2020-11-05 17:01 UTC (permalink / raw)
To: pandoc-discuss
I am using pandoc to convert an `org-agenda` list of todos to `docx` and `pdf`
for my coworkers. A file exported from Emacs `org-agenda` can look like this
(simplified):
`tasks.org`
```
* TODO Feed the cat
```
Pandoc's native output for this file is:
```
[Header 1 ("feed-the-cat",[],[]) [Span ("",["todo","TODO"],[]) [Str
"TODO"],Space,Str "Feed",Space,Str "the",Space,Str "cat"]]
```
Now if I convert this to any output format, I get a spurious "TODO" string
in the output (it comes from the `org-mode` keyword). How can I get rid of this
"TODO" string (preferably also with the surrounding spaces)?
My first attempt was to use a Lua filter. I can simply do:
`deleteSpans.lua`
```
function Span(el)
  return pandoc.Str('')
end
```
but this removes all `Span`s, which could be bad (although it does not matter in
my current specific case).
I have tried to do better like this:
`removeTODO.lua`
```
function Span(el)
  if el.text == 'TODO' then
    return pandoc.Str('')
  else
    return nil
  end
end
```
But this doesn't have any effect. When I look at the `lua-filter` docs and at the
`Span` constructor and try to reverse-engineer them, I get very confused.
There are `Attr`, `attributes`, and such, and none of them worked.
So, how can I access, or match, `pandoc Span` elements based on their
content? Where can I read more about this?
If it is possible to achieve this from the `emacs` or `org-agenda` side, I would be
very interested in that option too.
So far, I have been a little hesitant to go through Hackage and `pandoc.types`; but
if that is the place to go (in the future), then I will give it my best shot.
Thank you very much for any help with this.
Regards, Tomas
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/fa68cec8-4ff1-4bbe-95fa-65d36c28bda7n%40googlegroups.com.
* Re: How to access Span elements with lua filter based on their content
From: Albert Krewinkel @ 2020-11-05 21:30 UTC (permalink / raw)
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw
krulis....-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org writes:
> I am using pandoc to convert an `org-agenda` list of todos to `docx` and `pdf`
> for my coworkers. A file exported from Emacs `org-agenda` can look like this
> (simplified):
>
> `tasks.org`
> ```
> * TODO Feed the cat
> ```
>
> Pandoc's native output for this file is:
>
> ```
> [Header 1 ("feed-the-cat",[],[]) [Span ("",["todo","TODO"],[]) [Str
> "TODO"],Space,Str "Feed",Space,Str "the",Space,Str "cat"]]
> ```
>
> Now if I convert this to any output format, I get a spurious "TODO" string
> in the output (it comes from the `org-mode` keyword). How can I get rid of this
> "TODO" string (preferably also with the surrounding spaces)?
Two options:
1. The org reader recognizes most org export options. So adding the
   following line to your input file should be enough:

       #+OPTIONS: todo:nil

   See: https://orgmode.org/manual/Export-Settings.html

2. With a Lua filter you'll want

       function Span (span)
         if span.classes:includes 'todo' then
           return {}  -- delete this element
         end
       end
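If the Space that follows the keyword should disappear as well, one possible sketch (not part of the original reply) is to filter the enclosing Header and rebuild its inline list. The helper name `is_todo_span` is made up for illustration; `pandoc.List`, `header.content`, `inline.t`, and `classes:includes` are regular parts of pandoc's Lua API.

```lua
-- Hypothetical helper: true if the inline is a Span carrying the class 'todo'.
local function is_todo_span(inline)
  return inline.t == 'Span' and inline.classes:includes('todo')
end

-- Drop todo Spans from headers, together with one Space directly after them.
function Header(header)
  local kept = pandoc.List()
  local skip_space = false
  for _, inline in ipairs(header.content) do
    if is_todo_span(inline) then
      skip_space = true            -- remember to swallow the next Space
    elseif skip_space and inline.t == 'Space' then
      skip_space = false           -- swallow exactly one Space
    else
      skip_space = false
      kept:insert(inline)
    end
  end
  header.content = kept
  return header
end
```

Saved under an assumed name such as `remove-todo.lua`, this would be run as `pandoc --lua-filter=remove-todo.lua tasks.org -o tasks.docx`. It only touches headers, so todo Spans elsewhere in the document are left alone.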
> So, how can I access, or match, `pandoc Span` elements based on their
> content? Where can I read more about this?
_Just_ on their content is difficult for various reasons, but you can
compare AST elements using the normal `==` Lua operator. The comparison
of elements happens in Haskell, where elements don't have identity.
So `pandoc.Span {pandoc.Str 'hi'} == pandoc.Span {pandoc.Str 'hi'}`
would be true, but `{pandoc.Str 'hi'} == {pandoc.Str 'hi'}` would be
false, as lists are not treated as AST elements. We might change that
at some point.
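A content-based match can therefore be sketched by comparing individual inlines rather than the whole list. This is only an illustration, assuming the element-level `==` behaves as described above; it deletes a Span whose sole child is `Str "TODO"`.

```lua
-- Delete a Span only when its content is exactly one Str "TODO".
function Span(span)
  local only_child = span.content[1]
  -- Compare element to element; comparing the list itself would not work.
  if #span.content == 1 and only_child == pandoc.Str('TODO') then
    return {}  -- delete this Span
  end
  return nil   -- leave every other Span unchanged
end
```

Checking the class is still the more robust choice, since the content check breaks as soon as the keyword is something other than TODO.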
HTH,
--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124
* Re: How to access Span elements with lua filter based on their content
From: krulis....-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org @ 2020-11-06 15:50 UTC (permalink / raw)
To: pandoc-discuss
Hello Mr. Krewinkel,
thank you for your help. The filter works great! The export option has made
no difference for me, but I might be using it wrong (as usual; I am still
learning to work with Emacs, so I am probably doing something the way I
shouldn't :D).
The second part is difficult for me. Could you elaborate a little more
on how you identified those `'todo'` and `'TODO'` values in the
TODO-Span as classes, and not attributes? This might be a silly question, and I
guess this is somehow inspired by HTML, but it would be really helpful for
me to know how this element is represented in the pandoc AST.
And the element `[Str "TODO"]` is a one-element list in the pandoc AST,
and therefore it cannot be checked as-is? Did I get the last part correctly?
Regards, Tomas
* Re: How to access Span elements with lua filter based on their content
From: Albert Krewinkel @ 2020-11-06 19:31 UTC (permalink / raw)
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw
Hey Tomas,
krulis....-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org writes:
> thank you for your help. The filter works great! The export option has made
> no difference for me, but I might be using it wrong (as usual; I am still
> learning to work with Emacs, so I am probably doing something the way I
> shouldn't :D).
Feel free to send me an example file and I can take a quick look.
> The second part is difficult for me. Could you elaborate a little more
> on how you identified those `'todo'` and `'TODO'` values in the
> TODO-Span as classes, and not attributes?
That's a good question. A Span consists of two parts, the Attr and the
inline contents. Attr values are triples consisting of the element's id,
classes, and key-value attributes, in that order.
https://pandoc.org/lua-filters.html#type-attr
If we look at
>> > [Header 1 ("feed-the-cat",[],[]) [Span ("",["todo","TODO"],[]) [Str
>> > "TODO"],Space,Str "Feed",Space,Str "the",Space,Str "cat"]]
we see that `["todo", "TODO"]` is the second element of the triple, so
these are classes. We can now check the Lua filter docs for the Span
type to see how we can access the info:
https://pandoc.org/lua-filters.html#type-span
We find that Span elements have a `classes` field; the rest should be
discoverable by clicking and scrolling through the docs.
The most difficult part is to know that the triples in the native output
are Attr values. I'm actually not sure if this is documented anywhere
but the Haskell source. Any confusion about this is very understandable.
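To make the mapping from the native triple to Lua fields concrete, here is a small inspection filter as a sketch. The helper `describe_attr` is a made-up name; `identifier`, `classes`, and `attributes` are the field names given in the Lua filter docs linked above.

```lua
-- Hypothetical helper: summarize the id and classes of an element's Attr.
local function describe_attr(el)
  return string.format(
    'id=%q classes=[%s]',
    el.identifier,                    -- first part of the triple
    table.concat(el.classes, ', ')    -- second part of the triple
  )
end

-- Print the Attr of every Span to stderr and leave the document unchanged.
function Span(span)
  io.stderr:write(describe_attr(span), '\n')
  -- The third part, the key-value pairs, is read by key:
  -- span.attributes['some-key']
  return nil
end
```

Running this over `tasks.org` is a quick way to confirm which of the three slots a value from the native output ended up in.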
> And the element `[Str "TODO"]` is a one-element list in the pandoc AST,
> and therefore it cannot be checked as-is? Did I get the last part correctly?
Yes. Haskell lists are translated into plain Lua tables; the latter
follow the usual Lua comparison rules. Pandoc AST elements are special,
in that they have `__eq` metamethods which use Haskell's comparison
mechanism under the hood. Hence the difference in behavior.
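The same difference can be reproduced in plain Lua, without pandoc. The sketch below uses a stand-in type (`Elem` and `new_elem` are invented names, and the comparison is a simple field check rather than pandoc's Haskell-backed one) purely to show how an `__eq` metamethod changes the result of `==`.

```lua
-- Stand-in for an AST element type; real pandoc elements compare in Haskell.
local Elem = {}
Elem.__eq = function(a, b) return a.text == b.text end

function new_elem(text)
  return setmetatable({ text = text }, Elem)
end

-- Two distinct objects with an __eq metamethod: compared by value.
print(new_elem('hi') == new_elem('hi'))          --> true

-- Two distinct plain tables: no metamethod, so compared by identity.
print({ new_elem('hi') } == { new_elem('hi') })  --> false
```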
--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124
* Re: How to access Span elements with lua filter based on their content
From: krulis....-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org @ 2020-11-08 18:20 UTC (permalink / raw)
To: pandoc-discuss
Thank you for offering help, Mr. Krewinkel, but I should really first learn
the Org-mode features and functions. I am still looking at the quick-start
guide and learning what I actually **can** do with org-mode. I should
think about hacking and modifying Emacs afterwards.
About spans, I believe that their contents are not documented in as
concise a way as you have written now (at least not in the user guide or in any
filter documentation). Your explanation is perfectly understandable and
readable to me. Would it make sense to add this to the Lua filter
documentation?