From: Paul Potts <paul-2ivHbsYhlDHrG+TUHvIryNi2O/JbrIOy@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: Possible to use raw_attribute to insert an index entry?
Date: Tue, 2 Oct 2018 14:31:35 -0700 (PDT)
Message-ID: <07ccaf96-d95d-44dd-8dd3-74f84e258e6d@googlegroups.com>
In-Reply-To: <4cb35271-a7aa-4f41-8ee7-da2d35612e92-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
On Monday, October 1, 2018 at 6:54:27 PM UTC-4, bwhelm wrote:
>
> I have a pandoc filter that I use primarily for adding comments to
> documents, but I also use it for things like cross-references and index
> entries. I just played around with indexing in docx a bit to see if I can
> get it to work, and it *seems* to work OK on simple documents. You might
> check it out here:
>
> <https://github.com/bwhelm/Pandoc-Comment-Filter>
>
> The syntax is to add `[INDEX ENTRY]{.i}` to a markdown file, which will
> produce this xml that Word will recognize as an index entry:
>
> <w:r><w:fldChar w:fldCharType="begin" /></w:r><w:r><w:instrText
> xml:space="preserve"> XE "</w:instrText></w:r><w:r><w:instrText>INDEX
> ENTRY</w:instrText></w:r><w:r><w:instrText xml:space="preserve">"
> </w:instrText></w:r><w:r><w:fldChar w:fldCharType="end" /></w:r>
>
> I don't know if that's the right way to do it, and it won't handle fancier
> types of index entries that Word can do (with main- and sub-entries, with
> specially formatted page numbers, etc.), but it's at least a proof of
> concept. I'd welcome any improvements to the filter as a pull request.
>
> Incidentally, here's the relevant bits taken out of the luafilter:
>
> ~~~ {.lua}
> local DOCX_TEXT = {}
> DOCX_TEXT.i = {}
> DOCX_TEXT.i.Open = '<w:r><w:fldChar w:fldCharType="begin"/></w:r><w:r><w:instrText xml:space="preserve"> XE "</w:instrText></w:r><w:r><w:instrText>'
> DOCX_TEXT.i.Close = '</w:instrText></w:r><w:r><w:instrText xml:space="preserve">" </w:instrText></w:r><w:r><w:fldChar w:fldCharType="end"/></w:r>'
>
> function handleInlines(span)
>     local spanType = span.classes[1]
>     if spanType == "i" then
>         -- Process indexing ...
>         if FORMAT == 'docx' then
>             print(span.content)
>             return docx(
>                 DOCX_TEXT.i.Open ..
>                 pandoc.utils.stringify(span.content) ..
>                 DOCX_TEXT.i.Close)
>         else
>             return {}
>         end
>     end
> end
>
> local COMMENT_FILTER = {
>     {Span = handleInlines}
> }
>
> return COMMENT_FILTER
> ~~~
>
OK, I have made some progress, but I'm having difficulty with the Lua
filter. I've basically followed your example, but modified the code
to generate the simpler .odt index mark syntax. So I've got:
~~~
local random = math.random

-- Generate a short pseudo-random hex id to pair the start and end marks.
local function uuid()
    local template = 'xxxxxyyyyy'
    return string.gsub(template, '[xy]', function (c)
        local v = (c == 'x') and random(0, 0xf) or random(8, 0xb)
        return string.format('%x', v)
    end)
end

function handleInlines(span)
    local spanType = span.classes[1]
    if spanType == "i" then
        if FORMAT == 'odt' then
            local id_str = uuid()
            local open_str = '<text:alphabetical-index-mark-start text:id=\"' .. id_str .. '\" />'
            local close_str = '<text:alphabetical-index-mark-end text:id=\"' .. id_str .. '\" />'

            -- Approach 1: a hand-built table with a single field 'c'.
            local ret_element = {c = open_str ..
                pandoc.utils.stringify(span.content) .. close_str}
            print(ret_element)
            for k, v in pairs(ret_element) do
                print(k)
                print(v)
            end

            -- Approach 2: a pandoc.Str holding the same string.
            local p_ret_element = pandoc.Str(open_str ..
                pandoc.utils.stringify(span.content) .. close_str)
            print(p_ret_element)
            for k, v in pairs(p_ret_element) do
                print(k)
                print(v)
            end

            return { ret_element }
        else
            return {}
        end
    end
end

local COMMENT_FILTER = {
    {Span = handleInlines}
}

return COMMENT_FILTER
~~~
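A side note on the uuid() helper above: string.gsub returns two values, the substituted string and the number of matches, so the helper as written returns both. Assigning the result to a single local drops the count, but passing the call straight into another function would not. Here is a small sketch of a variant that returns only the string; the name hex_id is made up for illustration:

~~~ {.lua}
-- Return a 10-character lowercase hex id as a single value.
local function hex_id()
    local template = 'xxxxxyyyyy'
    -- The outer parentheses truncate gsub's (string, count) result
    -- to just the string.
    return (template:gsub('[xy]', function (c)
        local v = (c == 'x') and math.random(0, 0xf) or math.random(8, 0xb)
        return string.format('%x', v)
    end))
end

local id_str = hex_id()
print(id_str)
~~~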
In my Markdown source I've got `[Lego Ninjago]{.i}`.
I've tried two approaches to creating the return value of the handleInlines()
function.
If I use `return { p_ret_element }`, the filter runs, but the resulting .odt
span looks like this:
<text:span text:style-name="T1">&lt;text:alphabetical-index-mark-start
text:id=&quot;093c999bba&quot; /&gt;Lego
Ninjago&lt;text:alphabetical-index-mark-end text:id=&quot;093c999bba&quot;
/&gt;</text:span>
In other words, it's converting the angle brackets, quotation marks, and
similar characters to entities, while I want them to be literal.
So I'm wondering what needs to be in the returned list. Each element looks
like a table with one field named 'c'. But if I build my own table and put
it in the list as the return value, pandoc doesn't like what it gets back.
Here's the output (from PowerShell):
table: 0000000004373080
c
<text:alphabetical-index-mark-start text:id="093c999bba" />Lego
Ninjago<text:alphabetical-index-mark-end text:id="093c999bba" />
table: 0000000004373180
c
<text:alphabetical-index-mark-start text:id="093c999bba" />Lego
Ninjago<text:alphabetical-index-mark-end text:id="093c999bba" />
Error running filter index_entries_odt.lua:
Could not read list: Could not get Inline value: Expected a string but got
a nil
So Pandoc is expecting something that pandoc.Str produces, but I don't want
the functionality of pandoc.Str to be applied to the string I've assembled.
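One untested direction, going back to the raw_attribute idea in the subject line: return a raw inline instead of a Str, so the writer passes the XML through without escaping it. The sketch below assumes that pandoc's Lua API provides pandoc.RawInline(format, text) and that the ODT writer emits raw inlines whose format is 'opendocument' verbatim. The helpers xml_escape and index_mark_xml and the 'IMark' id scheme are made up for illustration.

~~~ {.lua}
-- Escape characters that are special in XML text and attribute values.
local function xml_escape(s)
    -- Parentheses drop gsub's second return value (the match count).
    return (s:gsub('[&<>"]', {
        ['&'] = '&amp;', ['<'] = '&lt;', ['>'] = '&gt;', ['"'] = '&quot;'
    }))
end

-- Hypothetical helper: wrap text in a start/end index mark pair
-- that shares one id.
local function index_mark_xml(id, text)
    return '<text:alphabetical-index-mark-start text:id="' .. id .. '" />'
        .. xml_escape(text)
        .. '<text:alphabetical-index-mark-end text:id="' .. id .. '" />'
end

local counter = 0

-- Defined as a global named after the element type, so no explicit
-- filter table has to be returned at the end of the file.
function Span(span)
    if span.classes[1] ~= 'i' then
        return nil      -- leave other spans untouched
    end
    if FORMAT ~= 'odt' then
        return {}       -- drop index entries in other formats
    end
    counter = counter + 1
    local id = 'IMark' .. counter   -- made-up id scheme
    local text = pandoc.utils.stringify(span.content)
    -- Assumption: 'opendocument' raw inlines are passed through unescaped.
    return pandoc.RawInline('opendocument', index_mark_xml(id, text))
end
~~~

The span's own text is escaped by hand here because a raw inline would presumably bypass the writer's escaping entirely, so an ampersand in an index entry would otherwise produce invalid XML.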
Any ideas? I hope that makes sense, and hope the formatting wasn't too
butchered to be readable.
Thanks,
Paul