From: Paul Potts <paul-2ivHbsYhlDHrG+TUHvIryNi2O/JbrIOy@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: Possible to use raw_attribute to insert an index entry?
Date: Tue, 2 Oct 2018 14:31:35 -0700 (PDT)
Message-ID: <07ccaf96-d95d-44dd-8dd3-74f84e258e6d@googlegroups.com>
In-Reply-To: <4cb35271-a7aa-4f41-8ee7-da2d35612e92-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>



On Monday, October 1, 2018 at 6:54:27 PM UTC-4, bwhelm wrote:
>
> I have a pandoc filter that I use primarily for adding comments to 
> documents, but I also use it for things like cross-references and index 
> entries. I just played around with indexing in docx a bit to see if I can 
> get it to work, and it *seems* to work OK on simple documents. You might 
> check it out here:
>
> <https://github.com/bwhelm/Pandoc-Comment-Filter>
>
> The syntax is to add `[INDEX ENTRY]{.i}` to a markdown file, which will 
> produce this xml that Word will recognize as an index entry:
>
> <w:r><w:fldChar w:fldCharType="begin" /></w:r><w:r><w:instrText 
> xml:space="preserve"> XE &quot;</w:instrText></w:r><w:r><w:instrText>INDEX 
> ENTRY</w:instrText></w:r><w:r><w:instrText xml:space="preserve">&quot; 
> </w:instrText></w:r><w:r><w:fldChar w:fldCharType="end" /></w:r>
>
> I don't know if that's the right way to do it, and it won't handle fancier 
> types of index entries that Word can do (with main- and sub-entries, with 
> specially formatted page numbers, etc.), but it's at least a proof of 
> concept. I'd welcome any improvements to the filter as a pull request.
>
> Incidentally, here's the relevant bits taken out of the luafilter:
>
> ~~~ {.lua}
> local DOCX_TEXT = {}
> DOCX_TEXT.i = {}
> DOCX_TEXT.i.Open = '<w:r><w:fldChar w:fldCharType="begin"/></w:r><w:r><w:instrText xml:space="preserve"> XE "</w:instrText></w:r><w:r><w:instrText>'
> DOCX_TEXT.i.Close = '</w:instrText></w:r><w:r><w:instrText xml:space="preserve">" </w:instrText></w:r><w:r><w:fldChar w:fldCharType="end"/></w:r>'
> function handleInlines(span)
>     local spanType = span.classes[1]
>     if spanType == "i" then
>         -- Process indexing ...
>         if FORMAT == 'docx' then
>             print(span.content)
>             return docx(
>                 DOCX_TEXT.i.Open ..
>                 pandoc.utils.stringify(span.content) ..
>                 DOCX_TEXT.i.Close)
>         else
>             return {}
>         end
>     end
> end
>
> local COMMENT_FILTER = {
>     {Span = handleInlines}
> }
>
> return COMMENT_FILTER
> ~~~
>

OK, I have made some progress, but I'm having difficulty with the Lua 
filter. I've basically followed your example, but modified the code 
to generate the simpler .odt index mark syntax. So I've got:

~~~
local random = math.random
local function uuid()
    local template = 'xxxxxyyyyy'
    -- Parentheses drop gsub's second return value (the match count).
    return (string.gsub(template, '[xy]', function (c)
        local v = (c == 'x') and random(0, 0xf) or random(8, 0xb)
        return string.format('%x', v)
    end))
end

function handleInlines(span)
    local spanType = span.classes[1]
    if spanType == "i" then
        if FORMAT == 'odt' then
            local id_str = uuid()
            local open_str = '<text:alphabetical-index-mark-start text:id=\"' .. id_str .. '\" />'
            local close_str = '<text:alphabetical-index-mark-end text:id=\"' .. id_str .. '\" />'
            local ret_element = {c = open_str .. pandoc.utils.stringify(span.content) .. close_str}
            print (ret_element)
            for k,v in pairs(ret_element) do
                print(k)
                print(v)
            end
            local p_ret_element = pandoc.Str(open_str .. pandoc.utils.stringify(span.content) .. close_str)
            print (p_ret_element)
            for k,v in pairs(p_ret_element) do
                print(k)
                print(v)
            end
            return { ret_element }
        else
            return {}
        end
    end
end

local COMMENT_FILTER = {
    {Span = handleInlines}
}

return COMMENT_FILTER
~~~

In my Markdown source I've got `[Lego Ninjago]{.i}`.

I've tried two approaches to create the return value of the handleInlines() 
function.

If I use `return { p_ret_element }` the filter runs, but the resulting .odt 
span looks like this:

    <text:span text:style-name="T1">&lt;text:alphabetical-index-mark-start text:id=&quot;093c999bba&quot; /&gt;Lego Ninjago&lt;text:alphabetical-index-mark-end text:id=&quot;093c999bba&quot; /&gt;</text:span>

In other words, it's converting the angle brackets, quotation marks, and 
similar characters to entities, while I want them to be literal.
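
To illustrate the escaping described above, here is a rough sketch in plain Lua. It is not pandoc's actual code; `escape_like_writer` is a hypothetical name, and the sketch only assumes that an XML-based writer replaces the usual special characters in plain text (such as the contents of a `pandoc.Str`) with entities.

```lua
-- Illustration only: approximates what an XML writer does to plain text.
-- escape_like_writer is a hypothetical helper, not a pandoc function.
local function escape_like_writer(s)
    local map = {
        ['&'] = '&amp;',
        ['<'] = '&lt;',
        ['>'] = '&gt;',
        ['"'] = '&quot;',
    }
    -- gsub looks each matched character up in the table; the outer
    -- parentheses drop gsub's second return value (the match count).
    return (s:gsub('[&<>"]', map))
end

print(escape_like_writer('<text:alphabetical-index-mark-start text:id="093c999bba" />'))
```

Any markup placed inside ordinary text gets this treatment, which is why the index marks arrive in the .odt file as visible text instead of as elements.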

So I'm wondering what needs to be in the returned list. Each element looks 
like a table with one field named 'c'. But if I make my own table and put 
it in the list as the return value, pandoc rejects what it gets back. 
Here's the output (from PowerShell):

table: 0000000004373080
c
<text:alphabetical-index-mark-start text:id="093c999bba" />Lego Ninjago<text:alphabetical-index-mark-end text:id="093c999bba" />
table: 0000000004373180
c
<text:alphabetical-index-mark-start text:id="093c999bba" />Lego Ninjago<text:alphabetical-index-mark-end text:id="093c999bba" />
Error running filter index_entries_odt.lua:
Could not read list: Could not get Inline value: Expected a string but got a nil

So Pandoc is expecting something that pandoc.Str produces, but I don't want 
the functionality of pandoc.Str to be applied to the string I've assembled. 
Any ideas? I hope that makes sense, and hope the formatting wasn't too 
butchered to be readable.
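
One direction that might be worth trying, sketched here under assumptions and not tested against the setup above: pandoc's Lua API provides `pandoc.RawInline(format, text)`, which is meant for content a writer should emit verbatim. The sketch assumes the ODT writer passes through raw inlines whose format is `'opendocument'`. The helpers `next_id`, `escape_xml`, and `make_marks` are hypothetical names, and a simple counter stands in for the random id.

```lua
-- Hypothetical helpers; none of these names come from pandoc itself.
local counter = 0
local function next_id()
    counter = counter + 1
    return string.format('idx%04d', counter)
end

-- Raw content is not escaped by the writer, so escape the entry text here.
local function escape_xml(s)
    local map = { ['&'] = '&amp;', ['<'] = '&lt;', ['>'] = '&gt;', ['"'] = '&quot;' }
    return (s:gsub('[&<>"]', map))
end

-- Wrap the entry text in a matching pair of index marks.
local function make_marks(id, text)
    return '<text:alphabetical-index-mark-start text:id="' .. id .. '" />' ..
        escape_xml(text) ..
        '<text:alphabetical-index-mark-end text:id="' .. id .. '" />'
end

-- Filter function: pandoc calls this for every Span when the file is
-- passed with --lua-filter.
function Span(span)
    if span.classes[1] ~= 'i' then
        return nil -- leave other spans unchanged
    end
    if FORMAT == 'odt' then
        local text = pandoc.utils.stringify(span.content)
        -- Assumption: the ODT writer emits 'opendocument' raw inlines verbatim.
        return pandoc.RawInline('opendocument', make_marks(next_id(), text))
    end
    return {} -- drop index entries for other output formats
end
```

The difference from the hand-built `{c = ...}` table is that the element is created through a pandoc constructor, so it carries whatever internal fields pandoc expects when it reads the value back.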

Thanks,

Paul

