public inbox archive for pandoc-discuss@googlegroups.com
From: Ismail Jattioui <ismail.jattioui1-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: Move TOC when converting html to docx
Date: Sun, 17 Jul 2022 23:33:09 -0700 (PDT)
Message-ID: <a9967f45-314e-484c-a642-ecb03c315e10n@googlegroups.com>
In-Reply-To: <88926968-1ca3-40c4-944f-c78e0554ba84n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>



Bumping this, please.

On Tuesday, 12 July 2022 at 16:32:43 UTC+2, Ismail Jattioui wrote:

> I tried this code, which looked like what I want to do, but unfortunately 
> it still doesn't work.
>
> There is apparently no RawBlock in the HTML I posted, and I don't see how 
> to add one.
>
> I tried matching on Para and Block instead, with no success. I got the 
> following error at the line marked with ----> below:
> PandocLuaError "Trying to set unavailable property text."
>
> The command I am using:
>
> pandoc --metadata toc-title=custom-toc --lua-filter=filter.lua 
> input-test.html -o res.docx
>
> The Lua filter I am trying:
>
> ------------------------------------------------------
> local RAW_TOC = [[
> <w:sdt>
> <w:sdtContent xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
> <w:p>
> <w:r>
> <w:fldChar w:fldCharType="begin" w:dirty="true" />
> <w:instrText xml:space="preserve">TOC \o "1-3" \h \z \u</w:instrText>
> <w:fldChar w:fldCharType="separate" />
> <w:fldChar w:fldCharType="end" />
> </w:r>
> </w:p>
> </w:sdtContent>
> </w:sdt>
> ]]
> local meta_key = "toc-title"
> local vars = {}
>
>
> local function getVars (meta)
>    for k, v in pairs(meta) do
>       if v.t == 'MetaInlines' then
>          print('isMetaInlines')
>          vars["$" .. k .. "$"] = { table.unpack(v) }
>       end
>    end
> end
>
> local function pageBreak(el)
>    if el.text == "pandoc-page-break" then
>       print('pageBreak')
>       return pandoc.Str ""
>    else
>       return el
>    end
> end
>
>
> local function toc(el)
>    print(el)
>    if pandoc.utils.stringify(el) ==  "pandoc-toc" then
>       ----> el.text = RAW_TOC
>       el.format = "openxml"
>       local para = pandoc.Para(vars)
>       local div = pandoc.Div({ para, el })
>       div["attr"]["attributes"]["custom-style"] = "TOC Heading"
>       return div
>    end
> end
>
> return {
>    { Meta = getVars },
>    { Str = pageBreak },
>    { RawBlock = toc }
> }
> ------------------------------------------------------
> On Monday, 11 July 2022 at 10:48:41 UTC+2, Ismail Jattioui wrote:
>
>> Hi,
>>
>> I am trying to convert an HTML file to docx using pandoc. My problem is 
>> that I can't manage to move the table of contents to a specific position 
>> in the document. I tried splitting my document in two and then merging 
>> the results, but that isn't optimal: we use this in production, it costs 
>> us two calls to pandoc, and it isn't very maintainable.
>>
>> I was wondering if there is a way to do this using Lua filters.
>>
>> In a nutshell, let's say I have the following HTML document that I wish 
>> to convert to DOCX:
>>
>> <!DOCTYPE html>
>> <html lang="en">
>>     <head>
>>         <meta charset="UTF-8" />
>>     </head>
>>     <h1>Title 1</h1>
>>     <p>Some stuff 2</p>
>>     <h2>Subtitle 1</h2>
>>     <p>Some stuff 2</p>
>>     <div>Other things</div>
>>     <div id="TOC">Insert TOC below</div>
>> </html>
>>
>> How can I generate a table of contents below the div with the id "TOC", 
>> without splitting the document?
>>
>> Thanks in advance
>>
>
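
For what it's worth, the error quoted above is consistent with how the filter is wired: the HTML reader produces no RawBlock for this input, so a RawBlock handler never fires, and a Para has no "text" property to assign to. One possible approach, sketched below and not tested against the real document, is to match the placeholder Div by its identifier and return it followed by a newly built RawBlock. It assumes the placeholder is the div with id "TOC" from the example HTML, reuses the OpenXML from the quoted filter, and relies on pandoc picking up a global function named Div as a filter.

```lua
-- Sketch only: assumes the placeholder is the <div id="TOC"> from the
-- example HTML and that the output format is docx.

-- Raw OpenXML for a TOC field, copied from the filter quoted above
-- (with the xmlns attribute kept on a single line).
local RAW_TOC = [[
<w:sdt>
<w:sdtContent xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:p>
<w:r>
<w:fldChar w:fldCharType="begin" w:dirty="true" />
<w:instrText xml:space="preserve">TOC \o "1-3" \h \z \u</w:instrText>
<w:fldChar w:fldCharType="separate" />
<w:fldChar w:fldCharType="end" />
</w:r>
</w:p>
</w:sdtContent>
</w:sdt>
]]

-- Pandoc calls a global function named after an element type for every
-- element of that type; here, every Div.
function Div(el)
  if el.identifier == "TOC" then
    -- Returning a list of blocks replaces the Div with those blocks:
    -- the original Div, followed by the raw TOC field.
    return { el, pandoc.RawBlock("openxml", RAW_TOC) }
  end
  -- Returning nil leaves every other Div unchanged.
end
```

Saved under a hypothetical name such as toc-div.lua, this would be run as "pandoc --lua-filter=toc-div.lua input-test.html -o res.docx", without --toc so that pandoc does not also emit its own table of contents at the start. The field is inserted empty and marked dirty, so the entries should only appear once Word updates the field when the file is opened.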
