public inbox archive for pandoc-discuss@googlegroups.com
 help / color / mirror / Atom feed
* docx writer: using styles from reference document
@ 2019-05-06 15:15 Alan McLachlan
       [not found] ` <0c37bc1d-ea10-4945-98a8-b169997ac437-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Alan McLachlan @ 2019-05-06 15:15 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 1456 bytes --]

Hi

What I'm looking to do is write a docx and style it using a reference 
document, but I want the generated docx to use styles from the reference 
document, not the default Word styles.
For example: pandoc today generates the paragraph text set to "Normal" 
style, I want it to use "My Style 1" instead of Normal.

I've done some research and tested different approaches but can't quite get 
it working.
Hand-crafting a reference.docx with the styles I need is not a great option 
because the reference docx I receive is outside my control and subject 
to change (thanks corporate marketing dept).

1. Any suggestions for tackling this?

2. Failing that, is a pull request supporting this something you'd be 
interested in? My last Haskell was 20 years ago at university, but the 
docx.hs code looks reasonably self-contained and maybe I can put something 
together :)

regards
Alan


-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/0c37bc1d-ea10-4945-98a8-b169997ac437%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

[-- Attachment #1.2: Type: text/html, Size: 2093 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: docx writer: using styles from reference document
       [not found] ` <0c37bc1d-ea10-4945-98a8-b169997ac437-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2019-05-06 15:36   ` Jesse Rosenthal
       [not found]     ` <87v9ynr3wc.fsf-4GNroTWusrE@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Jesse Rosenthal @ 2019-05-06 15:36 UTC (permalink / raw)
  To: Alan McLachlan, pandoc-discuss

Alan McLachlan <alan.mcl-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> pandoc today generates the paragraph text set to "Normal" 
> style, I want it to use "My Style 1" instead of Normal.

This is a job for custom-styles:
https://pandoc.org/MANUAL.html#custom-styles

So, one (not particularly user-friendly) way to do it would be to wrap
every paragraph in a div, to give it a custom style:

~~~
::: {custom-style="My Style 1"}
Here is a paragraph.
:::

::: {custom-style="My Style 1"}
Here is another.
:::
~~~

But you probably don't want to do that -- instead, it would be nice to
just write as normal:

~~~
Here is a paragraph.

Here is another.
~~~

and then have paragraphs converted on the fly. To do that, you'd use a
pandoc filter (https://pandoc.org/lua-filters.html):

~~~
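-- Wrap every paragraph in a Div carrying the custom style.
function Para(blk)
   local attr = pandoc.Attr()
   -- The docx writer applies the style named by "custom-style" to the Div's contents.
   attr.attributes["custom-style"] = "My Style 1"
   return pandoc.Div({blk}, attr)
end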
~~~

That will wrap every plain paragraph in a styled div. Save that in a file
(`styler.lua`), and then run it on the simple markdown file:

`pandoc input.md --lua-filter=styler.lua -o output.docx`



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: docx writer: using styles from reference document
       [not found]     ` <87v9ynr3wc.fsf-4GNroTWusrE@public.gmane.org>
@ 2019-05-06 15:41       ` Alan
       [not found]         ` <CABQ_dt8PD7jtvWt-8w92nLqK-hiusUVxR=P-JfnYEJembQ2XPQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Alan @ 2019-05-06 15:41 UTC (permalink / raw)
  To: Jesse Rosenthal; +Cc: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 2187 bytes --]

Thanks for the reply Jesse

I did check out the custom styles, but you are right that I'd rather not
embed that in every paragraph. I'm dealing with moderately large documents
and it would get old pretty fast.

I hadn't considered the lua filters yet. Good idea, I will give it a try.

regards
Alan


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: docx writer: using styles from reference document
       [not found]         ` <CABQ_dt8PD7jtvWt-8w92nLqK-hiusUVxR=P-JfnYEJembQ2XPQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-05-10 14:29           ` Alan
       [not found]             ` <CABQ_dt9Ee-2dtNgPm7D_jJxhReAzk1Gg2tjVvn2Jah96jSif4Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Alan @ 2019-05-10 14:29 UTC (permalink / raw)
  To: Jesse Rosenthal; +Cc: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 3116 bytes --]

Hi

The Lua filter approach is working reasonably well, but I have a couple more
in-depth questions. For context, my pandoc input format is docbook
generated from asciidoc.

1. Title, subtitle, author and other meta elements: how do I apply styles
to them?
 - I've tried wrapping them in Spans (inside a "function Pandoc(doc)") as
suggested by the custom-style docs, but they don't pick up any styles.
 - I tried turning them into Div/Para objects in the main body, but then
they appear after the generated TOC.
Basically I need to support setting up a cover page followed by a TOC.

2. Page breaks/hard breaks. They don't seem to be supported by the Pandoc
internal model. Any suggestions for getting around this?
Also related to the cover page need.

regards
Alan



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: docx writer: using styles from reference document
       [not found]             ` <CABQ_dt9Ee-2dtNgPm7D_jJxhReAzk1Gg2tjVvn2Jah96jSif4Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-05-20 13:06               ` Alan
       [not found]                 ` <CABQ_dt_VkYEL5ED8i8Wo7GUeNA-Kgwyzjpzw2V=vdWYt1+kCLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Alan @ 2019-05-20 13:06 UTC (permalink / raw)
  To: Jesse Rosenthal; +Cc: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 4101 bytes --]

Hi

I made some progress on this:

1. I got the cover page working by using the Lua filter to remove all the
metadata inlines and then reconstructing them as an "abstract" meta element.
Handily, the abstract holds Blocks rather than Inlines, so the custom styles
work, and the TOC appears after the abstract.

2. Hard breaks remain a problem though. I can work around this on the cover
page with some spacing, but that's not feasible for the rest of a typical
document.
Question: is the lack of support for hard breaks in Pandoc an intentional
position, or is it just a gap that the community would be interested in
filling if someone cared enough to implement it?

One new challenge:

3. Page numbers. Can I use the lua filter to insert a page number footer in
the generated docx?

regards
Alan



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: docx writer: using styles from reference document
       [not found]                 ` <CABQ_dt_VkYEL5ED8i8Wo7GUeNA-Kgwyzjpzw2V=vdWYt1+kCLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-05-20 13:09                   ` Brandon Keith Biggs
       [not found]                     ` <CAKAWQkXpaypSpUU62p=Fr_bhyLhLuAKMQ18MOy2TzTu6LmV3jg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2019-05-20 16:40                   ` John MacFarlane
  1 sibling, 1 reply; 13+ messages in thread
From: Brandon Keith Biggs @ 2019-05-20 13:09 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw; +Cc: Jesse Rosenthal

[-- Attachment #1: Type: text/plain, Size: 5346 bytes --]

Hello,
I would love a link in the documentation to a guide on using these filters,
because having a TOC, a title page, control over page numbers, and all that
is very common when using Word.
Thanks,

Brandon Keith Biggs <http://brandonkeithbiggs.com/>



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: docx writer: using styles from reference document
       [not found]                     ` <CAKAWQkXpaypSpUU62p=Fr_bhyLhLuAKMQ18MOy2TzTu6LmV3jg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-05-20 13:40                       ` Alan
  0 siblings, 0 replies; 13+ messages in thread
From: Alan @ 2019-05-20 13:40 UTC (permalink / raw)
  To: pandoc-discuss; +Cc: Jesse Rosenthal

[-- Attachment #1: Type: text/plain, Size: 6625 bytes --]

Hi Brandon

Jesse shared this with me earlier in the thread; it helped a lot:
https://pandoc.org/lua-filters.html

cheers
Alan


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: docx writer: using styles from reference document
       [not found]                 ` <CABQ_dt_VkYEL5ED8i8Wo7GUeNA-Kgwyzjpzw2V=vdWYt1+kCLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2019-05-20 13:09                   ` Brandon Keith Biggs
@ 2019-05-20 16:40                   ` John MacFarlane
       [not found]                     ` <yh480ksgt95bb0.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
  1 sibling, 1 reply; 13+ messages in thread
From: John MacFarlane @ 2019-05-20 16:40 UTC (permalink / raw)
  To: Alan, Jesse Rosenthal; +Cc: pandoc-discuss

Alan <alan.mcl-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:


> 2. Hard breaks remain a problem though. I can work around this on the cover
> page with some spacing, but that's not feasible for the rest of a typical
> document.
> Question: is the non-support hard breaks in Pandoc an intentional position,
> or is this just a gap that the community would be interested in filling if
> someone cared enough to implement it?

What do you mean by non-support of hard breaks?
We support hard line breaks, but not hard page breaks.
You should be able to insert page breaks using a lua
filter, though.  (You'd need to figure out exactly
what openxml code to insert as raw openxml.)

For some of the relevant discussion, see

https://github.com/jgm/pandoc/pull/805
https://github.com/jgm/pandoc/pull/3230
https://github.com/jgm/pandoc/issues/1934

It's not so much about putting in the time to
implement as making the decisions about how
to do so.

> 3. Page numbers. Can I use the lua filter to insert a page number footer in
> the generated docx?

You can put a footer in the reference.docx.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: docx writer: using styles from reference document
       [not found]                     ` <yh480ksgt95bb0.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
@ 2019-05-20 17:17                       ` Jesse Rosenthal
       [not found]                         ` <87y331f3jz.fsf-4GNroTWusrE@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Jesse Rosenthal @ 2019-05-20 17:17 UTC (permalink / raw)
  To: pandoc-discuss, Alan

John MacFarlane <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org> writes:

> You should be able to insert page breaks using a lua
> filter, though.  (You'd need to figure out exactly
> what openxml code to insert as raw openxml.)

The openxml is:

    <w:p><w:r><w:br w:type="page" /></w:r></w:p>

So you should be able to insert that as a RawBlock with format "openxml".
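
As a minimal sketch of that idea, a Lua filter could replace some marker
block with the raw openxml. Using the horizontal rule as the marker is an
assumption made here for illustration; any other block type would work the
same way.

```lua
-- Sketch: turn every horizontal rule into a Word page break.
-- Choosing HorizontalRule as the marker is an assumption for illustration.
local PAGE_BREAK_XML = '<w:p><w:r><w:br w:type="page" /></w:r></w:p>'

function HorizontalRule()
  -- Raw blocks are only emitted by writers that match their format,
  -- so this has no effect on non-docx output.
  return pandoc.RawBlock("openxml", PAGE_BREAK_XML)
end
```

Saved as, say, `pagebreak.lua` (a hypothetical name), it would be passed to
pandoc with `--lua-filter` like the styling filter earlier in the thread.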




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: docx writer: using styles from reference document
       [not found]                         ` <87y331f3jz.fsf-4GNroTWusrE@public.gmane.org>
@ 2019-05-21 14:26                           ` Alan
       [not found]                             ` <CABQ_dt-d1KyE2U4_Hgfbx4=2tMtUaHO-5SfrnqJs8HLFEtTG4w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Alan @ 2019-05-21 14:26 UTC (permalink / raw)
  To: Jesse Rosenthal; +Cc: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 1684 bytes --]

Thanks John, Jesse

Yes, I meant page breaks. Sorry for not being clear.
The raw openxml block approach works well, thanks for that. I need to mull
over whether to support a replacement element in the source document (e.g.
the horizontal rule) or just do something like add page breaks before
level-1 headers, but either way it should be doable.
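
The second option, page breaks before top-level headers, might be sketched
like this. It is untested against a real document and assumes that a break
before the first header is also wanted, e.g. after a cover page and TOC.

```lua
-- Sketch: start each top-level section on a new page.
local function page_break()
  return pandoc.RawBlock("openxml", '<w:p><w:r><w:br w:type="page" /></w:r></w:p>')
end

function Header(el)
  if el.level == 1 then
    -- Returning a list splices both blocks in place of the header.
    return { page_break(), el }
  end
  -- Returning nil leaves other headers unchanged.
  return nil
end
```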

Footers: my reference.docx has a footer containing two elements, an image
and a page number. The image makes it into the pandoc output, but the page
number text does not. I'm still tinkering with this to see if I can figure
out what's going wrong.

regards
Alan




-- 
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CABQ_dt-d1KyE2U4_Hgfbx4%3D2tMtUaHO-5SfrnqJs8HLFEtTG4w%40mail.gmail.com.

[-- Attachment #2: Type: text/html, Size: 2735 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: docx writer: using styles from reference document
       [not found]                             ` <CABQ_dt-d1KyE2U4_Hgfbx4=2tMtUaHO-5SfrnqJs8HLFEtTG4w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-05-23 12:49                               ` Agustín Martín
       [not found]                                 ` <52a0ab63-6bb8-4d35-9736-c6a654fc5982-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Agustín Martín @ 2019-05-23 12:49 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 3935 bytes --]

Hi Alan.

I have gone through some of the issues you're experiencing.

We also get an official "corporate" Word template, which unfortunately was
designed by graphic designers and not Word power users. As a result the
template is difficult to work with: it uses many custom styles in place of
the standard ones, like "This is my corporate title 3" (which is actually a
level 2 title...), and it suffers from bad choices like exact line spacing,
hard-to-change default fonts, and a messed-up navigation panel.

Since we only get new templates every 1-2 years, I replicate the look and
feel of the template in a sensible way (using standard style names, and
taking advantage of what Word offers). If I get flak for doing it, I can
always create aliases to the standard styles with the "new" names. In the
end I start with a much better working template that uses standard styles.

I try to leave page breaks to level 1 titles (easy to define in the style,
like you said). If I *really* need additional page breaks, I can always use
a lua filter with a specific code-word in my markdown, but that is usually
an indication that something is not quite as "clean" as it should be.
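
A minimal sketch of the code-word idea, written as a pandoc JSON filter in Python rather than Lua: a paragraph consisting only of the marker word is swapped for the raw openxml page break quoted earlier in the thread. The marker `PAGEBREAK` and the function names are assumptions for illustration, not anything pandoc defines.

```python
#!/usr/bin/env python3
"""Pandoc JSON filter sketch: turn a marker paragraph into a docx page break."""
import json
import sys

PAGE_BREAK_XML = '<w:p><w:r><w:br w:type="page" /></w:r></w:p>'
CODE_WORD = "PAGEBREAK"  # hypothetical marker; pick any word not used in real text


def is_marker(block):
    """True for a paragraph whose only content is the code word."""
    return (block.get("t") == "Para"
            and block.get("c") == [{"t": "Str", "c": CODE_WORD}])


def replace_markers(blocks):
    """Replace top-level marker paragraphs with a RawBlock of format "openxml"."""
    return [
        {"t": "RawBlock", "c": ["openxml", PAGE_BREAK_XML]} if is_marker(b) else b
        for b in blocks
    ]


def main():
    doc = json.load(sys.stdin)
    # pandoc passes the target format as the first argument to a JSON filter.
    if sys.argv[1] == "docx":
        doc["blocks"] = replace_markers(doc["blocks"])
    json.dump(doc, sys.stdout)


if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Only top-level blocks are inspected here; a marker nested inside a list or block quote would be left untouched, which keeps the sketch short.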

IMHO the hardest part to get right is the second page of the document, if
your template has specific items there such as a table with metadata from
the document. Especially if that has to come before the index. If you can
get by with having the index on the second page, you should be able to do
most of what you want anyway.

Knowing that you can include document properties in the header/footer of
your reference doc makes it really easy to customize the first page
(different from the rest) and the rest of the document with your title,
department, whatever-you-need. Page numbers definitely work!

Another thing that is not easily doable is a template with different
section formatting (like a last page without header/footer). I've only
successfully worked with one-section reference docs.

Good luck and BR,
  Agustín.



-- 
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/52a0ab63-6bb8-4d35-9736-c6a654fc5982%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 5448 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: docx writer: using styles from reference document
       [not found]                                 ` <52a0ab63-6bb8-4d35-9736-c6a654fc5982-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2019-05-27  6:06                                   ` Alan
       [not found]                                     ` <CABQ_dt9jbn1avtHeq3cJpCe3fGbfAWTJFRdG5oukGPKxRcPr7Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Alan @ 2019-05-27  6:06 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 5746 bytes --]

Hi Agustín, thanks for the input.

It sounds like your corporate dotx template is a little more wacky than
mine (except for the footer). Luckily I have no page 2 summary stuff to deal
with, just a standard auto-generated TOC, and no special section formatting.
I did look at hand-crafting an equivalent dotx, but in this case the
template gets a tweak every couple of months and I'd like other teams to use
the automation that I'm building. So a fairly seamless "just drop in the new
template" workflow is important.

Since my last mail I've worked around a few more issues using the RawBlock
approach that Jesse suggested. Page numbers in the footer are still not
working, but that has more to do with the mad jumble of a footer in my
source template.

cheers
Alan



-- 
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CABQ_dt9jbn1avtHeq3cJpCe3fGbfAWTJFRdG5oukGPKxRcPr7Q%40mail.gmail.com.

[-- Attachment #2: Type: text/html, Size: 7642 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: docx writer: using styles from reference document
       [not found]                                     ` <CABQ_dt9jbn1avtHeq3cJpCe3fGbfAWTJFRdG5oukGPKxRcPr7Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2023-06-16 18:55                                       ` Neil Piper
  0 siblings, 0 replies; 13+ messages in thread
From: Neil Piper @ 2023-06-16 18:55 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 7712 bytes --]

Hi all,

I may be having a problem similar to ones in earlier posts: my top-level
heading is not coming through with a heading style.

My conversion pipeline is:
Asciidoc -> Docbook (Asciidoctor)
Docbook -> DOCX (Pandoc)

This mostly works with a template I generated from the Pandoc reference
document, but my initial Docbook top-level heading / title comes out in DOCX
as 'Normal' (no style).

The next heading works fine and gets the level 1 'Heading 1' style in DOCX.

Do I need a Lua filter to work around this (seems like overkill), or is
there an alternative, such as passing the right arguments to Pandoc?

Command I'm running for Pandoc to DOCX conversion:

```
pandoc --from docbook --to docx --top-level-division=chapter \
  --extract-media="$(pwd)"/DIST/images/extracts \
  --output "$(pwd)"/DIST/docx/OUTPUT.docx \
  --reference-doc="$(pwd)"/src/main/docx/MyDocbookStyles.docx \
  "$(pwd)"/DIST/docbook/INPUT.xml
```

Samples:

Asciidoc sample code:
```
# PDF-Themes - Heading 1

 `font-family` OpenSans

 `font-style` Bold

## Heading 2
```
The Docbook generated from the Asciidoc looks like this (I have marked the
problematic area with comments; a screenshot is attached):
```
<?xml version="1.0" encoding="UTF-8"?>
<?asciidoc-toc?>
<?asciidoc-numbered?>
<article xmlns="http://docbook.org/ns/docbook" 
xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en">
<info>
<!-- This one just displays as 'Normal' Style , Why is L1 Heading not 
styled as Doc Title ?-->
<title>PDF-Themes - Heading 1</title>
<date>2023-06-15</date>
</info>
<literallayout class="monospaced">`font-family` OpenSans</literallayout>
<literallayout class="monospaced">`font-style` Bold</literallayout>
<!-- This one is styled to my DOCX 'Heading 1' Style -->
<section xml:id="_heading_2">
<title>Heading 2</title>
...
</article>
```

Word Docx - Heading 1 is Style 'Normal'
[image: Screenshot 2023-06-16 at 19.51.22.png]


-- 
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/624c1ba1-059d-4f51-8eda-a0cad3c48246n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 11075 bytes --]

[-- Attachment #2: Screenshot 2023-06-16 at 19.51.22.png --]
[-- Type: image/png, Size: 242396 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-06-16 18:55 UTC | newest]

Thread overview: 13+ messages
2019-05-06 15:15 docx writer: using styles from reference document Alan McLachlan
     [not found] ` <0c37bc1d-ea10-4945-98a8-b169997ac437-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2019-05-06 15:36   ` Jesse Rosenthal
     [not found]     ` <87v9ynr3wc.fsf-4GNroTWusrE@public.gmane.org>
2019-05-06 15:41       ` Alan
     [not found]         ` <CABQ_dt8PD7jtvWt-8w92nLqK-hiusUVxR=P-JfnYEJembQ2XPQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-05-10 14:29           ` Alan
     [not found]             ` <CABQ_dt9Ee-2dtNgPm7D_jJxhReAzk1Gg2tjVvn2Jah96jSif4Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-05-20 13:06               ` Alan
     [not found]                 ` <CABQ_dt_VkYEL5ED8i8Wo7GUeNA-Kgwyzjpzw2V=vdWYt1+kCLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-05-20 13:09                   ` Brandon Keith Biggs
     [not found]                     ` <CAKAWQkXpaypSpUU62p=Fr_bhyLhLuAKMQ18MOy2TzTu6LmV3jg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-05-20 13:40                       ` Alan
2019-05-20 16:40                   ` John MacFarlane
     [not found]                     ` <yh480ksgt95bb0.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
2019-05-20 17:17                       ` Jesse Rosenthal
     [not found]                         ` <87y331f3jz.fsf-4GNroTWusrE@public.gmane.org>
2019-05-21 14:26                           ` Alan
     [not found]                             ` <CABQ_dt-d1KyE2U4_Hgfbx4=2tMtUaHO-5SfrnqJs8HLFEtTG4w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-05-23 12:49                               ` Agustín Martín
     [not found]                                 ` <52a0ab63-6bb8-4d35-9736-c6a654fc5982-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2019-05-27  6:06                                   ` Alan
     [not found]                                     ` <CABQ_dt9jbn1avtHeq3cJpCe3fGbfAWTJFRdG5oukGPKxRcPr7Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2023-06-16 18:55                                       ` Neil Piper
