public inbox archive for pandoc-discuss@googlegroups.com
* Pandoc Filters
@ 2018-08-13 15:25 Hanz Husseiner
       [not found] ` <385b78c1-2518-48d8-b24d-5a6fa648b75e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Hanz Husseiner @ 2018-08-13 15:25 UTC (permalink / raw)
  To: pandoc-discuss



We are using Pandoc to convert Word documents (.DOCX) to reStructuredText 
(.RST). Some of our Word documents contain custom styles that we would like 
converted to an equivalent in RST. 

Example: 
 
• Wherever the custom style Note is used in the DOCX convert to .. note:: 
in RST.

• Wherever the custom style Warning is used in the DOCX convert to .. 
warning:: in RST.

• Wherever the custom style Code-Block is used in the DOCX convert to .. 
code-block:: in RST.

Is it possible to use Pandoc Filters <https://pandoc.org/filters.html> to 
do this and if so how would we go about creating such a filter?

Thank you

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/385b78c1-2518-48d8-b24d-5a6fa648b75e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: Pandoc Filters
       [not found] ` <385b78c1-2518-48d8-b24d-5a6fa648b75e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2018-08-13 15:29   ` Hanz Husseiner
  2018-08-13 17:13   ` John MacFarlane
  1 sibling, 0 replies; 6+ messages in thread
From: Hanz Husseiner @ 2018-08-13 15:29 UTC (permalink / raw)
  To: pandoc-discuss



I have attached a sample .docx file to clarify the question in my post. Thank 
you.


[-- Attachment #2: Custom-Styles.docx --]
[-- Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document, Size: 14710 bytes --]


* Re: Pandoc Filters
       [not found] ` <385b78c1-2518-48d8-b24d-5a6fa648b75e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2018-08-13 15:29   ` Hanz Husseiner
@ 2018-08-13 17:13   ` John MacFarlane
       [not found]     ` <yh480kd0ump4ua.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
  1 sibling, 1 reply; 6+ messages in thread
From: John MacFarlane @ 2018-08-13 17:13 UTC (permalink / raw)
  To: Hanz Husseiner, pandoc-discuss


There are two parts to this.  First, parsing the docx
and preserving the custom styles.  That can be done
with

    pandoc -f docx+styles

which will give you an AST like

    Div ("",[],[("custom-style","Note")])
     [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "note."]]
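
To see which custom-style values a given document actually carries
before writing any mapping, a small Lua filter can log them.  This is
only a sketch: the file name list-styles.lua and the helpers `seen` and
`record_style` are made up for illustration, and it looks only at Div
elements.

```lua
-- list-styles.lua (hypothetical name): report the custom styles found on Divs.
seen = {}

-- Record a style name once; returns true only the first time it is seen.
function record_style(style)
  if style == nil or seen[style] then
    return false
  end
  seen[style] = true
  return true
end

function Div(el)
  local style = el.attributes['custom-style']
  if record_style(style) then
    io.stderr:write('custom-style: ' .. style .. '\n')
  end
  return nil  -- nil leaves the element unchanged
end
```

Running it with something like
`pandoc -f docx+styles -t native --lua-filter list-styles.lua Custom-Styles.docx`
should print each style name once on stderr and leave the output as it was.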

The second part is rendering these special divs in RST as
admonitions.  Currently this would be tricky, though
you could do it in a filter.  You'd have to have the
filter render the contents of the div as RST (Lua
filters expose functions you can use for this), then
indent the result, prepend ".. note::", and
include it as a RawBlock with format "rst".
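
A rough sketch of that raw-block approach follows.  It assumes a recent
pandoc in which Lua filters provide `pandoc.write` (2.17 or later, as far
as I know); the helper name `as_directive` and the choice to handle only
the Note and Warning styles are illustrative assumptions.

```lua
-- Build an RST directive from already-rendered RST text:
-- indent every non-empty line by three spaces and put ".. name::" in front.
function as_directive(name, body)
  local indented = body:gsub('[^\n]+', '   %0')
  return '.. ' .. name .. '::\n\n' .. indented
end

function Div(el)
  local style = el.attributes['custom-style']
  if style ~= 'Note' and style ~= 'Warning' then
    return nil  -- not one of the styles handled here; leave the Div alone
  end
  -- Render the Div's contents as RST, then wrap them in the directive.
  local body = pandoc.write(pandoc.Pandoc(el.content), 'rst')
  return pandoc.RawBlock('rst', as_directive(style:lower(), body))
end
```

Because the result is a raw block, it is only emitted when the output
format is RST; other writers will drop it.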

I think the RST writer should be changed to make this
easier.  Currently the RST reader parses

    .. warning::

       Hi

as

    [Div ("",["warning"],[])
     [Div ("",["admonition-title"],[])
      [Para [Str "Warning"]]
     ,Para [Str "Hi"]]]

But the writer does not render the same structure back
to

    .. warning::

       Hi

If it did, then all you'd have to do in a filter is
change

    Div ("",[],[("custom-style","Note")])
     [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "note."]]

into

    Div ("",["note"],[])
     [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "note."]]

And this would be a trivial job for a filter.

I've added an issue for the writer: https://github.com/jgm/pandoc/issues/4833



* Re: Pandoc Filters
       [not found]     ` <yh480kd0ump4ua.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
@ 2018-08-13 18:18       ` John MacFarlane
       [not found]         ` <yh480ka7pqp1tp.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: John MacFarlane @ 2018-08-13 18:18 UTC (permalink / raw)
  To: Hanz Husseiner, pandoc-discuss


OK, I've made the needed changes to the RST writer.
If you compile pandoc from source, you can use them.
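Then a simple filter like

% cat style-to-class.lua 
function Div(el)
    -- Divs without a custom-style attribute pass through unchanged.
    local style = el.attributes['custom-style']
    if style then
        el.classes = {style:lower()}
    end
    return el
end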

might be all you need.


* Re: Pandoc Filters
       [not found]         ` <yh480ka7pqp1tp.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
@ 2018-08-15 14:49           ` Hanz Husseiner
       [not found]             ` <2326165f-8429-46be-a113-60a2ad4b871d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Hanz Husseiner @ 2018-08-15 14:49 UTC (permalink / raw)
  To: pandoc-discuss



Thank you so much for your help. I am sorry for asking so many questions, 
as I am very new to both Python and Pandoc.

Where should I add the class you suggested?

% cat style-to-class.lua 
function Div(el) 
    el.classes = {el.attributes['custom-style']:lower()} 
    return el 
end 

Thank you



* Re: Pandoc Filters
       [not found]             ` <2326165f-8429-46be-a113-60a2ad4b871d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2018-08-15 16:51               ` John MacFarlane
  0 siblings, 0 replies; 6+ messages in thread
From: John MacFarlane @ 2018-08-15 16:51 UTC (permalink / raw)
  To: Hanz Husseiner, pandoc-discuss

Hanz Husseiner <hanzhanseller-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Thank you so much for your help. I am sorry for asking so many questions, 
> as I am very new to both Python and Pandoc.
>
> Where should I add the class you suggested?

If the class portion of the attributes matches a
standard RST admonition name (e.g. warning) it will be used.

So, setting

el.classes = {'warning'}

for example, will make it a warning.  You could have
a conditional like:

if el.attributes['custom-style'] == 'Warning' then
  el.classes = {'warning'}
end

and so on.
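
If there are several styles, the chain of conditionals can be collapsed
into a lookup table.  A minimal sketch, assuming the Word style names
are exactly "Note" and "Warning"; the table name `admonition_for` is
made up for illustration:

```lua
-- style-to-class.lua, table-driven variant (a sketch).
-- Maps Word custom-style names to RST admonition class names.
admonition_for = {
  Note = 'note',
  Warning = 'warning',
}

function Div(el)
  -- Look up the Div's custom style; unknown or missing styles give nil.
  local class = admonition_for[el.attributes['custom-style']]
  if class then
    el.classes = {class}
  end
  return el
end
```

Save this in a file next to your document and pass it on the command
line, e.g. `pandoc -f docx+styles -t rst --lua-filter style-to-class.lua
Custom-Styles.docx`.  Divs whose style is not in the table are left as
they are.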

    

