* Pandoc Filters
From: Hanz Husseiner @ 2018-08-13 15:25 UTC (permalink / raw)
To: pandoc-discuss
We are using Pandoc to convert Word documents (.DOCX) to reStructuredText
(.RST). Some of our Word documents contain custom styles that we would like
converted to an equivalent in RST.
Example:
• Wherever the custom style Note is used in the DOCX, convert it to ".. note::" in RST.
• Wherever the custom style Warning is used in the DOCX, convert it to ".. warning::" in RST.
• Wherever the custom style Code-Block is used in the DOCX, convert it to ".. code-block::" in RST.
Is it possible to use Pandoc Filters <https://pandoc.org/filters.html> to
do this and if so how would we go about creating such a filter?
Thank you
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/385b78c1-2518-48d8-b24d-5a6fa648b75e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
* Re: Pandoc Filters
From: Hanz Husseiner @ 2018-08-13 15:29 UTC (permalink / raw)
To: pandoc-discuss
I have attached a sample DOCX file to clarify the question in my post. Thank
you.
[-- Attachment #2: Custom-Styles.docx --]
[-- Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document, Size: 14710 bytes --]
* Re: Pandoc Filters
From: John MacFarlane @ 2018-08-13 17:13 UTC (permalink / raw)
To: Hanz Husseiner, pandoc-discuss
There are two parts to this. The first is parsing the docx
and preserving the custom styles. That can be done
with

    pandoc -f docx+styles

which will give you an AST like

    Div ("",[],[("custom-style","Note")])
      [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "note."]]

The second part is rendering these special divs in RST as
admonitions. Currently this would be tricky, though
you could do it in a filter. You'd have to have the
filter render the contents of the div as RST (Lua
filters expose functions you can use for this), then
indent this, prepend ".. note::", and
include the result as a RawBlock with format "rst".
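
The manual approach described above could be sketched roughly as follows. This is only a sketch under stated assumptions: it relies on pandoc.write, which is available to Lua filters only in pandoc 2.17 and later (so not in the pandoc current when this thread was written); the helper name indent, the directives table, and the three-space indentation are illustrative choices, not something taken from the thread.

```lua
-- Sketch: turn Divs with custom-style "Note" or "Warning" into raw RST admonitions.
-- Assumes pandoc >= 2.17 (for pandoc.write) and input read with -f docx+styles.

-- Hypothetical helper: indent every non-empty line of `text` by `n` spaces.
local function indent(text, n)
  local pad = string.rep(' ', n)
  local out = {}
  for line in (text .. '\n'):gmatch('(.-)\n') do
    out[#out + 1] = (line == '') and '' or (pad .. line)
  end
  return table.concat(out, '\n')
end

-- Styles handled by this sketch, mapped to RST directive names (an assumption).
local directives = { Note = 'note', Warning = 'warning' }

function Div(el)
  local directive = directives[el.attributes['custom-style']]
  if not directive then
    return nil -- returning nil leaves other Divs unchanged
  end
  -- Render the Div's contents as RST and drop trailing whitespace.
  local body = pandoc.write(pandoc.Pandoc(el.content), 'rst')
  body = body:gsub('%s+$', '')
  -- Put the directive line in front of the indented body.
  local rst = '.. ' .. directive .. '::\n\n' .. indent(body, 3)
  return pandoc.RawBlock('rst', rst)
end
```

Because the result is a RawBlock, the RST writer passes it through verbatim, so any escaping or nesting inside the admonition is whatever the inner pandoc.write call produced.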
I think the RST writer should be changed to make this
easier. Currently the RST reader parses

    .. warning::

       Hi

as

    [Div ("",["warning"],[])
      [Div ("",["admonition-title"],[])
        [Para [Str "Warning"]]
      ,Para [Str "Hi"]]]

But the writer does not render the same structure back
to

    .. warning::

       Hi

If it did, then all you'd have to do in a filter is
change

    Div ("",[],[("custom-style","Note")])
      [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "note."]]

into

    Div ("",["note"],[])
      [Para [Str "This",Space,Str "is",Space,Str "a",Space,Str "note."]]

And this would be a trivial job for a filter.
I've added an issue for the writer: https://github.com/jgm/pandoc/issues/4833
* Re: Pandoc Filters
From: John MacFarlane @ 2018-08-13 18:18 UTC (permalink / raw)
To: Hanz Husseiner, pandoc-discuss
OK, I've made the needed changes to the RST writer.
If you compile pandoc from source, you can use them.
Then a simple filter like

    % cat style-to-class.lua
    function Div(el)
      -- Only touch Divs that carry a custom style; others have no such attribute.
      local style = el.attributes['custom-style']
      if style then
        el.classes = {style:lower()}
      end
      return el
    end

might be all you need.
* Re: Pandoc Filters
From: Hanz Husseiner @ 2018-08-15 14:49 UTC (permalink / raw)
To: pandoc-discuss
Thank you so much for your help. I am sorry for asking so many questions,
as I am very new to both Python and Pandoc.

Where should I add the class you suggested?

    % cat style-to-class.lua
    function Div(el)
      el.classes = {el.attributes['custom-style']:lower()}
      return el
    end

Thank you
* Re: Pandoc Filters
From: John MacFarlane @ 2018-08-15 16:51 UTC (permalink / raw)
To: Hanz Husseiner, pandoc-discuss
Hanz Husseiner <hanzhanseller-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> Thank you so much for your help. I am sorry for asking so many questions,
> as I am very new to both Python and Pandoc.
>
> Where should I add the class you suggested?
If the class portion of the attributes matches a
standard RST admonition name (e.g. warning), it will be used.
So setting

    el.classes = {'warning'}

will, for example, make it a warning. You could have
a conditional like:

    if el.attributes['custom-style'] == 'Warning' then
      el.classes = {'warning'}
    end

and so on.
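
Rather than one conditional per style, the same idea can be written as a lookup table. This is a minimal sketch, assuming the custom style names are exactly "Note" and "Warning" as in the original question; the table name style_to_class is made up for illustration, and Divs with any other style (or none) are returned unchanged.

```lua
-- Sketch: map Word custom styles to RST admonition class names.
-- The table contents are an assumption based on the style names in this thread.
local style_to_class = {
  Note = 'note',
  Warning = 'warning',
}

function Div(el)
  -- Look up the admonition class for this Div's custom style, if any.
  local class = style_to_class[el.attributes['custom-style']]
  if class then
    el.classes = {class}
  end
  return el
end
```

Saved as style-to-class.lua, a filter like this would typically be applied with something along the lines of "pandoc -f docx+styles -t rst --lua-filter style-to-class.lua input.docx -o output.rst", where the input and output file names are placeholders. The Code-Block style is left out on purpose: a code block is not an admonition, so it would need separate handling.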