public inbox archive for pandoc-discuss@googlegroups.com
* docx -> markdown & Citavi Content Control
@ 2019-02-12 16:31 Nyoman Bennyamino
       [not found] ` <87d9e5ab-3b83-46cf-a538-a6f2308454d1-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Nyoman Bennyamino @ 2019-02-12 16:31 UTC (permalink / raw)
  To: pandoc-discuss


Hello,

I'd like to use pandoc to convert an MS Word 365 (*.docx) file to markdown. 
My reference management program (Citavi) uses Word Content Control 
fields to inject references into footnotes. Unfortunately, pandoc 
omits all of these references instead of at least converting them 
into plain text.

For example:

One of my footnotes is: "92. See Author, New York 2018, p. 18"

The source code for this footnote (from Word's footnotes.xml file):

<w:footnote w:id="92"><w:p w14:paraId="315F39A7" w14:textId="2167C15F" 
w:rsidR="000B6667" w:rsidRDefault="000B6667" 
w:rsidP="00F42B65"><w:pPr><w:pStyle w:val="Funoten"/></w:pPr><w:r 
w:rsidRPr="002C10AE"><w:rPr><w:rStyle 
w:val="Funotenzeichen"/><w:b/><w:vertAlign 
w:val="baseline"/></w:rPr><w:footnoteRef/></w:r><w:r><w:t 
xml:space="preserve"> 
</w:t></w:r><w:r><w:tab/></w:r><w:sdt><w:sdtPr><w:alias w:val="Don't edit 
this field"/><w:tag 
w:val="CitaviPlaceholder#ecdec0e6-4e94-491b-a6d1-74308e00ccd9"/><w:id 
w:val="827022648"/><w:placeholder><w:docPart 
w:val="B8F488EF9BC34FE5B5D83FE3A9CB28BD"/></w:placeholder></w:sdtPr><w:sdtContent><w:r><w:fldChar 
w:fldCharType="begin"/></w:r><w:r><w:instrText>ADDIN 
CitaviPlaceholder{ey....AifQ==}</w:instrText></w:r><w:r><w:fldChar 
w:fldCharType="separate"/></w:r><w:r w:rsidR="00BB051A"><w:t 
xml:space="preserve">*See* </w:t></w:r><w:r w:rsidR="00BB051A" 
w:rsidRPr="00BB051A"><w:rPr><w:smallCaps/></w:rPr><w:t>Author</w:t></w:r><w:r 
w:rsidR="00BB051A" w:rsidRPr="00BB051A"><w:t>*, **New York 2018*</w:t></w:r><w:r 
w:rsidR="00BB051A" w:rsidRPr="00BB051A"><w:rPr><w:vertAlign 
w:val="superscript"/></w:rPr><w:t>81</w:t></w:r><w:r w:rsidR="00BB051A" 
w:rsidRPr="00BB051A"><w:t>*, p. 18.*</w:t></w:r><w:r><w:fldChar 
w:fldCharType="end"/></w:r></w:sdtContent></w:sdt></w:p></w:footnote>
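
For anyone who wants to inspect such a footnote themselves, here is a minimal sketch of how the text could be read straight out of the .docx archive. It assumes the footnotes are stored at the conventional path word/footnotes.xml inside the zip; the function name footnote_text is made up for illustration and has nothing to do with pandoc's own reader.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the w: prefix.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def footnote_text(docx, footnote_id):
    """Return the concatenated w:t text of one footnote, or None if absent.

    `docx` is a path or a binary file-like object. The part name
    "word/footnotes.xml" is an assumption (it is what Word normally writes).
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/footnotes.xml"))
    for note in root.iter(f"{{{W}}}footnote"):
        if note.get(f"{{{W}}}id") == str(footnote_id):
            # iter() descends into w:sdt / w:sdtContent as well, so text
            # inside content controls is included.
            return "".join(t.text or "" for t in note.iter(f"{{{W}}}t"))
    return None
```

Calling footnote_text("test.docx", 92) on the file above should return the visible reference text if it is stored in plain w:t elements, as the XML suggests.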

Output after converting:

pandoc -f docx "test.docx" -w markdown_strict --reference-location "block"

"92. "


Any ideas how to fix this?


Thanks,

Nyoman

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/87d9e5ab-3b83-46cf-a538-a6f2308454d1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: docx -> markdown & Citavi Content Control
       [not found] ` <87d9e5ab-3b83-46cf-a538-a6f2308454d1-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2019-02-12 17:14   ` John MacFarlane
  2019-02-12 17:36   ` Jesse Rosenthal
  1 sibling, 0 replies; 4+ messages in thread
From: John MacFarlane @ 2019-02-12 17:14 UTC (permalink / raw)
  To: Nyoman Bennyamino, pandoc-discuss


You should submit an issue on our GitHub bug tracker.  Maybe @jkr
would be able to take a look.



* Re: docx -> markdown & Citavi Content Control
       [not found] ` <87d9e5ab-3b83-46cf-a538-a6f2308454d1-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2019-02-12 17:14   ` John MacFarlane
@ 2019-02-12 17:36   ` Jesse Rosenthal
       [not found]     ` <87o97gvrgw.fsf-4GNroTWusrE@public.gmane.org>
  1 sibling, 1 reply; 4+ messages in thread
From: Jesse Rosenthal @ 2019-02-12 17:36 UTC (permalink / raw)
  To: Nyoman Bennyamino, pandoc-discuss

The instructions are in `instrText`, between the `fldChar` markers, which
we do support (at least as far as parsing text). So this looks like a bug.
It would be great if you could submit it to the GitHub issue tracker.
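
To illustrate the field structure in question, here is a rough sketch of how the runs of a paragraph can be split into field instructions and visible result text. It is not pandoc's implementation (pandoc is written in Haskell); the function name field_parts and the sample placeholder text are hypothetical, and nested fields are not handled.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the w: prefix.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def field_parts(paragraph):
    """Return (instructions, visible_text) for one w:p element.

    A complex field is laid out as: fldChar "begin", instrText runs,
    fldChar "separate", result runs, fldChar "end".
    """
    instructions, visible = [], []
    state = "text"  # "instr" after begin, "result" after separate
    for run in paragraph.iter(f"{{{W}}}r"):
        for child in run:
            if child.tag == f"{{{W}}}fldChar":
                kind = child.get(f"{{{W}}}fldCharType")
                state = {"begin": "instr",
                         "separate": "result",
                         "end": "text"}.get(kind, state)
            elif child.tag == f"{{{W}}}instrText" and state == "instr":
                instructions.append(child.text or "")
            elif child.tag == f"{{{W}}}t" and state != "instr":
                # Ordinary text and field results are both visible.
                visible.append(child.text or "")
    return "".join(instructions), "".join(visible)


# Hypothetical, shortened stand-in for the footnote in the report.
SAMPLE = f"""
<w:p xmlns:w="{W}">
  <w:r><w:fldChar w:fldCharType="begin"/></w:r>
  <w:r><w:instrText>ADDIN HypotheticalPlaceholder</w:instrText></w:r>
  <w:r><w:fldChar w:fldCharType="separate"/></w:r>
  <w:r><w:t xml:space="preserve">See </w:t></w:r>
  <w:r><w:t>Author, p. 18.</w:t></w:r>
  <w:r><w:fldChar w:fldCharType="end"/></w:r>
</w:p>
"""

instr, text = field_parts(ET.fromstring(SAMPLE))
```

The point is that the runs after the "separate" marker carry the text a reader sees, so a converter that cannot interpret the instruction can still fall back to that text.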

When you do submit a bug report, please make sure to note what version of
pandoc you're using (support for this was only added last year), and
attach a copy of your input docx.

--Jesse


* Re: docx -> markdown & Citavi Content Control
       [not found]     ` <87o97gvrgw.fsf-4GNroTWusrE@public.gmane.org>
@ 2019-02-12 18:47       ` Nyoman Bennyamino
  0 siblings, 0 replies; 4+ messages in thread
From: Nyoman Bennyamino @ 2019-02-12 18:47 UTC (permalink / raw)
  To: pandoc-discuss


Thank you all very much!

I've submitted a bug report <https://github.com/jgm/pandoc/issues/5302> on 
GitHub.

Cheers!

