Let me explain my understanding of how MS Word works.

A Word document has no internal structure that ties a caption to an image, so Word cannot tell whether the text under an image is a caption or regular text. In Word, a caption is not "inside" the image block and is in no way connected to the image structurally; it just follows the image. A caption is just a paragraph with a style applied. A caption for an image can sit under the image and above a table, or vice versa.

Pandoc has an internal caption field in an image object, but that nesting is lost when the object is expanded into Word XML, where it becomes two sequential paragraphs.

So, if you want to turn Word "captions" back into objects with captions, you will have to walk the pandoc element tree with your own filter and merge the adjacent paragraphs into objects with captions.

On Sat, Aug 15, 2020, 12:26, Philipp Zumstein wrote:

> Maybe the problem is larger. Let me try to explain what I found out:
>
> I used a test DOCX from the repo with an image:
> https://github.com/jgm/pandoc/blob/master/test/docx/golden/image.docx
>
> 1) DOCX -> MD: Besides the caption in the square brackets (alt text), I
> also see an extra line following the image with the caption text.
> 2) DOCX -> MD -> PDF: In the PDF output the images are in a figure float
> and have a caption with the label "Figure" and automatic numbering,
> which is what I want. But each caption occurs additionally in a separate
> line in the text, which I don't want. This is a follow-up problem of what I
> describe under 1).
> 3) DOCX -> LATEX/PDF: The images are not in any figure float, and the
> caption text is just the next line and can therefore be split from the
> image. That is not what I want.
>
> Isn't this a general problem with how images with captions are transformed
> by pandoc?
>
> I use workflow 2), but currently have to manually delete the extra lines
> in the MD document or copy them into the brackets.
>
>
> On Fri, Aug 14, 2020 at 23:20, Philipp Zumstein <
> zuphilip-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> Okay, it works for you w/o problems. Do I guess correctly that you have
>> Word in an English localization? If I open the different parts of
>> the Word document, I see in document.xml that the caption is saved
>> in an XML tag of the form
>> ```
>>
>> ```
>> Is this handled in the DOCX reader? Can you point me to the place in the
>> code of the docx reader that is responsible for reading the image caption?
>>
>> Oh, the thing in the curly braces is only the id of the image, so that
>> you can point to it like [see](#image1). But that is negligible for my
>> problem here.
>>
>> Best regards,
>> Philipp
>>
>> On Fri, Aug 14, 2020 at 20:16, Leonard Rosenthol <
>> leonardr-bM6h3K5UM15l57MIdRCFDg@public.gmane.org> wrote:
>>
>>> The Image1 doesn't go into the DocX file - but the title (Abb. 1) does
>>> as the caption.
>>>
>>> And going back to markdown, it comes back in the right spot.
>>>
>>> What are you trying to do with the {#image1}?
>>>
>>> Leonard
>>>
>>>
>>> On Thu, Aug 13, 2020 at 4:59 PM Denis Maier <
>>> denis.maier.lists-cl+VPiYnx/1AfugRpC6u6w@public.gmane.org> wrote:
>>>
>>>> Just for the record: I've just tried, and roundtripping doesn't work.
>>>>
>>>> That's the input document:
>>>>
>>>> ```
>>>>
>>>> hallo.
>>>>
>>>> ![Abb. 1: title](texworks.png){ #image1 }
>>>> ```
>>>>
>>>> Converting to docx produces an image with a caption (style is "image
>>>> caption"). Converting the untouched document back to md gives me:
>>>>
>>>> ```
>>>> hallo.
>>>>
>>>> ![Abb. 1: title](media/rId20.png){width="3.5555555555555554in"
>>>> height="3.5555555555555554in"}
>>>>
>>>> Abb. 1: title
>>>> ```
>>>>
>>>> But I also have a German localized Word...
>>>> Some time ago there was an issue that styles weren't picked up properly
>>>> if localized styles were used.
>>>> But that doesn't seem to be the case here, as
>>>> I have not saved the docx with Word. The styles as produced by pandoc
>>>> should still be there.
>>>>
>>>> Best,
>>>> Denis
>>>>
>>>>
>>>> I right-click on the image and choose "Beschriftung einfügen..."
>>>> (Insert Caption) in Word. However, this is then transformed to a
>>>> separate line in MD:
>>>>
>>>> ```
>>>> ![](media/image1.png){width="1.3888888888888888in" height="1.375in"}
>>>>
>>>> Abbildung : title
>>>> ```
>>>>
>>>> Is this working for you?
>>>>
>>>> Is there possibly a difference if I do that in a German localized Word?
>>>>
>>>> Thank you and best regards,
>>>> Philipp
>>>>
>>>>
>>>> On Thu, Aug 13, 2020 at 22:22, Leonard Rosenthol <
>>>> leonardr-bM6h3K5UM15l57MIdRCFDg@public.gmane.org> wrote:
>>>>
>>>>> AFAICT from a quick read of the DocX Reader - if you set the caption
>>>>> in Word using its "Insert Caption" choice, that will come over into the
>>>>> Markdown.
>>>>>
>>>>> Leonard
>>>>>
>>>>> On Thu, Aug 13, 2020 at 4:09 PM Philipp Zumstein
>>>>> wrote:
>>>>>
>>>>>> I would like to create a DOCX document which will then translate
>>>>>> to Markdown containing an image with its caption (in the square brackets),
>>>>>> i.e. the result after the pandoc transformation DOCX -> MD should look
>>>>>> something like this:
>>>>>>
>>>>>> ![Abb. 1: title](ip-logo.png){ #image1 }
>>>>>>
>>>>>> I tried to add the caption "Abb. 1: title" in Word on a new line after
>>>>>> the image and chose the style "Image Caption", but that did not work.
>>>>>> Also, if I use the Word functionality to add a caption to the image, that
>>>>>> is again only parsed as an additional line of text. The only thing which
>>>>>> works is to format the image in Word and add some alternative (hidden) text.
>>>>>>
>>>>>> Is there a more visible way to achieve the above markdown line from a
>>>>>> Word document? How should I use the styles "Image Caption" or "Captioned
>>>>>> Image" in Word correctly such that pandoc will do something with them?
>>>>>> Is it normal that I don't see these styles in the native ATX output?
>>>>>>
>>>>>> I am using a German version of Word on a Windows machine with pandoc
>>>>>> version 2.10.1.
>>>>>>
>>>>>> Thank you very much for any hint!
>>>>>> Philipp
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "pandoc-discuss" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/pandoc-discuss/601f7c12-1b83-43df-97ca-4288126ac4e4n%40googlegroups.com
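
The filter suggested in the reply at the top of this thread (merge an image paragraph with the caption paragraph that follows it) could look roughly like the sketch below. It is only an illustration under several assumptions, not pandoc's own behaviour: the document is read with `-f docx+styles`, so that styled paragraphs arrive as `Div` blocks carrying a `custom-style` attribute; the caption style is named "Image Caption" or "Caption" (a localized Word may well use another name); and the target is a pre-3.0 pandoc such as the 2.10.1 mentioned in the thread, where a paragraph holding a single image whose title starts with `fig:` is treated as a figure. The names `CAPTION_STYLES` and `merge_captions` are made up for this example.

```python
#!/usr/bin/env python3
"""Sketch of a pandoc JSON filter: fold a caption paragraph into the preceding image."""
import json
import sys

# Assumed style names; adjust to whatever your (possibly localized) Word uses.
CAPTION_STYLES = {"Image Caption", "Caption"}


def custom_style(block):
    """Return the custom-style attribute of a Div block, or None."""
    if block.get("t") != "Div":
        return None
    attr, _content = block["c"]
    return dict(attr[2]).get("custom-style")  # attr = [id, classes, key-value pairs]


def single_para(block):
    """Return the Para if block is a Para, or a Div wrapping exactly one Para."""
    if block.get("t") == "Para":
        return block
    if block.get("t") == "Div":
        inner = block["c"][1]
        if len(inner) == 1 and inner[0].get("t") == "Para":
            return inner[0]
    return None


def lone_image(block):
    """Return the Image inline if the paragraph contains nothing but one image."""
    para = single_para(block)
    if para and len(para["c"]) == 1 and para["c"][0]["t"] == "Image":
        return para["c"][0]
    return None


def merge_captions(blocks):
    """Merge each image-only paragraph with a caption-styled paragraph right after it."""
    out = []
    i = 0
    while i < len(blocks):
        block = blocks[i]
        image = lone_image(block)
        nxt = blocks[i + 1] if i + 1 < len(blocks) else None
        if image and nxt is not None and custom_style(nxt) in CAPTION_STYLES:
            caption = single_para(nxt)
            if caption is not None:
                attr, _alt, (url, title) = image["c"]
                if not title.startswith("fig:"):
                    title = "fig:" + title  # pre-3.0 marker for an implicit figure
                merged = {"t": "Image", "c": [attr, caption["c"], [url, title]]}
                out.append({"t": "Para", "c": [merged]})
                i += 2  # skip the caption paragraph, it now lives inside the image
                continue
        out.append(block)
        i += 1
    return out


# pandoc passes the output format as the first argument to a JSON filter,
# so the script only reads stdin when it is invoked that way.
if __name__ == "__main__" and len(sys.argv) > 1:
    doc = json.load(sys.stdin)
    doc["blocks"] = merge_captions(doc["blocks"])
    json.dump(doc, sys.stdout)
```

Assuming the script is saved under the hypothetical name `merge_captions.py` and made executable, it would be used as `pandoc -f docx+styles input.docx --filter ./merge_captions.py -t markdown`. It only looks at top-level blocks; images inside tables or lists would need a recursive walk, and the remaining `custom-style` divs are left in place.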