When Pandoc creates an ODT file from HTML containing SVG images, it embeds the images as-is, losslessly. That is a problem for me, because my pipeline applies some automatic transformations to the document through the LibreOffice UNO API and ultimately saves it as DOCX. When LibreOffice saves an ODT file containing SVG images as DOCX, it rasterizes them at a very poor resolution, which, according to folks on the LibreOffice forum, cannot be controlled.

But there is a trick: I can use Inkscape to convert the SVG images into EMF. LibreOffice does not rasterize EMF files when it saves the document as DOCX.

The problem is that the EMF files obviously have different binary content than the SVG originals. When I replace them in the "Pictures/" folder inside the ODT, LibreOffice notices that the file names of the EMF pictures do not match their hashes, claims that the "image is corrupted", and offers to repair it. Unfortunately, that repair dialog cannot be automated in a headless environment, which means I need to know how to produce a "non-broken" ODT document in the first place. For that I need to know the hashing scheme.

I tried to read the Pandoc sources to find the answer myself, but my zero knowledge of Haskell is a major obstacle. My gut feeling says the answer is somewhere in `pandoc/src/Text/Pandoc/Writers/OpenDocument.hs`.

On Saturday, December 2, 2023 at 9:05:42 PM UTC John MacFarlane wrote:

> Why is it necessary to do this? Docx can handle svgs, can't it?
>
> > On Dec 2, 2023, at 6:16 AM, Adam Ryczkowski wrote:
> >
> > Hi!
> >
> > I write a script that replaces "svg" images with "emf" in the odt in order to allow lossless conversion to "docx" format using LibreOffice.
> >
> > The problem is that merely replacing the files and fixing the `content.xml` does not suffice. The image file name is some form of hash of its contents. If the contents do not match, LibreOffice reports the document to be "broken" (but allows to repair).
> > Alas, this repair cannot be automated.
> >
> > I tried to get that from the Pandoc source code, but Haskell's syntax seems too alien to me.
> >
> > What is the naming convention for the files in the Pictures/ folder?
> >
> > --
> > You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discus...-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> > To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/88f8c8dc-b9b6-4e6e-91ce-75e08412e466n%40googlegroups.com.
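
P.S. For reference, the replacement step described at the top of this message can be sketched roughly as below. This is only a minimal illustration of swapping the pictures and fixing the references, not a fix for the "image is corrupted" complaint, and it makes no claim about the naming or hashing scheme. The function names (`swap_svg_for_emf`, `patch_manifest`) are hypothetical, and the `image/x-emf` media type is an assumption about what LibreOffice expects. The sketch keeps each picture's base name and only changes the extension. It also rewrites `META-INF/manifest.xml` and keeps the `mimetype` entry first and uncompressed, which the ODF packaging rules require.

```python
import zipfile
import xml.etree.ElementTree as ET

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"
ET.register_namespace("manifest", MANIFEST_NS)

EMF_MIME = "image/x-emf"  # assumption: media type LibreOffice expects for EMF


def patch_manifest(xml_bytes, renames):
    """Point manifest entries of renamed pictures at the new path and type."""
    root = ET.fromstring(xml_bytes)
    path_attr = f"{{{MANIFEST_NS}}}full-path"
    type_attr = f"{{{MANIFEST_NS}}}media-type"
    for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry"):
        old = entry.get(path_attr)
        if old in renames:
            entry.set(path_attr, renames[old])
            entry.set(type_attr, EMF_MIME)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def swap_svg_for_emf(src_odt, dst_odt, emf_by_svg_path):
    """Copy src_odt to dst_odt, replacing the listed SVG pictures with EMF.

    emf_by_svg_path maps an archive path such as "Pictures/0.svg" to the
    bytes of the EMF file that Inkscape produced for that picture.
    """
    # Same base name, new extension: "Pictures/0.svg" -> "Pictures/0.emf".
    renames = {p: p.rsplit(".", 1)[0] + ".emf" for p in emf_by_svg_path}

    with zipfile.ZipFile(src_odt) as zin, zipfile.ZipFile(dst_odt, "w") as zout:
        # ODF packages need "mimetype" as the first entry, stored uncompressed.
        zout.writestr("mimetype", zin.read("mimetype"),
                      compress_type=zipfile.ZIP_STORED)
        for name in zin.namelist():
            if name == "mimetype":
                continue
            data = zin.read(name)
            if name in renames:
                data = emf_by_svg_path[name]
                name = renames[name]
            elif name == "META-INF/manifest.xml":
                data = patch_manifest(data, renames)
            elif name in ("content.xml", "styles.xml"):
                # Plain text substitution of the quoted href values.
                text = data.decode("utf-8")
                for old, new in renames.items():
                    text = text.replace(f'"{old}"', f'"{new}"')
                data = text.encode("utf-8")
            zout.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
```

It would be called as `swap_svg_for_emf("in.odt", "out.odt", {"Pictures/0.svg": emf_bytes})`, where `emf_bytes` is read from the converted file. Whether LibreOffice accepts the result without the repair prompt is exactly the open question of this thread.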