From: Thomas Blom
To: pandoc-discuss
Newsgroups: gmane.text.pandoc
Subject: Re: docx -> markdown: images-in-table extracted but not written
Date: Mon, 28 Aug 2017 12:30:51 -0700 (PDT)
References: <295dbf64-f431-4dfa-98fc-a1089455da59@googlegroups.com>

I have just solved this myself. Tables in Word let you choose whether or
not "Table Headers" are used, and (I know next to nothing about Word) I
suppose that when this is turned on, pandoc treats the first row as header
data. If the next row of the table has fewer columns than the header row,
the extra header column is omitted.

In the attached document that didn't work as expected, the two images that
appear first in the table show up in the header <th> portion. The next row
of data is a single caption and has only one column. Consequently, the
pandoc writer does not write the entry for the second column header.

This is just a guess, but turning off the "Table Header" checkbox in the
Word document causes the second image to show up correctly in the markdown.

Thanks,
Thomas

On Monday, August 28, 2017 at 2:22:40 PM UTC-5, Thomas Blom wrote:
>
> Hello,
>
> The attached files demonstrate an issue in which pandoc (v1.19.2.1 on OSX
> 10.12) correctly extracts two images from a table but then only creates the
> table entry for one of them in the resulting markdown.
>
> pandoc -t markdown_strict --extract-media=test table_images.docx -o test.md
>
> In the table_images file, the problem is noted. In the
> table_images_works file, in which the formatting appears to be the same,
> the problem does not occur.
>
> Can anyone explain this? The images in all cases are 300ppi png files
> embedded in Word for Mac 2011 documents.
>
> Thanks!
> Thomas Blom
>

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/e1c9bf0a-80ae-4acf-93c8-e77e61a35bdd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
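
The header-row guess in the reply can be checked without opening Word, since a .docx file is a zip archive with the table markup in word/document.xml. The following is a minimal Python sketch under stated assumptions, not anything taken from pandoc itself: the function name table_header_info is hypothetical, and it is an assumption that the "Table Header" checkbox corresponds to the `w:tblLook` "first row" flag and/or the `w:tblHeader` row property. Which of these pandoc 1.19.2.1 actually consults is not confirmed in this thread. The sketch reports both markers plus the number of cells in each row, so a two-cell header row followed by a one-cell caption row is easy to spot.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used in word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def _is_on(value):
    # On/off values in WordprocessingML: a missing value means "on".
    return value is None or value.lower() not in ("0", "false", "off")


def table_header_info(docx_path):
    """Return one dict per table found in the document body (hypothetical helper)."""
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    tables = []
    for tbl in root.iter(f"{{{W}}}tbl"):
        # Table-level "first row" style option (assumed to be the checkbox).
        look = tbl.find("w:tblPr/w:tblLook", NS)
        first_row = False
        if look is not None:
            flag = look.get(f"{{{W}}}firstRow")
            val = look.get(f"{{{W}}}val")
            if flag is not None:
                first_row = _is_on(flag)
            elif val is not None:
                # Older files store the options as a hex bitmask; 0x0020 is "first row".
                first_row = bool(int(val, 16) & 0x0020)

        # Row-level "repeat as header row" markers, counted over direct rows only.
        rows = tbl.findall("w:tr", NS)
        repeat_rows = 0
        for tr in rows:
            hdr = tr.find("w:trPr/w:tblHeader", NS)
            if hdr is not None and _is_on(hdr.get(f"{{{W}}}val")):
                repeat_rows += 1

        tables.append({
            "first_row_look": first_row,
            "repeat_header_rows": repeat_rows,
            # Counts w:tc elements only; merged cells (gridSpan) are not expanded.
            "cells_per_row": [len(tr.findall("w:tc", NS)) for tr in rows],
        })
    return tables
```

Running `table_header_info("table_images.docx")` and `table_header_info("table_images_works.docx")` side by side (file names assumed from the original post) should show whether the two documents differ in these flags or in their per-row cell counts.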