public inbox archive for pandoc-discuss@googlegroups.com
From: Kevin Fjelsted <kfjelsted-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Subject: Re: When Converting From DOCX to Markdown Formatting For Lists Are Not Carried Over
Date: Wed, 18 May 2022 16:20:47 -0500
Message-ID: <07CD1F5A-8D33-419E-AF1A-25107C02C360@gmail.com>
In-Reply-To: <3EC245C1-7D01-4B78-A445-DB215285A29D-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Sorry about the typo and the spell-corrector change I missed.

> I wonder if the issue is on Pandoc’s end or on Apple Pages’ end, since the export came from Pages.
> Does anyone have MS Word who can test this issue?

* Save a DOCX file from Word with at least two lists, one ordered and one unordered.
Convert it to Markdown with Pandoc and make sure the list formatting shows up, in other words a "-" before each unordered item and a unique number before each ordered item.
One would assume this works because it is so common.

Then try the same thing by creating a document from scratch in Pages on the Mac and following the same procedure, including exporting a DOCX file.
In the experiences previously documented in this thread, this file does not produce ordered or unordered lists in the Markdown.
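
For anyone running this test, a rough way to check the result without reading the output by eye is sketched below in Python. It assumes pandoc is on the PATH and uses the same `-t gfm --wrap=none` options as earlier in this thread; the function names `list_markers` and `convert_docx` are made up for this example.

```python
import re
import subprocess

# Markdown list markers at the start of a line: "-", "*" or "+" for
# unordered items, and "1." or "1)" style numbers for ordered items.
BULLET = re.compile(r"^[ \t]*[-*+][ \t]+\S", re.MULTILINE)
ORDERED = re.compile(r"^[ \t]*\d+[.)][ \t]+\S", re.MULTILINE)


def list_markers(markdown: str) -> dict:
    """Count the lines that begin with a bullet or an ordered-list marker."""
    return {
        "bullets": len(BULLET.findall(markdown)),
        "ordered": len(ORDERED.findall(markdown)),
    }


def convert_docx(path: str) -> str:
    """Run pandoc on a DOCX file and return the Markdown it writes to stdout."""
    result = subprocess.run(
        ["pandoc", path, "-t", "gfm", "--wrap=none"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `list_markers(convert_docx("list test.docx"))` on the Word file and on the Pages export should show whether either conversion comes back with zero bullets and zero numbered items. This is only a heuristic: a line such as `* * *` would also be counted as a bullet.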

Is this an issue to be reported to Pandoc or to Apple?
It should be noted that DOCX and ePub are really the only portable document formats available to Pages users who want to export to an open format.
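
To help decide where the report belongs, the comparison Bastien describes in the quoted message (a diff of word/document.xml between the untouched file and the copy re-saved by LibreOffice) can be reproduced with a short script. This is a sketch using only the Python standard library; the function names are made up for this example, and the line splitting is for readability only. In WordprocessingML, list paragraphs normally carry a `w:numPr` element, so that may be a useful thing to look for in the diff, though I have not confirmed what Pages writes instead.

```python
import difflib
import re
import zipfile


def document_xml_lines(docx_path: str) -> list:
    """Read word/document.xml from a DOCX (a ZIP archive), one tag per line."""
    with zipfile.ZipFile(docx_path) as archive:
        xml = archive.read("word/document.xml").decode("utf-8")
    # document.xml is often one very long line; break between adjacent tags
    # so the diff is readable. Whitespace between tags is not preserved,
    # so this is for display only.
    return re.sub(r">\s*<", ">\n<", xml).splitlines()


def diff_docx(original: str, modified: str) -> str:
    """Return a unified diff of word/document.xml between two DOCX files."""
    return "\n".join(
        difflib.unified_diff(
            document_xml_lines(original),
            document_xml_lines(modified),
            fromfile=original,
            tofile=modified,
            lineterm="",
        )
    )
```

For example, `print(diff_docx("pages-export.docx", "resaved-by-libreoffice.docx"))` with two hypothetical file names prints the lines that differ; an empty string means the two document.xml files are identical after splitting.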
> 
> 
>> On May 18, 2022, at 10:45 AM, Bastien DUMONT <bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org> wrote:
>> 
>> I have found the cause of this discrepancy: before testing your file, I opened it with LibreOffice, which modified it without notifying me. The modifications made by LibreOffice caused the lists to be recognized by Pandoc. I just downloaded your file again, converted it directly with Pandoc without opening it, and got the same result as you.
>> 
>> I attach a diff of word/document.xml in both DOCX files (unmodified and modified by LibreOffice). If you submit a bug report, it may be helpful to provide it.
>> 
>> On Wednesday 18 May 2022 at 09:30:59AM, Kevin Fjelsted wrote:
>>> Interesting. I get the following from pandoc --version; pandoc was
>>> installed with Homebrew on the Mac.
>>> <<<kfjelsted@kfjelsted-Mac ~ % pandoc --version
>>> pandoc 2.18
>>> Compiled with pandoc-types 1.22.2, texmath 0.12.5, skylighting 0.12.3,
>>> citeproc 0.7, ipynb 0.2, hslua 2.2.0
>>> Scripting engine: Lua 5.4
>>> User data directory: /Users/kfjelsted/.local/share/pandoc
>>> Copyright (C) 2006-2022 John MacFarlane. Web:  [1]https://pandoc.org
>>> This is free software; see the source for copying conditions. There is no
>>> warranty, not even for merchantability or fitness for a particular purpose.
>>> kfjelsted@kfjelsted-Mac ~ % 
>>> 
>>>>>> 
>>> Bastien, clearly your pandoc is running correctly and shows the lists.
>>> We are using the same commands and same version.
>>> The mystery deepens!
>>> -Kevin
>>> 
>>> 
>>> 
>>> 
>>>   On May 18, 2022, at 1:44 AM, Bastien DUMONT <[2]bastien.dumont@posteo.net>
>>>   wrote:
>>> 
>>>   I get the following output:
>>> 
>>>   ```
>>>   List Examples
>>> 
>>>   This is an unordered list.
>>> 
>>>   -   Apples
>>> 
>>>   -   Bannas
>>> 
>>>   -   Oranges
>>> 
>>>   -   Peaches
>>> 
>>>   This is a numbered list
>>> 
>>>   1.  Dogs
>>> 
>>>   2.  Cats
>>> 
>>>   3.  Mice
>>> 
>>>   4.  Elephants
>>>   ```
>>> 
>>>   You may be using an old version of pandoc. If true, try upgrading to 2.18.
>>> 
>>>   On Tuesday 17 May 2022 at 05:58:45PM, Kevin Fjelsted wrote:
>>> 
>>>       I ran the following command to create the Markdown.
>>> 
>>>       <<<<
>>>       kfjelsted@kfjelsted-Mac desktop % pandoc -s "list test.docx" --wrap=none --reference-links -t gfm -o "list test.md"
>>> 
>>> 
>>> 
>>> 
>>>       I have attached the DOCX file that was used for the input. I have also
>>>       attached the .MD file that was produced.
>>> 
>>>       There are no bullet or number symbols in the markdown.
>>>       -Kevin
>>> 
>>> 
>>> 
>>>           On May 17, 2022, at 5:43 AM, Bastien DUMONT <[3]
>>>           bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org> wrote:
>>> 
>>>           It should work with -t markdown too (which is implicit with "-o
>>>           xxx.md"). Could you provide a small example file containing one
>>>           such list?
>>> 
>>>           On Tuesday 17 May 2022 at 05:19:24AM, Kevin Fjelsted wrote:
>>> 
>>>               Interesting. Lists that are created in Word still do not
>>>               convert for me.
>>>               This is the command I tried:
>>>               $ pandoc -s "Test Word format.docx" --wrap=none --reference-links -t gfm -o "Test Word format.md"
>>> 
>>> 
>>> 
>>> 
>>>                 On May 17, 2022, at 1:51 AM, 'Nikolay Mihaylov' via
>>>               pandoc-discuss <[1]
>>>                 [4]pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org> wrote:
>>> 
>>>                 You should use the `-t gfm` flag for bullet lists to work.
>>> 
>>>                 On Sunday, May 15, 2022 at 1:29:36 AM UTC+3 kfje...@[2][5]
>>>               gmail.com wrote:
>>> 
>>>                     Why is it that when I convert from DOCX to Markdown,
>>>                     bulleted and numbered lists do not retain formatting,
>>>                     i.e., no "-" in front of each unordered list item and
>>>                     no number in front of each numbered item?
>>> 
>>> 
>>>                 --
>>>                 You received this message because you are subscribed to the
>>>               Google Groups
>>>                 "pandoc-discuss" group.
>>>                 To unsubscribe from this group and stop receiving emails from
>>>               it, send an
>>>                 email to [3][6]pandoc-discuss+unsubscribe@googlegroups.com.
>>>                 To view this discussion on the web visit [4][7]https://
>>>               groups.google.com/d/
>>>                 msgid/pandoc-discuss/
>>>                 a59d216d-f527-4d14-a9b8-f75da3f1af19n%[8]40googlegroups.com.
>>> 
>>> 
>>> 
>>>               References:
>>> 
>>>               [1] [12]mailto:pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
>>>               [2] [13]http://gmail.com/
>>>               [3] [14]mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh4Ykp1iOSErHA@public.gmane.orgm
>>>               [4] [15]https://groups.google.com/d/msgid/pandoc-discuss/
>>>               a59d216d-f527-4d14-a9b8-f75da3f1af19n%40googlegroups.com?
>>>               utm_medium=email&utm_source=footer
>>>               [5] [16]mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh4Ykp1iOSErHA@public.gmane.orgm
>>>               [6] [17]https://groups.google.com/d/msgid/pandoc-discuss/
>>>               6F48F869-C199-4EE6-9050-278CF14C78EC%40gmail.com?utm_medium=
>>>               email&utm_source=footer
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> References:
>>> 
>>> [1] https://pandoc.org/
>>> [2] mailto:bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org
>>> [3] mailto:bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org
>>> [4] mailto:pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
>>> [5] http://gmail.com/
>>> [6] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
>>> [7] https://groups.google.com/d/
>>> [8] http://40googlegroups.com/
>>> [9] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
>>> [10] https://groups.google.com/d/msgid/
>>> [11] http://40gmail.com/
>>> [12] mailto:pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
>>> [13] http://gmail.com/
>>> [14] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
>>> [15] https://groups.google.com/d/msgid/pandoc-discuss/a59d216d-f527-4d14-a9b8-f75da3f1af19n%40googlegroups.com?utm_medium=email&utm_source=footer
>>> [16] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
>>> [17] https://groups.google.com/d/msgid/pandoc-discuss/6F48F869-C199-4EE6-9050-278CF14C78EC%40gmail.com?utm_medium=email&utm_source=footer
>>> [18] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
>>> [19] https://groups.google.com/d/msgid/pandoc-discuss/YoN8ahDbTBUMk8F1%40localhost
>>> [20] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
>>> [21] https://groups.google.com/d/msgid/pandoc-discuss/662275E5-EDA3-4D1D-B8C2-51B4EA12C103%40gmail.com
>>> [22] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
>>> [23] https://groups.google.com/d/msgid/pandoc-discuss/YoSVu600xH8jeOtt%40localhost
>>> [24] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
>>> [25] https://groups.google.com/d/msgid/pandoc-discuss/06C4B1E7-54D0-44D9-98B0-D47AB6A0CB21%40gmail.com?utm_medium=email&utm_source=footer
>> 
>> <lists_without-lo-and-with-lo.diff>
> 

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/07CD1F5A-8D33-419E-AF1A-25107C02C360%40gmail.com.



Thread overview: 13+ messages
2022-05-14 22:29 Kevin Fjelsted
     [not found] ` <e575a611-a1ba-4875-8c92-4f41aea7dff1n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2022-05-17  6:51   ` 'Nikolay Mihaylov' via pandoc-discuss
     [not found]     ` <a59d216d-f527-4d14-a9b8-f75da3f1af19n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2022-05-17 10:19       ` Kevin Fjelsted
     [not found]         ` <6F48F869-C199-4EE6-9050-278CF14C78EC-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-17 10:43           ` Bastien DUMONT
2022-05-17 22:58             ` Kevin Fjelsted
     [not found]               ` <662275E5-EDA3-4D1D-B8C2-51B4EA12C103-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18  6:44                 ` Bastien DUMONT
2022-05-18 14:30                   ` Kevin Fjelsted
     [not found]                     ` <06C4B1E7-54D0-44D9-98B0-D47AB6A0CB21-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 15:45                       ` Bastien DUMONT
2022-05-18 17:57                         ` Kevin Fjelsted
     [not found]                           ` <502CBF2E-C516-4E63-97A5-38DFC6CB22D9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 18:44                             ` BPJ
2022-05-18 18:02                         ` Kevin Fjelsted
     [not found]                           ` <3EC245C1-7D01-4B78-A445-DB215285A29D-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 21:20                             ` Kevin Fjelsted [this message]
     [not found]                               ` <07CD1F5A-8D33-419E-AF1A-25107C02C360-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 23:25                                 ` Bastien DUMONT
