public inbox archive for pandoc-discuss@googlegroups.com
From: Bastien DUMONT <bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org>
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Subject: Re: When Converting From DOCX to Markdown Formatting For Lists Are Not Carried Over
Date: Wed, 18 May 2022 23:25:24 +0000
Message-ID: <YoWAZBw9NRGrVrio@localhost>
In-Reply-To: <07CD1F5A-8D33-419E-AF1A-25107C02C360-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

The real question is whether the output of Pages complies with the specification (https://www.ecma-international.org/publications-and-standards/standards/ecma-376/). From what I have seen (for instance, the example given in Part 1, pp. 53-54), it does. If I am right, this is not a bug in Pages: it is only expected to produce compliant documents, not documents that follow exactly the same encoding strategies as Word.
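
As a rough way to see which encoding strategy a given DOCX file uses, the following Python sketch counts the paragraphs in word/document.xml that carry a direct w:numPr (numbering properties) element and those that only reference a paragraph style. It does not check compliance with ECMA-376. The function name and the keys of the returned dictionary are made up for this illustration. WordprocessingML also allows numbering to be attached through a paragraph style defined in word/styles.xml, so a file with no direct w:numPr does not necessarily contain no lists.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (ECMA-376, Part 1).
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def summarize_list_encoding(docx):
    """Count paragraphs by how they could be attached to a list.

    `docx` is a path or a binary file object for a DOCX (zip) file.
    The function name and result keys are hypothetical.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    summary = {"paragraphs": 0, "direct_numPr": 0, "style_only": 0}
    for para in root.iter(W + "p"):
        summary["paragraphs"] += 1
        props = para.find(W + "pPr")
        if props is None:
            continue  # no paragraph properties at all
        if props.find(W + "numPr") is not None:
            # Numbering is referenced directly on the paragraph.
            summary["direct_numPr"] += 1
        elif props.find(W + "pStyle") is not None:
            # Only a style is referenced; any numbering would be
            # defined in word/styles.xml, which is not inspected here.
            summary["style_only"] += 1
    return summary
```

Running summarize_list_encoding("list test.docx") on the file exported by Pages and on the copy saved by LibreOffice should show whether the two programs differ on this point.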

On Wednesday 18 May 2022 at 04:20:47PM, Kevin Fjelsted wrote:
> Sorry about the typo and spell corrector I missed. 
> 
> > > I wonder whether the issue is on Pandoc’s end or on Apple Pages’ end, since the export came from Pages.
> > > Does anyone have MS Word and can test this issue?
> 
> * Save a DOCX file from Word with at least two lists, one ordered and one unordered.
> Convert it to Markdown with Pandoc and make sure the list formatting shows up, in other words a "-" before each unordered item and a unique number before each ordered list item.
> One would assume this works because it is so common.
> 
> Then try the same thing by creating a document from scratch in Mac Pages and following the same procedure, including exporting a DOCX file.
> This file will not produce ordered or unordered Markdown lists, as previously documented in this thread.
> 
> Is this an issue to be reported to Pandoc or to Apple?
> It should be noted that DOCX and EPUB are really the only portable document formats available to Pages users who want to export to an open format.
> > 
> > 
> >> On May 18, 2022, at 10:45 AM, Bastien DUMONT <bastien.dumont-VwIFZPTo/vqzQB+pC5nmwQ@public.gmane.org> wrote:
> >> 
> >> I have found the cause of this discrepancy: before testing your file, I opened it with LibreOffice, which modified it without notifying me. The modifications made by LibreOffice caused Pandoc to recognize the lists. I have just downloaded your file again, converted it directly with Pandoc without opening it first, and got the same result as you.
> >> 
> >> I attach a diff of word/document.xml from both DOCX files (unmodified and modified by LibreOffice). If you submit a bug report, it may be helpful to include it.
> >> 
> >> On Wednesday 18 May 2022 at 09:30:59AM, Kevin Fjelsted wrote:
> >>> Interesting. I get the following from my pandoc --version command; Pandoc was
> >>> installed with Homebrew on the Mac.
> >>> <<<kfjelsted@kfjelsted-Mac ~ % pandoc --version
> >>> pandoc 2.18
> >>> Compiled with pandoc-types 1.22.2, texmath 0.12.5, skylighting 0.12.3,
> >>> citeproc 0.7, ipynb 0.2, hslua 2.2.0
> >>> Scripting engine: Lua 5.4
> >>> User data directory: /Users/kfjelsted/.local/share/pandoc
> >>> Copyright (C) 2006-2022 John MacFarlane. Web:  [1]https://pandoc.org
> >>> This is free software; see the source for copying conditions. There is no
> >>> warranty, not even for merchantability or fitness for a particular purpose.
> >>> kfjelsted@kfjelsted-Mac ~ % 
> >>> 
> >>>>>> 
> >>> Bastien, clearly your Pandoc is running correctly and shows the lists.
> >>> We are using the same commands and same version.
> >>> The mystery deepens!
> >>> -Kevin
> >>> 
> >>> 
> >>> 
> >>> 
> >>>   On May 18, 2022, at 1:44 AM, Bastien DUMONT <[2]bastien.dumont@posteo.net>
> >>>   wrote:
> >>> 
> >>>   I get the following output:
> >>> 
> >>>   ```
> >>>   List Examples
> >>> 
> >>>   This is an unordered list.
> >>> 
> >>>   -   Apples
> >>> 
> >>>   -   Bannas
> >>> 
> >>>   -   Oranges
> >>> 
> >>>   -   Peaches
> >>> 
> >>>   This is a numbered list
> >>> 
> >>>   1.  Dogs
> >>> 
> >>>   2.  Cats
> >>> 
> >>>   3.  Mice
> >>> 
> >>>   4.  Elephants
> >>>   ```
> >>> 
> >>>   You may be using an old version of pandoc. If true, try upgrading to 2.18.
> >>> 
> >>>   On Tuesday 17 May 2022 at 05:58:45PM, Kevin Fjelsted wrote:
> >>> 
> >>>       I ran the following command to create the Markdown.
> >>> 
> >>>       <<<<
> >>>       kfjelsted@kfjelsted-Mac desktop % pandoc -s "list test.docx" \
> >>>       --wrap=none --reference-links -t gfm -o "list test.md"
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>>       I have attached the DOCX file that was used for the input. I have also
> >>>       attached the .MD file that was produced.
> >>> 
> >>>       There are no bullet or number symbols in the markdown.
> >>>       -Kevin
> >>> 
> >>> 
> >>> 
> >>>           On May 17, 2022, at 5:43 AM, Bastien DUMONT <[3]
> >>>           bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org> wrote:
> >>> 
> >>>           It should work with -t markdown too (which is implicit with "-o
> >>>           xxx.md"). Could you provide a small example file containing one
> >>>           such list?
> >>> 
> >>>           On Tuesday 17 May 2022 at 05:19:24AM, Kevin Fjelsted wrote:
> >>> 
> >>>               Interesting. I still cannot get lists that were created in Word
> >>>               to convert.
> >>>               This is the command I tried.
> >>>               $ pandoc -s "Test Word format.docx" --wrap=none \
> >>>               --reference-links -t gfm -o "Test Word format.md"
> >>> 
> >>> 
> >>> 
> >>> 
> >>>                 On May 17, 2022, at 1:51 AM, 'Nikolay Mihaylov' via
> >>>               pandoc-discuss <[1]
> >>>                 [4]pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org> wrote:
> >>> 
> >>>                 You should use the `-t gfm` flag for bullet lists to work.
> >>> 
> >>>                 On Sunday, May 15, 2022 at 1:29:36 AM UTC+3 kfje...@[2][5]
> >>>               gmail.com wrote:
> >>> 
> >>>                     Why is it that when I convert from DOCX to Markdown,
> >>>                     bulleted and numbered lists do not retain formatting,
> >>>                     i.e., no "-" in front of each unordered list item and
> >>>                     no number in front of each numbered item?
> >>> 
> >>> 
> >>>                 --
> >>>                 You received this message because you are subscribed to the
> >>>               Google Groups
> >>>                 "pandoc-discuss" group.
> >>>                 To unsubscribe from this group and stop receiving emails from
> >>>               it, send an
> >>>                 email to [3][6]pandoc-discuss+unsubscribe@googlegroups.com.
> >>>                 To view this discussion on the web visit [4][7]https://
> >>>               groups.google.com/d/
> >>>                 msgid/pandoc-discuss/
> >>>                 a59d216d-f527-4d14-a9b8-f75da3f1af19n%[8]40googlegroups.com.
> >>> 
> >>> 
> >>> 
> >>>               References:
> >>> 
> >>>               [1] [12]mailto:pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>>               [2] [13]http://gmail.com/
> >>>               [3] [14]mailto:pandoc-discuss+unsubscribe@googlegroups.com
> >>>               [4] [15]https://groups.google.com/d/msgid/pandoc-discuss/
> >>>               a59d216d-f527-4d14-a9b8-f75da3f1af19n%40googlegroups.com?
> >>>               utm_medium=email&utm_source=footer
> >>>               [5] [16]mailto:pandoc-discuss+unsubscribe@googlegroups.com
> >>>               [6] [17]https://groups.google.com/d/msgid/pandoc-discuss/
> >>>               6F48F869-C199-4EE6-9050-278CF14C78EC%40gmail.com?utm_medium=
> >>>               email&utm_source=footer
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> References:
> >>> 
> >>> [1] https://pandoc.org/
> >>> [2] mailto:bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org
> >>> [3] mailto:bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org
> >>> [4] mailto:pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [5] http://gmail.com/
> >>> [6] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [7] https://groups.google.com/d/
> >>> [8] http://40googlegroups.com/
> >>> [9] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [10] https://groups.google.com/d/msgid/
> >>> [11] http://40gmail.com/
> >>> [12] mailto:pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [13] http://gmail.com/
> >>> [14] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [15] https://groups.google.com/d/msgid/pandoc-discuss/a59d216d-f527-4d14-a9b8-f75da3f1af19n%40googlegroups.com?utm_medium=email&utm_source=footer
> >>> [16] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [17] https://groups.google.com/d/msgid/pandoc-discuss/6F48F869-C199-4EE6-9050-278CF14C78EC%40gmail.com?utm_medium=email&utm_source=footer
> >>> [18] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [19] https://groups.google.com/d/msgid/pandoc-discuss/YoN8ahDbTBUMk8F1%40localhost
> >>> [20] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [21] https://groups.google.com/d/msgid/pandoc-discuss/662275E5-EDA3-4D1D-B8C2-51B4EA12C103%40gmail.com
> >>> [22] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [23] https://groups.google.com/d/msgid/pandoc-discuss/YoSVu600xH8jeOtt%40localhost
> >>> [24] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [25] https://groups.google.com/d/msgid/pandoc-discuss/06C4B1E7-54D0-44D9-98B0-D47AB6A0CB21%40gmail.com?utm_medium=email&utm_source=footer
> >> 
> >> <lists_without-lo-and-with-lo.diff>
> > 
> 
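
The comparison of word/document.xml mentioned in the quoted message can be reproduced along these lines. This is only a sketch: the function name and the default labels are placeholders, and the attached diff was not necessarily produced this way. Since document.xml is often stored as one very long line, the XML is re-indented before diffing so that the result is readable.

```python
import difflib
import zipfile
import xml.dom.minidom


def diff_document_xml(docx_a, docx_b, label_a="without-lo", label_b="with-lo"):
    """Return a unified diff of word/document.xml from two DOCX files.

    Both arguments are paths or binary file objects. The function name
    and the default labels are hypothetical.
    """
    pretty = []
    for docx in (docx_a, docx_b):
        with zipfile.ZipFile(docx) as archive:
            raw = archive.read("word/document.xml")
        # Re-indent so the diff works line by line instead of on one huge line.
        dom = xml.dom.minidom.parseString(raw)
        pretty.append(dom.toprettyxml(indent="  ").splitlines())
    return "\n".join(
        difflib.unified_diff(pretty[0], pretty[1], label_a, label_b, lineterm="")
    )
```

For example, print(diff_document_xml("original.docx", "saved-by-libreoffice.docx")), where both file names stand for your own copies of the test document.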

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/YoWAZBw9NRGrVrio%40localhost.



Thread overview: 13+ messages
2022-05-14 22:29 Kevin Fjelsted
     [not found] ` <e575a611-a1ba-4875-8c92-4f41aea7dff1n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2022-05-17  6:51   ` 'Nikolay Mihaylov' via pandoc-discuss
     [not found]     ` <a59d216d-f527-4d14-a9b8-f75da3f1af19n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2022-05-17 10:19       ` Kevin Fjelsted
     [not found]         ` <6F48F869-C199-4EE6-9050-278CF14C78EC-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-17 10:43           ` Bastien DUMONT
2022-05-17 22:58             ` Kevin Fjelsted
     [not found]               ` <662275E5-EDA3-4D1D-B8C2-51B4EA12C103-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18  6:44                 ` Bastien DUMONT
2022-05-18 14:30                   ` Kevin Fjelsted
     [not found]                     ` <06C4B1E7-54D0-44D9-98B0-D47AB6A0CB21-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 15:45                       ` Bastien DUMONT
2022-05-18 17:57                         ` Kevin Fjelsted
     [not found]                           ` <502CBF2E-C516-4E63-97A5-38DFC6CB22D9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 18:44                             ` BPJ
2022-05-18 18:02                         ` Kevin Fjelsted
     [not found]                           ` <3EC245C1-7D01-4B78-A445-DB215285A29D-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 21:20                             ` Kevin Fjelsted
     [not found]                               ` <07CD1F5A-8D33-419E-AF1A-25107C02C360-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 23:25                                 ` Bastien DUMONT [this message]
