From: Bastien DUMONT <bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org>
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Subject: Re: When Converting From DOCX to Markdown Formatting For Lists Are Not Carried Over
Date: Wed, 18 May 2022 23:25:24 +0000
Message-ID: <YoWAZBw9NRGrVrio@localhost>
In-Reply-To: <07CD1F5A-8D33-419E-AF1A-25107C02C360-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
The question is rather whether the output of Pages complies with the specification (https://www.ecma-international.org/publications-and-standards/standards/ecma-376/). From what I have seen (for instance, the example given in Part 1, pp. 53-54), it does. If I am right, this is not a bug in Pages: it is expected to produce compliant documents, not necessarily documents that follow exactly the same encoding strategies as Word.
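To see how a given DOCX encodes its lists, one can check which paragraphs in word/document.xml carry a w:numPr element (with w:numId and w:ilvl) in their paragraph properties. The following is only a sketch under stated assumptions: the function name list_paragraphs is hypothetical, the transitional WordprocessingML namespace is assumed, and numbering attached indirectly through a paragraph style is not followed. It says nothing about what Pandoc's reader expects or about what Pages actually writes.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# Assumption: the file uses the transitional WordprocessingML namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def list_paragraphs(docx):
    """Return (numId, ilvl, text) for each paragraph with a direct w:pPr/w:numPr.

    `docx` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    found = []
    for p in root.iter(f"{{{W}}}p"):
        num_pr = p.find("w:pPr/w:numPr", NS)
        if num_pr is None:
            continue  # no direct numbering on this paragraph
        num_id = num_pr.find("w:numId", NS)
        ilvl = num_pr.find("w:ilvl", NS)
        # Concatenate all text runs of the paragraph.
        text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        found.append((
            num_id.get(f"{{{W}}}val") if num_id is not None else None,
            ilvl.get(f"{{{W}}}val") if ilvl is not None else None,
            text,
        ))
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    for num_id, ilvl, text in list_paragraphs(sys.argv[1]):
        print(f"numId={num_id} ilvl={ilvl} {text}")
```

Run as a script with the DOCX path as its only argument. An empty result does not by itself mean the file is non-compliant, since the numbering may be defined in a style instead.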
On Wednesday 18 May 2022 at 04:20:47PM, Kevin Fjelsted wrote:
> Sorry about the typo and spell corrector I missed.
>
> > I wonder if the issue is on Pandoc's end or on Apple Pages' end, since the export came from Pages.
> > Does anyone have MS Word and can test this issue?
>
> * Save a DOCX file from Word with at least two lists, one ordered and one unordered.
> Convert it to Markdown with Pandoc and make sure the list formatting shows up, in other words a "-" before each unordered item and a unique number before each ordered list item.
> One would assume this works because it is so common.
>
> Then try the same procedure with a document created from scratch in Mac Pages, including exporting a DOCX file.
> This file will not produce ordered or unordered Markdown lists, as previously documented in this thread.
>
> Is this an issue to be reported in Pandoc, or Apple?
> It should be noted that DOCX and ePub are really the only portable document methods available for Pages users that want to export to an open format.
> >
> >
> >> On May 18, 2022, at 10:45 AM, Bastien DUMONT <bastien.dumont-VwIFZPTo/vqzQB+pC5nmwQ@public.gmane.org> wrote:
> >>
> >> I have found the cause of this discrepancy: before testing your file, I opened it with LibreOffice, which modified it without notifying me. The modifications made by LibreOffice caused Pandoc to recognize the lists. I just downloaded your file again, converted it directly with Pandoc without opening it, and got the same result as you.
> >>
> >> I attach a diff of word/document.xml in both DOCX files (unmodified and modified by LibreOffice). If you submit a bug report, it may be helpful to provide it.
> >>
> >> On Wednesday 18 May 2022 at 09:30:59AM, Kevin Fjelsted wrote:
> >>> Interesting. I get the following from the pandoc --version command; Pandoc was
> >>> installed with Homebrew on the Mac.
> >>> <<<kfjelsted@kfjelsted-Mac ~ % pandoc --version
> >>> pandoc 2.18
> >>> Compiled with pandoc-types 1.22.2, texmath 0.12.5, skylighting 0.12.3,
> >>> citeproc 0.7, ipynb 0.2, hslua 2.2.0
> >>> Scripting engine: Lua 5.4
> >>> User data directory: /Users/kfjelsted/.local/share/pandoc
> >>> Copyright (C) 2006-2022 John MacFarlane. Web: [1]https://pandoc.org
> >>> This is free software; see the source for copying conditions. There is no
> >>> warranty, not even for merchantability or fitness for a particular purpose.
> >>> kfjelsted@kfjelsted-Mac ~ %
> >>>
> >>>>>>
> >>> Bastien, clearly your Pandoc is running correctly and shows the lists.
> >>> We are using the same commands and same version.
> >>> The mystery deepens!
> >>> -Kevin
> >>>
> >>>
> >>>
> >>>
> >>> On May 18, 2022, at 1:44 AM, Bastien DUMONT <[2]bastien.dumont@posteo.net>
> >>> wrote:
> >>>
> >>> I get the following output:
> >>>
> >>> ```
> >>> List Examples
> >>>
> >>> This is an unordered list.
> >>>
> >>> - Apples
> >>>
> >>> - Bannas
> >>>
> >>> - Oranges
> >>>
> >>> - Peaches
> >>>
> >>> This is a numbered list
> >>>
> >>> 1. Dogs
> >>>
> >>> 2. Cats
> >>>
> >>> 3. Mice
> >>>
> >>> 4. Elephants
> >>> ```
> >>>
> >>> You may be using an old version of pandoc. If so, try upgrading to 2.18.
> >>>
> >>> On Tuesday 17 May 2022 at 05:58:45PM, Kevin Fjelsted wrote:
> >>>
> >>> I ran the following command to create the Markdown.
> >>>
> >>> <<<<
> >>> kfjelsted@kfjelsted-Mac desktop % pandoc -s "list test.docx" --wrap=none --reference-links -t gfm -o "list test.md"
> >>>
> >>>
> >>>
> >>>
> >>> I have attached the DOCX file that was used for the input. I have also
> >>> attached the .MD file that was produced.
> >>>
> >>> There are no bullet or number symbols in the markdown.
> >>> -Kevin
> >>>
> >>>
> >>>
> >>> On May 17, 2022, at 5:43 AM, Bastien DUMONT <[3]
> >>> bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org> wrote:
> >>>
> >>> It should work with -t markdown too (which is implicit with "-o
> >>> xxx.md"). Could you provide a small example file containing one
> >>> such list?
> >>>
> >>> On Tuesday 17 May 2022 at 05:19:24AM, Kevin Fjelsted wrote:
> >>>
> >>> Interesting I still do not get lists that are created in Word
> >>> to convert.
> >>> This is the command I tried.
> >>> $ pandoc -s "Test Word format.docx" --wrap=none --reference-links -t gfm -o "Test Word format.md"
> >>>
> >>>
> >>>
> >>>
> >>> On May 17, 2022, at 1:51 AM, 'Nikolay Mihaylov' via
> >>> pandoc-discuss <[1]
> >>> [4]pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org> wrote:
> >>>
> >>> You should use the `-t gfm` flag for bullet lists to work.
> >>>
> >>> On Sunday, May 15, 2022 at 1:29:36 AM UTC+3 kfje...@[2][5]
> >>> gmail.com wrote:
> >>>
> >>> Why is it that when I convert from Docx to Markdown,
> >>> bulleted and
> >>> numbered lists do not retain formatting i.e., no "-" in
> >>> front of each
> >>> unordered list item, and no number in front of each
> >>> numbered item.
> >>>
> >>>
> >>> --
> >>> You received this message because you are subscribed to the
> >>> Google Groups
> >>> "pandoc-discuss" group.
> >>> To unsubscribe from this group and stop receiving emails from
> >>> it, send an
> >>> email to [3][6]pandoc-discuss+unsubscribe@googlegroups.com.
> >>> To view this discussion on the web visit [4][7]https://
> >>> groups.google.com/d/
> >>> msgid/pandoc-discuss/
> >>> a59d216d-f527-4d14-a9b8-f75da3f1af19n%[8]40googlegroups.com.
> >>>
> >>>
> >>>
> >>> References:
> >>>
> >>> [1] [12]mailto:pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [2] [13]http://gmail.com/
> >>> [3] [14]mailto:pandoc-discuss+unsubscribe@googlegroups.com
> >>> [4] [15]https://groups.google.com/d/msgid/pandoc-discuss/
> >>> a59d216d-f527-4d14-a9b8-f75da3f1af19n%40googlegroups.com?
> >>> utm_medium=email&utm_source=footer
> >>> [5] [16]mailto:pandoc-discuss+unsubscribe@googlegroups.com
> >>> [6] [17]https://groups.google.com/d/msgid/pandoc-discuss/
> >>> 6F48F869-C199-4EE6-9050-278CF14C78EC%40gmail.com?utm_medium=
> >>> email&utm_source=footer
> >>>
> >>>
> >>>
> >>> References:
> >>>
> >>> [1] https://pandoc.org/
> >>> [2] mailto:bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org
> >>> [3] mailto:bastien.dumont-VwIFZPTo/vqsTnJN9+BGXg@public.gmane.org
> >>> [4] mailto:pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [5] http://gmail.com/
> >>> [6] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [7] https://groups.google.com/d/
> >>> [8] http://40googlegroups.com/
> >>> [9] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [10] https://groups.google.com/d/msgid/
> >>> [11] http://40gmail.com/
> >>> [12] mailto:pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [13] http://gmail.com/
> >>> [14] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [15] https://groups.google.com/d/msgid/pandoc-discuss/a59d216d-f527-4d14-a9b8-f75da3f1af19n%40googlegroups.com?utm_medium=email&utm_source=footer
> >>> [16] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [17] https://groups.google.com/d/msgid/pandoc-discuss/6F48F869-C199-4EE6-9050-278CF14C78EC%40gmail.com?utm_medium=email&utm_source=footer
> >>> [18] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [19] https://groups.google.com/d/msgid/pandoc-discuss/YoN8ahDbTBUMk8F1%40localhost
> >>> [20] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [21] https://groups.google.com/d/msgid/pandoc-discuss/662275E5-EDA3-4D1D-B8C2-51B4EA12C103%40gmail.com
> >>> [22] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [23] https://groups.google.com/d/msgid/pandoc-discuss/YoSVu600xH8jeOtt%40localhost
> >>> [24] mailto:pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >>> [25] https://groups.google.com/d/msgid/pandoc-discuss/06C4B1E7-54D0-44D9-98B0-D47AB6A0CB21%40gmail.com?utm_medium=email&utm_source=footer
> >>
> >> <lists_without-lo-and-with-lo.diff>
> >
>
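For anyone who wants to reproduce the comparison mentioned earlier in the thread (word/document.xml before and after LibreOffice re-saved the file), here is one possible way to do it. The thread does not say how the attached diff was produced, so this is an assumption and not the original method. The names pretty_document_xml and diff_docx are hypothetical.

```python
import difflib
import zipfile
from xml.dom import minidom


def pretty_document_xml(docx):
    """Read word/document.xml from a DOCX and return it as indented lines."""
    with zipfile.ZipFile(docx) as z:
        raw = z.read("word/document.xml")
    # document.xml is usually a single long line; indent it so the diff is readable.
    return minidom.parseString(raw).toprettyxml(indent="  ").splitlines()


def diff_docx(a, b, name_a="a.docx", name_b="b.docx"):
    """Return a unified diff (list of lines) of word/document.xml in two DOCX files."""
    return list(difflib.unified_diff(
        pretty_document_xml(a),
        pretty_document_xml(b),
        fromfile=name_a,
        tofile=name_b,
        lineterm="",
    ))
```

Calling print("\n".join(diff_docx("original.docx", "resaved.docx"))) with two placeholder file names would show which elements the second application added or rewrote.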
Thread overview: 13+ messages
2022-05-14 22:29 Kevin Fjelsted
[not found] ` <e575a611-a1ba-4875-8c92-4f41aea7dff1n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2022-05-17 6:51 ` 'Nikolay Mihaylov' via pandoc-discuss
[not found] ` <a59d216d-f527-4d14-a9b8-f75da3f1af19n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2022-05-17 10:19 ` Kevin Fjelsted
[not found] ` <6F48F869-C199-4EE6-9050-278CF14C78EC-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-17 10:43 ` Bastien DUMONT
2022-05-17 22:58 ` Kevin Fjelsted
[not found] ` <662275E5-EDA3-4D1D-B8C2-51B4EA12C103-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 6:44 ` Bastien DUMONT
2022-05-18 14:30 ` Kevin Fjelsted
[not found] ` <06C4B1E7-54D0-44D9-98B0-D47AB6A0CB21-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 15:45 ` Bastien DUMONT
2022-05-18 17:57 ` Kevin Fjelsted
[not found] ` <502CBF2E-C516-4E63-97A5-38DFC6CB22D9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 18:44 ` BPJ
2022-05-18 18:02 ` Kevin Fjelsted
[not found] ` <3EC245C1-7D01-4B78-A445-DB215285A29D-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 21:20 ` Kevin Fjelsted
[not found] ` <07CD1F5A-8D33-419E-AF1A-25107C02C360-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2022-05-18 23:25 ` Bastien DUMONT [this message]