public inbox archive for pandoc-discuss@googlegroups.com
* docx with images in tables to markdown and back
@ 2022-10-21 12:30 Jan Stühler
       [not found] ` <02f65a26-e99a-4cfb-9ef7-899e7f40f899n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: Jan Stühler @ 2022-10-21 12:30 UTC (permalink / raw)
  To: pandoc-discuss


Hello group.

I use
```
pandoc -f docx -t markdown --extract-media "Lab 1-2.docx-dir" -o file.md file.docx
```
to convert a Word document to Markdown. The Word document has many images that sit in table cells. One of the results is:
```
   1 +===============================+======================================+
   2 | Click on the Link:\           | ![](./Lab 1-2.docx-dir/media/ima     |
   3 | *[Click here to start the     | ge7.png){width="3.643961067366579in" |
   4 | Local Service                 | height="1.9991174540682415in"}       |
```
(Line numbers from `vim`).

Observe the line break between `ima` and `ge7.png`.

To convert this Markdown back to Word, I use
```
pandoc -f markdown -t docx -o file-new.docx file.md
```
which results in this warning:
```
[WARNING] Could not fetch resource ./Lab%201-2.docx-dir/media/ima%20ge7.png: replacing image with description
```
Observe the `%20` between `ima` and `ge7.png`.
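
A minimal sketch of where the `%20` plausibly comes from, assuming the line break inside the wrapped table cell is read as a single space and the path is then percent-encoded. It uses Python's standard `urllib.parse.quote` purely as an illustration; it is not pandoc's own code, and `cell_lines` is a made-up name.

```python
from urllib.parse import quote

# The image path as it appears in the grid table, split across two cell lines.
cell_lines = ["./Lab 1-2.docx-dir/media/ima", "ge7.png"]

# Assumption: the line break inside the cell is treated as one space.
joined = " ".join(cell_lines)

# Percent-encode the path; every space becomes %20, slashes are kept.
escaped = quote(joined)
print(escaped)
```

Under that assumption the encoded path contains two `%20` sequences: one from the space in the directory name, which is harmless, and one from the wrap inside the file name, which points at a file that does not exist.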

Is there something I can do so that pandoc can put the images into the Word document?

Thanks a lot.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/02f65a26-e99a-4cfb-9ef7-899e7f40f899n%40googlegroups.com.


* Re: docx with images in tables to markdown and back
       [not found] ` <02f65a26-e99a-4cfb-9ef7-899e7f40f899n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2022-10-21 16:16   ` John MacFarlane
       [not found]     ` <C16FAB55-4036-460E-A3CA-5C755EB2F207-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: John MacFarlane @ 2022-10-21 16:16 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

I'd recommend trying with `--reference-links`, which might reduce the width enough for them to fit on one line.

Another option is to force pipe tables to be used.

`-t markdown-grid_tables-multiline_tables-simple_tables`

or simply

`-t gfm`
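
The two suggestions above can be sketched as complete command lines. This is a small Python helper that only assembles the pandoc arguments; the function name `docx_to_markdown_cmd` and its `variant` parameter are hypothetical, while the file names and flags are the ones used in this thread.

```python
import shlex


def docx_to_markdown_cmd(src, out, media_dir, variant="reference-links"):
    """Build a pandoc argument list for one of the two suggested variants."""
    cmd = ["pandoc", "-f", "docx", "--extract-media", media_dir]
    if variant == "reference-links":
        # Move link and image targets out of the table cells into references.
        cmd += ["-t", "markdown", "--reference-links"]
    elif variant == "pipe-tables":
        # Disable the other table extensions so pipe tables are used.
        cmd += ["-t", "markdown-grid_tables-multiline_tables-simple_tables"]
    else:
        raise ValueError(f"unknown variant: {variant}")
    cmd += ["-o", out, src]
    return cmd


cmd = docx_to_markdown_cmd("file.docx", "file.md", "Lab 1-2.docx-dir")
# shlex.join quotes the directory name, which contains a space.
print(shlex.join(cmd))
```

To actually run the conversion, the list can be passed to `subprocess.run(cmd, check=True)`, assuming pandoc is installed and on the `PATH`.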






* Re: docx with images in tables to markdown and back
       [not found]     ` <C16FAB55-4036-460E-A3CA-5C755EB2F207-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2022-10-22 18:56       ` Jan Stühler
  0 siblings, 0 replies; 3+ messages in thread
From: Jan Stühler @ 2022-10-22 18:56 UTC (permalink / raw)
  To: pandoc-discuss



Interestingly, `-t gfm` gives me HTML tables. But `--reference-links` helped me very much. Thanks a lot for that; I must have skipped it in the documentation.




