public inbox archive for pandoc-discuss@googlegroups.com
* pandoc docx to md generate image with size in inches
@ 2017-05-16 13:34 'laperouse laperouse' via pandoc-discuss
       [not found] ` <c167fca6-e90b-4183-ab31-f1f657c40189-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: 'laperouse laperouse' via pandoc-discuss @ 2017-05-16 13:34 UTC (permalink / raw)
  To: pandoc-discuss



Hello,

When I generate Markdown from a docx file, images get their size in inches,
like this:

![](documentation-media/media/image19.png){width="3.4692344706911635in"
height="1.3647736220472442in"}

Is it possible to remove this information, or to have it in pixels or cm/mm?

Thanks,
Laperouse.


-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/c167fca6-e90b-4183-ab31-f1f657c40189%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: pandoc docx to md generate image with size in inches
       [not found] ` <c167fca6-e90b-4183-ab31-f1f657c40189-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2017-05-17 10:56   ` John MacFarlane
       [not found]     ` <20170517105656.GE12816-BKjuZOBx5Kn2N3qrpRCZGbhGAdq7xJNKhPhL2mjWHbk@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: John MacFarlane @ 2017-05-17 10:56 UTC (permalink / raw)
  To: 'laperouse laperouse' via pandoc-discuss

If you don't want the attributes at all, you can specify

    -t markdown-link_attributes

Here's the relevant code from the docx reader:

    extentToAttr :: Extent -> Attr
    extentToAttr (Just (w, h)) =
      ("", [], [("width", showDim w), ("height", showDim h)] )
      where
        showDim d = show (d / 914400) ++ "in"
    extentToAttr _ = nullAttr

As you can see, the image dimensions are specified in EMU (English Metric Units):
914400 EMU = 1 in.
360000 EMU = 1 cm.

I guess we could convert to cm instead of inches, and maybe
that makes more sense?

Also, we could limit the number of significant decimal
places.
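
To make the two suggestions above concrete, here is a minimal sketch of a
variant of showDim that takes a target unit and a fixed number of decimal
places. It is not the docx reader's code: the Unit type and the names
emuPer, suffix and showDimIn are hypothetical, chosen only for illustration.
The EMU factors are the ones given above (36000 EMU = 1 mm follows from
360000 EMU = 1 cm).

```haskell
import Numeric (showFFloat)

-- | Units we might emit. Illustrative type, not part of pandoc.
data Unit = Inch | Centimetre | Millimetre

-- | Number of EMU in one unit: 914400 EMU = 1 in, 360000 EMU = 1 cm.
emuPer :: Unit -> Double
emuPer Inch       = 914400
emuPer Centimetre = 360000
emuPer Millimetre = 36000

-- | Suffix appended to the rendered number.
suffix :: Unit -> String
suffix Inch       = "in"
suffix Centimetre = "cm"
suffix Millimetre = "mm"

-- | Render an EMU length in the given unit with a fixed number of decimals.
showDimIn :: Unit -> Int -> Double -> String
showDimIn u places emu = showFFloat (Just places) (emu / emuPer u) (suffix u)

main :: IO ()
main = do
  -- 3172268 EMU corresponds to the 3.469...in width in the original question.
  putStrLn (showDimIn Centimetre 2 3172268)  -- prints "8.81cm"
  putStrLn (showDimIn Millimetre 1 3172268)  -- prints "88.1mm"
```

Passing the unit and precision as arguments keeps the choice between cm, mm
and in open, which is still under discussion here.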

Any thoughts?


+++ 'laperouse laperouse' via pandoc-discuss [May 16 17 06:34 ]:
>   Hello,
>   when I generate an MD from a docx file, images have a width in inches
>   like this:
>   ![](documentation-media/media/image19.png){width="3.4692344706911635in"
>   height="1.3647736220472442in"}
>   Is it possible to remove this information or to have it in pixel or
>   cm/mm ?
>   Thanks,
>   Laperouse.

* Re: pandoc docx to md generate image with size in inches
       [not found]     ` <20170517105656.GE12816-BKjuZOBx5Kn2N3qrpRCZGbhGAdq7xJNKhPhL2mjWHbk@public.gmane.org>
@ 2017-05-17 16:43       ` 'laperouse laperouse' via pandoc-discuss
       [not found]         ` <4053cf3a-127e-4fc4-afa2-df969e80c71a-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: 'laperouse laperouse' via pandoc-discuss @ 2017-05-17 16:43 UTC (permalink / raw)
  To: pandoc-discuss



Yes, cm or mm would make more sense.
However, I noticed the issue when editing the Markdown in Visual Studio with
Markdown Editor: the attributes were passed straight through to the HTML, but
apparently HTML5 only supports CSS pixels for these sizes, not in or cm.
So maybe putting the size in CSS pixels would be more consistent?

On Wednesday, 17 May 2017 at 12:57:50 UTC+2, John MacFarlane wrote:
>
> If you don't want the attributes at all, you can specify 
>
>     -t markdown-link_attributes 
>
> Here's the relevant code from the docx reader: 
>
>     extentToAttr :: Extent -> Attr 
>     extentToAttr (Just (w, h)) = 
>       ("", [], [("width", showDim w), ("height", showDim h)] ) 
>       where 
>         showDim d = show (d / 914400) ++ "in" 
>     extentToAttr _ = nullAttr 
>
> As you can see, the image dimensions are specified in EMU: 
> 914400 EMU = 1 in. 
> 360000 EMU = 1 cm. 
>
> I guess we could convert to cm instead of inches, and maybe 
> that makes more sense? 
>
> Also, we could limit the number of significant decimal 
> places. 
>
> Any thoughts? 
>
>

* Re: pandoc docx to md generate image with size in inches
       [not found]         ` <4053cf3a-127e-4fc4-afa2-df969e80c71a-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2017-05-17 18:43           ` John MacFarlane
       [not found]             ` <20170517184326.GA30301-l/d5Ua9yGnxXsXJlQylH7w@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: John MacFarlane @ 2017-05-17 18:43 UTC (permalink / raw)
  To: 'laperouse laperouse' via pandoc-discuss

+++ 'laperouse laperouse' via pandoc-discuss [May 17 17 09:43 ]:
>   yes, cm or mm would make more sense.
>   However, I noticed the issue when editing the md in visual studio with
>   Markdown Editor: the attributes were passed directly in the html, but
>   apparently html5 only support CSS pixels for size not in or cm.
>   So, maybe putting the size in CSS pixel is more consistent ?

Pixels aren't very good when you're targeting print formats.
And there's no non-arbitrary conversion from EMU to pixels.

This page shows broad support for various units in CSS:
https://www.w3schools.com/cssref/css_units.asp
em, ex, %, px, cm, mm, in, pt, pc



* Re: pandoc docx to md generate image with size in inches
       [not found]             ` <20170517184326.GA30301-l/d5Ua9yGnxXsXJlQylH7w@public.gmane.org>
@ 2017-05-18  9:33               ` 'laperouse laperouse' via pandoc-discuss
  2017-05-18 17:18               ` 'laperouse laperouse' via pandoc-discuss
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: 'laperouse laperouse' via pandoc-discuss @ 2017-05-18  9:33 UTC (permalink / raw)
  To: pandoc-discuss



Apparently, CSS pixels are different from device pixels.
The page you linked says, at the bottom, that one CSS pixel may be rendered
as multiple device pixels if necessary.
So a CSS pixel is arbitrarily defined as 1/96 in.
As it happens, pandoc's --dpi option also defaults to 96 dots per inch.
So you can convert EMU to inches, and inches to dots via --dpi (dots which I
would assume to be CSS pixels).
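
As a rough sketch of that conversion, assuming the 914400 EMU per inch factor
from the docx reader snippet quoted earlier in the thread and a dpi of 96:
the helper name emuToPixels is hypothetical and is not something pandoc
provides.

```haskell
-- | EMU per inch, as in the docx reader snippet quoted earlier in the thread.
emuPerInch :: Double
emuPerInch = 914400

-- | Convert an EMU length to whole pixels at a given dpi.
-- Hypothetical helper for illustration, not part of pandoc.
emuToPixels :: Double -> Double -> Int
emuToPixels dpi emu = round (emu / emuPerInch * dpi)

main :: IO ()
main = do
  -- Width and height of the image from the original question, in EMU.
  print (emuToPixels 96 3172268)  -- prints 333
  print (emuToPixels 96 1247949)  -- prints 131
```

The result depends entirely on the dpi chosen, so these pixel values only
make sense if the consumer also assumes 96 dots per inch.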

In the HTML5 spec (see chapter 4.7.16), as I understand it, only CSS pixels
are supported in the width and height attributes of img, and those attributes
are precisely what pandoc produces from the attributes in the Markdown.
https://www.w3.org/TR/html5/embedded-content-0.html#attr-dim-width

In HTML5, the CSS units are only valid within CSS, not in these attributes.

If you try to validate this sample at w3.org, the validator says that only
digits are allowed in width and height:
<!DOCTYPE html>
<html>
<head>
<title>aefeaz</title>
</head>
<body>
<img width="100in" height="120in"/>
</body>
</html>

https://validator.w3.org/nu/#textarea

If you put the sizes inside a style attribute instead, they are allowed.

I agree with you that pixels are not suited for print.
In that case, carrying the width and height in an inline style attribute
allows the other CSS units, though it is more verbose.
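
To illustrate the two forms being compared, here is a small sketch that
builds each kind of img tag as a string. The function names are hypothetical
and the code does no HTML escaping; it only shows the shape of the output,
not how pandoc's HTML writer works.

```haskell
-- | Render an <img> tag whose size is carried by a style attribute, so CSS
-- units such as in, cm or mm can be used. Illustrative only; no escaping.
imgWithStyle :: FilePath -> String -> String -> String
imgWithStyle src w h =
  "<img src=\"" ++ src ++ "\" alt=\"\" style=\"width: " ++ w
    ++ "; height: " ++ h ++ ";\" />"

-- | Render an <img> tag with integer width and height attributes, which
-- HTML5 interprets as CSS pixels. Illustrative only; no escaping.
imgWithPixels :: FilePath -> Int -> Int -> String
imgWithPixels src w h =
  "<img src=\"" ++ src ++ "\" alt=\"\" width=\"" ++ show w
    ++ "\" height=\"" ++ show h ++ "\" />"

main :: IO ()
main = do
  putStrLn (imgWithStyle "image19.png" "8.81cm" "3.47cm")
  putStrLn (imgWithPixels "image19.png" 333 131)
```

The style form keeps print-friendly units at the cost of longer markup; the
attribute form is shorter but limited to pixels.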



On Wednesday, 17 May 2017 at 20:43:42 UTC+2, John MacFarlane wrote:
>
> +++ 'laperouse laperouse' via pandoc-discuss [May 17 17 09:43 ]: 
> >   yes, cm or mm would make more sense. 
> >   However, I noticed the issue when editing the md in visual studio with 
> >   Markdown Editor: the attributes were passed directly in the html, but 
> >   apparently html5 only support CSS pixels for size not in or cm. 
> >   So, maybe putting the size in CSS pixel is more consistent ? 
>
> Pixels isn't very good when you're targeting print formats. 
> And there's no non-arbitrary conversion from EMU to pixels. 
>
> This page shows broad support for various units in CSS: 
> https://www.w3schools.com/cssref/css_units.asp 
> em, ex, %, px, cm, mm, in, pt, pc 
>
>


* Re: pandoc docx to md generate image with size in inches
       [not found]             ` <20170517184326.GA30301-l/d5Ua9yGnxXsXJlQylH7w@public.gmane.org>
  2017-05-18  9:33               ` 'laperouse laperouse' via pandoc-discuss
@ 2017-05-18 17:18               ` 'laperouse laperouse' via pandoc-discuss
  2017-05-19  7:32               ` 'laperouse laperouse' via pandoc-discuss
  2017-05-20  7:29               ` Andrew Dunning
  3 siblings, 0 replies; 8+ messages in thread
From: 'laperouse laperouse' via pandoc-discuss @ 2017-05-18 17:18 UTC (permalink / raw)
  To: pandoc-discuss



True, but if I reprocess the Markdown file with Markdig, for instance, the
attributes are simply passed through rather than reprocessed the way pandoc
does.
I read in the documentation that, for img width and height, you convert the
values to pixels if the output is HTML, so these attributes are not strictly
HTML.
If they were strictly HTML5, the values would be in CSS pixels, which are
different from device pixels.
The page you linked says, at the bottom, that CSS pixels are arbitrarily
defined as 1/96 in.
The HTML5 spec says that the size attributes on img are in CSS pixels; you
cannot use in or cm.
You have to put them in a style attribute to get access to the CSS units.
So it would be good to have an option to produce those width and height
attributes in a purely HTML5-compliant form, so that the Markdown can be
reprocessed with other parsers, while keeping the possibility of working in
cm or in for print.

Thanks.

On Wednesday, 17 May 2017 at 20:43:42 UTC+2, John MacFarlane wrote:
>
> +++ 'laperouse laperouse' via pandoc-discuss [May 17 17 09:43 ]: 
> >   yes, cm or mm would make more sense. 
> >   However, I noticed the issue when editing the md in visual studio with 
> >   Markdown Editor: the attributes were passed directly in the html, but 
> >   apparently html5 only support CSS pixels for size not in or cm. 
> >   So, maybe putting the size in CSS pixel is more consistent ? 
>
> Pixels isn't very good when you're targeting print formats. 
> And there's no non-arbitrary conversion from EMU to pixels. 
>
> This page shows broad support for various units in CSS: 
> https://www.w3schools.com/cssref/css_units.asp 
> em, ex, %, px, cm, mm, in, pt, pc 
>
>


* Re: pandoc docx to md generate image with size in inches
       [not found]             ` <20170517184326.GA30301-l/d5Ua9yGnxXsXJlQylH7w@public.gmane.org>
  2017-05-18  9:33               ` 'laperouse laperouse' via pandoc-discuss
  2017-05-18 17:18               ` 'laperouse laperouse' via pandoc-discuss
@ 2017-05-19  7:32               ` 'laperouse laperouse' via pandoc-discuss
  2017-05-20  7:29               ` Andrew Dunning
  3 siblings, 0 replies; 8+ messages in thread
From: 'laperouse laperouse' via pandoc-discuss @ 2017-05-19  7:32 UTC (permalink / raw)
  To: pandoc-discuss



True for print.
However, those attributes look like HTML but are not really HTML, in the
sense that pandoc reprocesses them and converts them to pixels for HTML
output.
By the way, CSS pixels are different from device pixels. The page you linked
says, at the bottom, that a CSS pixel is arbitrarily defined as 1/96 in.
The HTML5 spec says that the size attributes are always in CSS pixels. The
other units of measure are for CSS only, so you have to put them in a style
attribute for the markup to be valid.
So if I process the Markdown with other parsers, they may not do the same
conversion as pandoc and may just copy the attributes into the HTML, which is
not good.
So it would be good to have an option to produce pure HTML5 attributes.
Thanks.

On Wednesday, 17 May 2017 at 20:43:42 UTC+2, John MacFarlane wrote:
>
> +++ 'laperouse laperouse' via pandoc-discuss [May 17 17 09:43 ]: 
> >   yes, cm or mm would make more sense. 
> >   However, I noticed the issue when editing the md in visual studio with 
> >   Markdown Editor: the attributes were passed directly in the html, but 
> >   apparently html5 only support CSS pixels for size not in or cm. 
> >   So, maybe putting the size in CSS pixel is more consistent ? 
>
> Pixels isn't very good when you're targeting print formats. 
> And there's no non-arbitrary conversion from EMU to pixels. 
>
> This page shows broad support for various units in CSS: 
> https://www.w3schools.com/cssref/css_units.asp 
> em, ex, %, px, cm, mm, in, pt, pc 
>
>


* Re: pandoc docx to md generate image with size in inches
       [not found]             ` <20170517184326.GA30301-l/d5Ua9yGnxXsXJlQylH7w@public.gmane.org>
                                 ` (2 preceding siblings ...)
  2017-05-19  7:32               ` 'laperouse laperouse' via pandoc-discuss
@ 2017-05-20  7:29               ` Andrew Dunning
  3 siblings, 0 replies; 8+ messages in thread
From: Andrew Dunning @ 2017-05-20  7:29 UTC (permalink / raw)
  To: pandoc-discuss


For tidiness, what if it were written in millimetres and also rounded to one decimal place?
