public inbox archive for pandoc-discuss@googlegroups.com
* Link issues in a Word docx document
@ 2017-01-25  9:08 Franco Fassio
From: Franco Fassio @ 2017-01-25  9:08 UTC
  To: pandoc-discuss



This is my first post, so hello to everybody, and let me thank the 
Pandoc developers for their work and for sharing it for free.

I'm writing because I found an MS Word docx document whose hyperlinks 
break when converted to HTML: the link text is just wrapped in <em> tags 
instead of being converted to HTML links.
I'm using a simple command line such as "pandoc -S in.docx -o out.html" 
(the result is the same with or without the -S option).
Curiously, if I open the problem document and save it again with Word, 
the Pandoc conversion works flawlessly.
Looking at document.xml inside the docx "zipped" file, a colleague of mine 
spotted several differences in hyperlink syntax (see attachment). This 
explains Pandoc's behaviour, but I think Pandoc should handle hyperlinks 
properly in both cases.
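
The inspection described above can be repeated with a short script. This is a minimal Python sketch, not part of the original report: it opens the docx as a zip archive, reads word/document.xml, and counts the two hyperlink encodings. The function names and the file name in.docx are illustrative assumptions. It also assumes each field instruction sits in a single <w:instrText> element, which Word does not guarantee.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_hyperlink_styles(document_xml):
    """Count <w:hyperlink> elements and HYPERLINK field instructions.

    `document_xml` is the content of word/document.xml (bytes or str).
    """
    root = ET.fromstring(document_xml)
    # Links stored as dedicated <w:hyperlink> wrapper elements.
    elements = len(list(root.iter(f"{{{W}}}hyperlink")))
    # Links stored as field codes: <w:instrText> HYPERLINK "..." </w:instrText>.
    fields = sum(
        1
        for instr in root.iter(f"{{{W}}}instrText")
        if (instr.text or "").strip().upper().startswith("HYPERLINK")
    )
    return {"w:hyperlink": elements, "HYPERLINK field": fields}


def count_in_docx(path):
    """Read word/document.xml out of the docx zip archive and count."""
    with zipfile.ZipFile(path) as archive:
        return count_hyperlink_styles(archive.read("word/document.xml"))
```

Calling count_in_docx("in.docx") on the original file and on the copy re-saved by Word should show whether the links moved from one encoding to the other.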

This looks like a Pandoc bug to me and, as instructed 
(http://pandoc.org/help.html), I'm raising it with this group. What do 
you think? Should I report it as a bug?

Thank you for your help.



-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe@googlegroups.com
To post to this group, send email to pandoc-discuss@googlegroups.com
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/edd179ad-c225-4654-bc08-fe1ed6cea04d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[-- Attachment #2: linkissue.png --]
[-- Type: image/png, Size: 392798 bytes --]


* Re: Link issues in a Word docx document
@ 2017-01-25 11:34 John MacFarlane
From: John MacFarlane @ 2017-01-25 11:34 UTC
  To: pandoc-discuss

The document on the right in your image uses

<w:instrText> HYPERLINK ... </w:instrText>

rather than <w:hyperlink>.

Pandoc handles <w:hyperlink> but not the <w:instrText> version.
It should be fairly simple to add support for the latter.
Please submit an issue on the bug tracker,
https://github.com/jgm/pandoc/issues
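
To illustrate what supporting the <w:instrText> form involves, here is a rough Python sketch. It is not pandoc's implementation (pandoc's docx reader is written in Haskell), and the function name field_hyperlinks is a hypothetical helper. It assumes the usual WordprocessingML layout of a complex field: a run with <w:fldChar w:fldCharType="begin"/>, one or more <w:instrText> runs holding HYPERLINK "url", a "separate" marker, the visible result runs, and an "end" marker. Nested fields and switches such as \l are not handled.

```python
import re
import xml.etree.ElementTree as ET

# Standard WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Matches the quoted target of a HYPERLINK field instruction.
HYPERLINK_RE = re.compile(r'^\s*HYPERLINK\s+"([^"]*)"')


def field_hyperlinks(paragraph):
    """Return (url, text) pairs for HYPERLINK fields in one <w:p> element."""
    links = []
    state = "outside"  # "outside" -> "instruction" -> "result"
    instruction, result = [], []
    for run in paragraph.iter(f"{{{W}}}r"):
        for child in run:
            if child.tag == f"{{{W}}}fldChar":
                kind = child.get(f"{{{W}}}fldCharType")
                if kind == "begin":
                    state, instruction, result = "instruction", [], []
                elif kind == "separate" and state == "instruction":
                    state = "result"
                elif kind == "end":
                    # The instruction may have been split over several runs.
                    match = HYPERLINK_RE.match("".join(instruction))
                    if match:
                        links.append((match.group(1), "".join(result)))
                    state = "outside"
            elif child.tag == f"{{{W}}}instrText" and state == "instruction":
                instruction.append(child.text or "")
            elif child.tag == f"{{{W}}}t" and state == "result":
                result.append(child.text or "")
    return links
```

Given a paragraph parsed with ET.fromstring, the function returns each field's target URL together with its visible text, which is the same pair of values a <w:hyperlink> element provides more directly.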

+++ Franco Fassio [Jan 25 17 01:08 ]:
>   This is my first post, so hello to everybody, and let me thank the
>   Pandoc developers for their work and for sharing it for free.
>   I'm writing because I found an MS Word docx document whose hyperlinks
>   break when converted to HTML: the link text is just wrapped in <em>
>   tags instead of being converted to HTML links.
>   I'm using a simple command line such as "pandoc -S in.docx -o out.html"
>   (the result is the same with or without the -S option).
>   Curiously, if I open the problem document and save it again with
>   Word, the Pandoc conversion works flawlessly.
>   Looking at document.xml inside the docx "zipped" file, a colleague of
>   mine spotted several differences in hyperlink syntax (see attachment).
>   This explains Pandoc's behaviour, but I think Pandoc should handle
>   hyperlinks properly in both cases.
>   This looks like a Pandoc bug to me and, as instructed
>   (http://pandoc.org/help.html), I'm raising it with this group. What do
>   you think? Should I report it as a bug?
>   Thank you for your help.



