public inbox archive for pandoc-discuss@googlegroups.com
* Why does pandoc 2.5 change a hyphen into the code for
@ 2018-12-13  5:56 T. Kurt Bond
       [not found] ` <CAN1EhV9aVHCaK5pdYYP8efWB6o2WLpGMr3j4W3kDgVGqRvXNzw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: T. Kurt Bond @ 2018-12-13  5:56 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Using pandoc 2.5 under Mac OS X with pandoc -t ms --output=test.ms test.md this
Markdown code:

# Maple Trunk

![Maple Trunk](Images/maple-trunk.eps)

Turns into this MS code:

.SH 1
Maple Trunk
.pdfhref O 1 "Maple Trunk"
.pdfhref M "maple-trunk"
.PSPIC -C "Images/maple\-trunk.eps"
.ce 1000
Maple Trunk
.ce 0

Notice that pandoc has changed the innocent hyphen in Images/maple-trunk.eps
to \-, a backslash hyphen, which is actually a special character in
groff, a typographical minus. Then pandoc -t ms --output=test.pdf test.md
causes groff in the end to complain:

<standard input>:67: a special character is not allowed in a name
<standard input>:67: can't open `Images/maple': No such file or directory

and makes the picture inclusion fail.

This used to work fine.

Why does pandoc change the hyphen into a troff code for a minus? I can’t see
any reason for it.
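
As a stopgap, the ms output could be post-processed before it reaches groff.
The following is only a sketch in Python, not something pandoc provides: the
function name unescape_pspic_hyphens is made up, and it assumes every .PSPIC
request starts its own line, as roff requests must.

```python
def unescape_pspic_hyphens(ms_text: str) -> str:
    """Turn \\- back into a plain - on .PSPIC request lines only.

    Hypothetical helper: other lines keep whatever escaping they had.
    """
    fixed = []
    for line in ms_text.splitlines():
        if line.startswith(".PSPIC"):
            # groff rejects special characters such as \- in file names.
            line = line.replace("\\-", "-")
        fixed.append(line)
    return "\n".join(fixed)


sample = '.PSPIC -C "Images/maple\\-trunk.eps"\nBody text with \\- left alone.'
print(unescape_pspic_hyphens(sample))
```

It would sit between pandoc -t ms and whatever groff command is normally used
to produce the PDF.
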
-- 
T. Kurt Bond, tkurtbond-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CAN1EhV9aVHCaK5pdYYP8efWB6o2WLpGMr3j4W3kDgVGqRvXNzw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


* Re: Why does pandoc 2.5 change a hyphen into the code for
       [not found] ` <CAN1EhV9aVHCaK5pdYYP8efWB6o2WLpGMr3j4W3kDgVGqRvXNzw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-12-13 14:54   ` John MacFarlane
       [not found]     ` <m2sgz14gui.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: John MacFarlane @ 2018-12-13 14:54 UTC (permalink / raw)
  To: T. Kurt Bond, pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


In roff:

\- is the ASCII minus/hyphen character.
-  is a Unicode hyphen.

Pandoc just treats all hyphens as \- (ASCII
minus/hyphen). Why?  Because this is often what is
wanted.  From the groff man page:

  \-     Minus sign.  Also use this to display syntax elements that
         require the ASCII hyphen-minus character, for example command-
         line options and C language operators.  The unescaped ‘-’
         input character is not appropriate for these cases because it
         may render as a hyphen on some output devices.

If we didn't escape the -s, then when someone writes
`--help` in their man page, it would render with
Unicode hyphens and would not be copy-pasteable.

That said, I agree that the escaping shouldn't occur
in the filename context you give.

One potential solution would be to escape - as \- only
when it's in a code block or span.  I assume people
use these for command-line options, but that may not
be right.

Or we could just distinguish this filename context for
special treatment.
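
To make the two options concrete, here is a rough sketch in Python. It is not
pandoc's actual Haskell code; the function escape_hyphens and its in_code and
is_filename parameters are invented purely for illustration.

```python
def escape_hyphens(text: str, *, in_code: bool = False,
                   is_filename: bool = False) -> str:
    """Escape - as \\- only inside code, and never inside a file name.

    Hypothetical illustration of the two proposals, combined:
    - in_code:     the text comes from a code block or code span
    - is_filename: the text is a file name argument (e.g. for .PSPIC)
    """
    if is_filename or not in_code:
        return text
    return text.replace("-", "\\-")


print(escape_hyphens("--help", in_code=True))
print(escape_hyphens("Images/maple-trunk.eps", is_filename=True))
print(escape_hyphens("a well-known problem"))
```

Only the first call changes its input; the file name and the ordinary prose
come back untouched.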

One way or the other, I think we need an issue on the tracker.


* Re: Why does pandoc 2.5 change a hyphen into the code for
       [not found]     ` <m2sgz14gui.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
@ 2018-12-13 16:07       ` T. Kurt Bond
       [not found]         ` <CAN1EhV9g4p5zb-vOerV-QYGO+xfDCDXnuDwnALNZWCwnNMpB0g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: T. Kurt Bond @ 2018-12-13 16:07 UTC (permalink / raw)
  To: jgm-TVLZxgkOlNX2fBVCVOL8/A; +Cc: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Take the following Markdown file and format it with pandoc -w ms
--output=test.pdf:

# Trying out Hyphen-Minus in pandoc -w ms

This is a hyphen, -, which pandoc -w ms turns into a Unicode minus sign.

Use the command ``ls -a`` to see all the dotfiles in a directory.


Open up the PDF and copy and paste the command into your shell and try it.
It will complain that

ls: −a: No such file or directory


This is because the character before the "a" is a Unicode MINUS SIGN, code point
0x2212, not an ASCII hyphen, also known as the Unicode HYPHEN-MINUS, code
point 45.

This is the exact opposite of what you want.
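
The two characters are easy to tell apart by code point. A small Python check
(the helper name describe is just for this example) prints the code point and
the official Unicode name of each:

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe("-"))       # U+002D HYPHEN-MINUS
print(describe("\u2212"))  # U+2212 MINUS SIGN
```

Pasting the character from the PDF into describe() shows which one the shell
actually received.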

The practice of using \- in man pages for the option indicator started
because \- looks better typeset, and in nroff it was interpreted as an
ASCII minus anyway, since that's all the old terminals could display.  But
if you look at groff's an-old.tmac macro file you'll find near the bottom
the following section:

.\" For UTF-8, map some characters conservatively for the sake
.\" of easy cut and paste.
.
.if '\*[.T]'utf8' \{\
.  rchar \- - ' `
.
.  char \- \N'45'
.  char  - \N'45'
.  char  ' \N'39'
.  char  ` \N'96'
.\}

This section *removes* the \- character (along with some others) and
explicitly redefines it as code point 45, the HYPHEN-MINUS.  They had to do
this for groff's -Tutf8 output mode, because so many man pages use \- for
the option character and people looking at man pages in UTF-8 were getting
the wrong character when they cut and pasted.
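
The effect of the four char requests quoted above can be modelled as a plain
lookup table. This is only a toy model in Python, not how groff works
internally; the names UTF8_MAN_MAP and render_for_utf8 are invented for the
example.

```python
# Code points assigned by the `char` requests in the quoted an-old.tmac
# section, keyed by the roff input that triggers them.
UTF8_MAN_MAP = {
    "\\-": 45,  # \- becomes HYPHEN-MINUS
    "-": 45,    # plain - becomes HYPHEN-MINUS
    "'": 39,    # ' becomes APOSTROPHE
    "`": 96,    # ` becomes GRAVE ACCENT
}


def render_for_utf8(roff_input: str) -> str:
    """Return the output character this toy model assigns to the input."""
    return chr(UTF8_MAN_MAP[roff_input])


for roff_input in UTF8_MAN_MAP:
    print(repr(roff_input), "->", repr(render_for_utf8(roff_input)))
```

In this model both \- and - come out as the same pasteable ASCII character,
which is the point of the remapping.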



-- 
T. Kurt Bond, tkurtbond-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org


* Re: Why does pandoc 2.5 change a hyphen into the code for
       [not found]         ` <CAN1EhV9g4p5zb-vOerV-QYGO+xfDCDXnuDwnALNZWCwnNMpB0g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-12-13 19:25           ` John MacFarlane
  0 siblings, 0 replies; 4+ messages in thread
From: John MacFarlane @ 2018-12-13 19:25 UTC (permalink / raw)
  To: T. Kurt Bond; +Cc: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


I tried your experiment, and you are right.  The \-
turns into a Unicode minus sign.

Now I'm really confused, though!  The passage I quoted
from the man page says the opposite:

\- Minus sign. Also use this to display syntax
   elements that require the ASCII hyphen-minus
   character, for example command-line options and C
   language operators. The unescaped ‘-’ input character
   is not appropriate for these cases because it may
   render as a hyphen on some output devices.

But now I can't find this passage in my man pages!
Perhaps I was finding an outdated version of the man
page on the internet?

Anyway, I'll make this change right away.

