public inbox archive for pandoc-discuss@googlegroups.com
* Replace '^' (caret) with `\textasciicircum` instead of `\^{}` in latex?
@ 2023-08-18  1:33 Jonathan Whiteley
From: Jonathan Whiteley @ 2023-08-18  1:33 UTC (permalink / raw)
  To: pandoc-discuss



While rendering R Markdown to pdf (via LaTeX), I noticed an odd behaviour:
within code chunks (with syntax highlighting), the caret character ('^', aka
"hat" aka "circumflex") was changed into something else in the pdf output
(specifically extended-ASCII code 136, the "Modifier letter circumflex
accent"). This is an issue because it means that *copying* the formatted
code from the pdf results in text that cannot be pasted and used in the same
language (e.g., on the command line) without triggering errors. In other
words, the input cannot be reproduced from the output, which is a problem
for code examples.

It looks like pandoc is substituting every instance of '^' in the input 
file with "\^{}" --- except in plain fenced code chunks, which are output 
into a `verbatim` environment in the LaTeX output.

My question is: why not substitute '^' with `\textasciicircum` instead?
1. It would be more consistent with the other substitutions in 
`Writers/LaTeX/Util.hs` (line 121)
2. I think it's more user-friendly, since most users would expect to get 
the same character in the output as in the input.
3. It is a better semantic representation of a lone caret character.

My understanding is that `\^{}` is a command for adding a circumflex accent
to a letter (which is given within the curly braces), but with an empty
argument, that LaTeX command merely typesets an accent *with no letter
underneath*. That becomes character code 136 (a circumflex accent
*modifier*, outside the plain ASCII range) in the pdf output, whereas
`\textasciicircum` produces ASCII code 94 ("Caret - circumflex") in the pdf
output, which more closely matches the input. I know they might *look* the
same in the pdf output (depending on the font), but as I described earlier,
they are *functionally* and *semantically* different.
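
The difference described above can be compared with a small test document.
This is only a sketch: the T1 font encoding and the `2^10` sample text are
assumptions chosen for illustration, not what pandoc emits, and the
characters actually recovered when copying from the pdf may depend on the
font encoding and the pdf viewer.

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % assumption: T1 encoding; results may differ under other encodings
\begin{document}

% Accent command with an empty argument: a circumflex accent over nothing.
\texttt{2\^{}10}

% Text symbol intended to represent a standalone caret.
\texttt{2\textasciicircum{}10}

\end{document}
```

Copying each line from the resulting pdf and pasting it into a plain-text
editor should show whether the two commands yield the same character.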

In practice, the escaping is not strictly necessary in the various `\Verb` 
environments and commands used by pandoc to format code.  In fact, I notice 
it is not escaped at all in a plain fenced code block, which is converted 
to a `verbatim` environment in the LaTeX output.  But if you are going to 
escape it, why is `\^{}` preferred over `\textasciicircum`?
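
The point about verbatim contexts can be checked in the same way. The sketch
below uses the `fancyvrb` package; the `commandchars` setting is an
assumption meant to resemble a `Verbatim`-based environment in which
commands are still active, and it is not copied from pandoc's template. The
T1 font encoding is likewise an assumption.

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % assumption: T1 encoding
\usepackage{fancyvrb}
\begin{document}

% Lines inside Verbatim are printed literally, so comments stay out here.
% Line 1: unescaped caret; line 2: \^{}; line 3: \textasciicircum{}.
\begin{Verbatim}[commandchars=\\\{\}]
2^10
2\^{}10
2\textasciicircum{}10
\end{Verbatim}

\end{document}
```

If the first line compiles and copies back as a plain caret, that would
support the claim that escaping is not strictly required in this kind of
environment.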
