I'm getting backslashes and grave accents in my files when converting from RTF to Markdown. (These are files I previously exported from Apple's Pages app to RTF.) The backslashes appear at the ends of lines and also before runs of periods (such as an ellipsis). The grave accents surround characters such as smart quotes and em-dashes, and they also appear at the start of most (but not all) lines in the document. The command I'm using to go from rtf to md is:

for f in *.rtf; do pandoc --wrap=none "$f" -s -o "${f%.rtf}.md"; done

When I export the same files from Pages to docx and then convert to md, I don't get the grave accents, but I do get some backslashes at the ends of lines. I also get [“]{dir="rtl"} where there's a left double quotation mark and [’]{dir="rtl"} where there's a smart apostrophe (i.e. a right single quotation mark). As far as I can tell, dir="rtl" is an HTML attribute for scripts such as Arabic that are read right-to-left. I'm clueless as to what that has to do with my documents, which use only English and were never in HTML. The command I'm using to go from docx to md is:

for f in *.docx; do pandoc --wrap=none -t markdown-smart "$f" -s -o "${f%.docx}.md"; done

(I have to use -t markdown-smart, or the smart quotes aren't preserved. But I have a similar issue if I leave it out: I get ["]{dir="rtl"} and [']{dir="rtl"}.)
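
As a stopgap, I could presumably strip the wrappers after conversion with something like the sketch below. It only removes the [X]{dir="rtl"} spans from the Markdown output and does nothing to explain where they come from; the function name strip_rtl_spans is made up for illustration, and I'm assuming the spans always have exactly this form.

```shell
# Hypothetical helper: unwrap [X]{dir="rtl"} spans, keeping only X.
# Assumes the span content never contains a closing square bracket.
strip_rtl_spans() {
  sed -E 's/\[([^]]*)\]\{dir="rtl"\}/\1/g' "$@"
}

# Possible usage on one converted file (writes a cleaned copy):
#   strip_rtl_spans chapter1.md > chapter1.clean.md
```

This writes the cleaned text to standard output, so the original .md file is left untouched unless the output is redirected over it.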


Does anyone have any thoughts on what might be going on? Clearly there are issues with these files—though not ones that are apparent in Pages—but I have no idea what the issues are.
