public inbox archive for pandoc-discuss@googlegroups.com
* "title" style ignored when converting from docx to markdown
From: Juan M. @ 2020-11-01 14:31 UTC (permalink / raw)
  To: pandoc-discuss



In short, I am trying to convert a batch of .docx files to markdown, and *pandoc
seems to omit text that uses the Microsoft Word "Title" style* (the text does
not appear in the output file at all); only text in the "Heading" styles is
converted. Compare the screenshots of the original file and the output.

For the sake of simplicity, I tend to use only the styles "Title", "Heading
1", "Heading 2" and so on, to reduce friction when writing
(see the "docx" screenshot). Most of my files follow this structure.
Although I could alter them manually, or start writing with only
"Heading" styles, *I would like to know whether pandoc has an option
that would fix this, or make the alterations simpler*.

The command I am using is the following (Windows PowerShell):

    pandoc "document title.docx" -f docx -t markdown --atx-headers -o "document title.md"

I chose ATX headings (with # instead of ===) because Visual
Studio Code does not render the other style (setext), though I have no idea
why.

Thanks in advance for any responses,
Juan.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/e485b64e-f091-4820-8ed5-d3f38afc610bn%40googlegroups.com.

[-- Attachment #2: markdown-preview.png --]
[-- Type: image/png, Size: 92304 bytes --]

[-- Attachment #3: docx.png --]
[-- Type: image/png, Size: 72527 bytes --]

[-- Attachment #4: markdown-souce.png --]
[-- Type: image/png, Size: 93150 bytes --]


* Re: "title" style ignored when converting from docx to markdown
From: John MacFarlane @ 2020-11-01 21:46 UTC (permalink / raw)
  To: Juan M., pandoc-discuss


Are you using the -s (--standalone) option?

* Re: "title" style ignored when converting from docx to markdown
From: Juan M. @ 2020-11-02  3:41 UTC (permalink / raw)
  To: pandoc-discuss


I confess I had not seen this option before (I read every section of the
manual that mentions "title", but should have looked under "header" as well, haha).

However, when I tried it, pandoc produced a header that VSCode apparently does
not recognize, so the title shows up only in the source file, not in the
preview. I had hoped the title would take the position of the first-level
heading (after all, it is the only one with a line underneath it).
Should I be reading the .md file with something else?

Apologies for my ignorance, but what does the option do? Searching
the manual, the "Metadata blocks" section does mention one use for it:
producing an HTML file or, god forbid, LaTeX. For now, though, just the
md file is enough for me.
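
One possible workaround for the preview problem, as a minimal sketch only: post-process the file written with -s so that the title from the metadata block becomes a first-level ATX heading. This assumes the output starts with a YAML metadata block ("---" ... "---") holding a simple single-line "title:" field; the function name promote_title is made up for illustration and is not part of pandoc.

```python
import re


def promote_title(markdown_text: str) -> str:
    """Turn a leading YAML `title:` field into a level-1 ATX heading.

    Assumption: the text starts with a YAML metadata block and the
    title is a plain single-line value. Anything else is returned as-is.
    """
    lines = markdown_text.splitlines()
    # Only act when the file starts with a YAML metadata block.
    if not lines or lines[0].strip() != "---":
        return markdown_text

    # Find the line that closes the block ("---" or "...").
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() in ("---", "..."):
            end = i
            break
    if end is None:
        return markdown_text

    title = None
    kept = []  # metadata lines other than the title
    for line in lines[1:end]:
        match = re.match(r"title:\s*(.+)$", line)
        if match and title is None:
            title = match.group(1).strip().strip("'\"")
        else:
            kept.append(line)
    if title is None:
        return markdown_text

    # Drop blank lines between the metadata block and the body.
    body = lines[end + 1:]
    while body and not body[0].strip():
        body.pop(0)

    out = []
    if kept:
        # Keep any remaining metadata (author, date, ...) in place.
        out += ["---"] + kept + [lines[end], ""]
    out += ["# " + title, ""] + body
    return "\n".join(out) + "\n"
```

Note that the promoted title ends up at the same level as any existing "Heading 1" text, so the other headings may need to be demoted by one level if the title should be the only first-level heading.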


