* Docx to Markdown and Front Matter
@ 2021-04-22  9:04 Doeke Zanstra
  0 siblings, 1 reply; 3+ messages in thread
From: Doeke Zanstra @ 2021-04-22  9:04 UTC
  To: pandoc-discuss

I'm converting docx to markdown, and I need YAML front matter just
before the markdown. I recently learnt to do this with the --standalone
(-s) option.

However, it is unclear exactly how this works. I only get front matter when
the first paragraph is styled with the "Title" style (actually the Dutch
localized "Titel" style).

Are there other options available to get more metadata out of the Word
document? In Word on macOS, the menu File > Properties > Summary lists
all kinds of metadata which could be useful as front matter:

- Title
- Subject
- Author
- Manager
- Company
- Category
- Keywords
- Comments
- Hyperlink base

Can this be used? Or are there other ways to get metadata out of Word?
Or would this need a feature request in pandoc?

Thanks in advance,
Doeke Zanstra

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/100da112-ed0d-4618-b949-721a3079538bn%40googlegroups.com.

* Re: Docx to Markdown and Front Matter
@ 2021-04-26  6:01 ` John MacFarlane
  0 siblings, 1 reply; 3+ messages in thread
From: John MacFarlane @ 2021-04-26  6:01 UTC
  To: Doeke Zanstra, pandoc-discuss

There are already some issues on the tracker that seem
relevant, e.g. #3109, #3034

Doeke Zanstra writes:

> Can this be used? Or are there other ways to get metadata out of Word?
> Or would this need a feature request in pandoc?

* Re: Docx to Markdown and Front Matter
@ 2021-04-26  7:37 ` BPJ
  0 siblings, 0 replies; 3+ messages in thread
From: BPJ @ 2021-04-26  7:37 UTC
  To: pandoc-discuss; +Cc: Doeke Zanstra

At one point I experimented with a LibreOffice (or was it as long ago as
OpenOffice?) macro which pulled out the metadata and put it as lines of
KEY: VALUE pairs at the top of the text. It was not really successful, but
you might have better luck with python-docx.

On Mon, 26 Apr 2021 at 08:02, John MacFarlane wrote:

> There are already some issues on the tracker that seem
> relevant, e.g. #3109, #3034

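For what it's worth, here is a minimal sketch of the "pull the metadata out
yourself" route, using only the Python standard library rather than
python-docx. It assumes the .docx keeps its properties in the conventional
package parts docProps/core.xml (title, subject, author, category, keywords,
comments) and docProps/app.xml (manager, company, hyperlink base), which is
where Word normally writes them. The front-matter key names and the two
function names are my own choices for illustration, not anything pandoc
defines.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the two property parts of an OOXML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}

# front-matter key -> (package part, element). Key names are hypothetical.
FIELDS = {
    "title": ("docProps/core.xml", "dc:title"),
    "subject": ("docProps/core.xml", "dc:subject"),
    "author": ("docProps/core.xml", "dc:creator"),
    "category": ("docProps/core.xml", "cp:category"),
    "keywords": ("docProps/core.xml", "cp:keywords"),
    "comments": ("docProps/core.xml", "dc:description"),
    "manager": ("docProps/app.xml", "ep:Manager"),
    "company": ("docProps/app.xml", "ep:Company"),
    "hyperlink-base": ("docProps/app.xml", "ep:HyperlinkBase"),
}


def docx_properties(source):
    """Return the non-empty document properties of a .docx as a dict.

    `source` is a path or a binary file-like object.
    """
    props = {}
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
        roots = {}  # parsed XML root per part, so each part is read once
        for key, (part, tag) in FIELDS.items():
            if part not in names:
                continue  # assumption: a missing part just means no data
            if part not in roots:
                roots[part] = ET.fromstring(package.read(part))
            text = roots[part].findtext(tag, default="", namespaces=NS)
            if text and text.strip():
                props[key] = text.strip()
    return props


def yaml_front_matter(props):
    """Render the dict as a YAML block, double-quoting every value."""
    lines = ["---"]
    for key, value in props.items():
        escaped = (
            value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        lines.append(f'{key}: "{escaped}"')
    lines.append("---")
    return "\n".join(lines) + "\n"
```

Calling yaml_front_matter(docx_properties("input.docx")) gives a block you can
write in front of the markdown body, or save to a file and hand to pandoc with
--metadata-file, which together with --standalone should end up in the
generated front matter. Every value is quoted as a plain string, so keywords
stay a single string rather than a YAML list.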
end of thread, other threads:[~2021-04-26  7:37 UTC | newest]

Thread overview: 3+ messages
2021-04-22  9:04 Docx to Markdown and Front Matter Doeke Zanstra
2021-04-26  6:01 ` John MacFarlane
2021-04-26  7:37   ` BPJ