public inbox archive for pandoc-discuss@googlegroups.com
From: BPJ <melroch-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Cc: Doeke Zanstra <doeke-5rSQWjF5bFWbyly6AaOUig@public.gmane.org>
Subject: Re: Docx to Markdown and Front Matter
Date: Mon, 26 Apr 2021 09:37:30 +0200
Message-ID: <CADAJKhCdeN25Gbebh=nW-viyn2NozKwNjZqHyJxq7j5qj4WrUg@mail.gmail.com>
In-Reply-To: <m235vdr3d6.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>


At one point I experimented with a LibreOffice (or was it as long ago as
OpenOffice?) macro that pulled out the metadata and put it at the top of
the text as lines of KEY: VALUE pairs. It was not really successful, but
you might have better luck with python-docx.
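
For illustration, here is a minimal sketch of that idea. It does not use
python-docx; it uses only the Python standard library. It assumes the
.docx keeps its core properties in the usual place, docProps/core.xml.
The function names and the front matter key names are made up for this
example and are not part of pandoc or python-docx.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used inside docProps/core.xml of an OOXML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Front matter key -> element in core.xml (the key names are my own choice).
CORE_FIELDS = {
    "title": "dc:title",
    "subject": "dc:subject",
    "author": "dc:creator",
    "keywords": "cp:keywords",
    "category": "cp:category",
    "description": "dc:description",
}


def read_core_properties(docx_path):
    """Return the non-empty core properties of a .docx file as a dict.

    Assumes the part lives at docProps/core.xml; raises KeyError if not.
    """
    with zipfile.ZipFile(docx_path) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    props = {}
    for key, tag in CORE_FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text and element.text.strip():
            props[key] = element.text.strip()
    return props


def to_front_matter(props):
    """Render a dict of strings as a YAML front matter block."""
    lines = ["---"]
    for key, value in props.items():
        # Double-quote every value so colons and the like cannot break the YAML.
        escaped = (
            value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        lines.append(f'{key}: "{escaped}"')
    lines.append("---")
    return "\n".join(lines) + "\n"
```

Something like print(to_front_matter(read_core_properties("input.docx")))
would then give a block to put in front of the Markdown that pandoc
produces. As far as I know, Manager, Company and Hyperlink base are stored
in a separate part, docProps/app.xml, so they would need the same
treatment with a different namespace.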


On Mon, 26 Apr 2021 at 08:02, John MacFarlane <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org> wrote:

>
> There are already some issues on the tracker that seem
> relevant, e.g. #3109, #3034
>
> Doeke Zanstra <doeke-5rSQWjF5bFWbyly6AaOUig@public.gmane.org> writes:
>
> > I'm converting docx to markdown, and I need a YAML front matter heading
> > just before the markdown. I recently learnt to do this with the
> > --standalone argument.
> >
> > However, it is unclear exactly how this works. I only get front matter
> > when the first paragraph is styled with the "Title" style (actually the
> > Dutch localized "Titel" style).
> >
> > Are there other options available to get more metadata out of the Word
> > document? In Word on macOS, the menu File > Properties > Summary lists
> > all kinds of metadata that could be useful as front matter:
> >
> > - Title
> > - Subject
> > - Author
> > - Manager
> > - Company
> > - Category
> > - Keywords
> > - Remarks
> > - Hyperlink base
> >
> > Can this be used? Or are there other ways to get metadata out of Word?
> > Or would this need a feature request in pandoc?
> >
> > Thanks in advance,
> > Doeke Zanstra
> >
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "pandoc-discuss" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> > an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/pandoc-discuss/100da112-ed0d-4618-b949-721a3079538bn%40googlegroups.com
>


Thread overview: 3+ messages
2021-04-22  9:04 Doeke Zanstra
2021-04-26  6:01 ` John MacFarlane
2021-04-26  7:37   ` BPJ [this message]
