From: BPJ
Newsgroups: gmane.text.pandoc
Subject: Re: Docx to Markdown and Front Matter
Date: Mon, 26 Apr 2021 09:37:30 +0200
To: pandoc-discuss
Cc: Doeke Zanstra

At one point I experimented with a LibreOffice (or was it as long ago as OpenOffice?) macro which pulled out metadata and put it as lines of KEY: VALUE pairs at the top of the text. It was not really successful, but you might have better luck with python-docx.
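
For what it is worth, here is a minimal sketch of the python-docx route, not a tested solution. It assumes python-docx is installed and reads only the document's core properties (title, subject, author, category, keywords, comments). As far as I know, Manager, Company and Hyperlink base live in a separate part of the file (`docProps/app.xml`) that `core_properties` does not cover, so they are left out. The names `read_core_properties` and `front_matter` are made up for illustration.

```python
import json


def read_core_properties(path):
    """Return selected core properties of a .docx file as a dict."""
    # python-docx is imported here so the formatter below works without it.
    from docx import Document

    props = Document(path).core_properties
    wanted = ("title", "subject", "author", "category", "keywords", "comments")
    return {name: getattr(props, name) for name in wanted}


def front_matter(props):
    """Format the non-empty properties as a YAML front matter block."""
    lines = ["---"]
    for key, value in props.items():
        if value:  # skip empty or missing properties
            # A JSON string is also a valid YAML double-quoted scalar,
            # so json.dumps takes care of the quoting and escaping.
            lines.append(f"{key}: {json.dumps(str(value), ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n"
```

Calling `front_matter(read_core_properties("input.docx"))` (with `input.docx` standing in for your file) gives a block you could prepend to the markdown that pandoc produces without the standalone option.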


On Mon, 26 Apr 2021 at 08:02, John MacFarlane <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org> wrote:

There are already some issues on the tracker that seem
relevant, e.g. #3109, #3034

Doeke Zanstra <doeke-5rSQWjF5bFWbyly6AaOUig@public.gmane.org> writes:

> I'm converting docx to markdown, and I need a YAML front matter heading
> just before the markdown. I recently learnt to do this with the
> --standalone argument.
>
> However, it is opaque how this exactly works. I only get front matter when
> the first paragraph is styled with the "Title" style (actually the Dutch
> localized "Titel" style).
>
> Are there other options available to get more meta-data out of the Word
> document? Via Word on macOS via the menu Archive > Properties > Summary,
> there are all kinds of meta data which could be useful as front matter:
>
> - Title
> - Subject
> - Author
> - Manager
> - Company
> - Category
> - Keywords
> - Remarks
> - Hyperlink base
>
> Can this be used? Or are there other ways to get meta-data out of Word?
> Or would this need a feature request in pandoc?
>
> Thanks in advance,
> Doeke Zanstra
>
>
> --
> You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/100da112-ed0d-4618-b949-721a3079538bn%40googlegroups.com.
