From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bastien DUMONT
Newsgroups: gmane.text.pandoc
Subject: Re: Docx reader and numbered customized styles
Date: Wed, 15 Nov 2023 08:06:33 +0000
References: <53f12b55-0d77-42de-bba2-b88e91f59eecn@googlegroups.com> <6bc0ec42-4f2b-4832-8b08-827b913669cen@googlegroups.com> <5652a76c-59ab-4056-ac00-92732e13698en@googlegroups.com>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Content-Type: text/plain; charset="UTF-8"

There is a pandoc.zip module. I have never used it, but it seems that you can
get the uncompressed content of one file from the archive in Lua without
writing it to disk.

On Wednesday 15 November 2023 at 05:32:00AM, Ioan Muntean wrote:
> Bastien,
> Thanks! This looks helpful. I will try to play with the Lua ByteStringReader
> and then lpeg. The first question I have is how do I deal with a docx file as a
> zip file in Lua?
> Or should I unzip the docx first in a pipeline .bat command?
> Thanks in advance!
> Ioan
>
> ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
> From: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org on
> behalf of Bastien DUMONT
> Sent: Tuesday, November 14, 2023 4:12 PM
> To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> Subject: Re: Docx reader and numbered customized styles
>
> I guess that it could involve writing a custom reader for the docx format that
> would do `pandoc.read(input, 'docx')` to get the Pandoc AST of the document,
> uncompress the DOCX file, read the "styles" file, and set global metadata in
> the AST matching the configuration of the Headings styles. Then, this metadata
> may be used by a filter while exporting to LaTeX.
>
> Well, I think that it would be easier to rename the heading styles or to insert
> some information at the beginning of your file to be processed and removed by
> the filter for LaTeX export.
>
> On Tuesday 14 November 2023 at 01:55:56PM, Ioan Muntean wrote:
> > Hi Bastien
> > I have a related question that is not immediately connected to special styles,
> > but the Headings 1, Headings 2 etc.
> > In my MS Word document, Headings 1 and so on are numbered with a specific set
> > of multilist levels. I am curious whether there is a way to pass the type of
> > numbering from Headings 1 style in Word to markdown or later to LaTeX. I work
> > often with Lua filters, but in the -t native format of docx, Headings do not
> > have any specification, online numbered list or special paragraphs. So how do
> > we recover the numbering of Headings styles?
> > One way to deal with it would be to rename Headings 1 to headingsnumbered 1 and
> > deal with that special style. Is there any other way to do this?
> > Thanks in advance!
> > Ioan
> >
> > On Thursday, October 26, 2023 at 11:49:05 AM UTC-5 Bastien DUMONT wrote:
> >
> > > So is the -f docx+styles working with the docx reader, too? If so, how?
> >
> > -f docx+styles means “use the docx reader and enable the ‘styles’
> > extension”, so yes! As is written in the manual, it renders the styles as
> > divs and spans with a “custom-style” attribute. You will have to use a
> > filter to convert some of these divs and spans to whatever code you want in
> > your LaTeX file.
> >
> > Or are you talking about customized lists, not custom styles?
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "pandoc-discuss" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email
> > to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/pandoc-discuss/5652a76c-59ab-4056-ac00-92732e13698en%40googlegroups.com.
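
The idea from the top of this message (pull one file out of the docx archive without writing anything to disk, then look at the "styles" file) can be sketched as follows. This is only an illustration in Python, not pandoc's Lua API: the exact pandoc.zip calls are not spelled out in this thread, so Python's standard zipfile module stands in for it here. The helper names (read_docx_part, numbered_styles, make_demo_docx) are hypothetical, and the sketch assumes that the heading numbering is recorded as a w:numPr element inside the style's w:pPr in word/styles.xml, which should be checked against the real document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/styles.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"


def read_docx_part(docx_bytes, part="word/styles.xml"):
    """Return the uncompressed bytes of one archive member, entirely in memory."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return archive.read(part)


def numbered_styles(styles_xml):
    """Map style id -> numId for each style whose paragraph properties hold w:numPr."""
    root = ET.fromstring(styles_xml)
    result = {}
    for style in root.iter(W + "style"):
        num_id = style.find(f"{W}pPr/{W}numPr/{W}numId")
        if num_id is not None:
            result[style.get(W + "styleId")] = num_id.get(W + "val")
    return result


def make_demo_docx():
    """Build a tiny stand-in archive (made-up content, not a complete docx)."""
    styles = (
        f'<w:styles xmlns:w="{W_NS}">'
        '<w:style w:type="paragraph" w:styleId="Heading1">'
        '<w:pPr><w:numPr><w:numId w:val="3"/></w:numPr></w:pPr>'
        '</w:style>'
        '<w:style w:type="paragraph" w:styleId="Normal"/>'
        '</w:styles>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/styles.xml", styles)
    return buffer.getvalue()


if __name__ == "__main__":
    # With a real file, read its bytes with open(path, "rb").read() instead.
    docx_bytes = make_demo_docx()
    print(numbered_styles(read_docx_part(docx_bytes)))
```

In a Lua custom reader the same two steps would presumably happen inside the reader function, with the result stored in the document metadata for a later filter to use when exporting to LaTeX. Either way, no separate unzip step in a .bat pipeline is needed.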
-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/ZVR8CSsE-bsZR5wJ%40localhost.