From: Julien Dutant
Newsgroups: gmane.text.pandoc
Subject: Re: XML-ID when converting to markdown
Date: Thu, 6 Apr 2023 07:22:17 -0700 (PDT)
To: pandoc-discuss

Oops, remove the "print(id)" line in the filter script above, it was meant for debugging.

On Thursday, April 6, 2023 at 3:20:48 PM UTC+1 Julien Dutant wrote:
This filter replaces every empty Span whose id starts with "x" followed by a digit with a space. Save it as "removeXSpans.lua":

function Span(el)
  return #el.content == 0 and el.identifier:match('^x%d') and pandoc.Space()
    or el
end
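
To check which identifiers the pattern '^x%d' actually catches, here is a small stand-alone Lua sketch that runs without pandoc. The helper name `is_generated_id` and every sample id except x1-10001 are made up for illustration.

```lua
-- Hypothetical helper: true when an identifier starts with "x" followed by
-- a digit, which is the test the filter applies to each Span's identifier.
local function is_generated_id(id)
  return id:match('^x%d') ~= nil
end

-- Only 'x1-10001' comes from the thread; the other ids are invented samples.
local samples = { 'x1-10001', 'x2-30002', 'xref', 'intro' }
for _, id in ipairs(samples) do
  print(id, is_generated_id(id))
end
```

The first two samples match and the last two do not, so a hand-written id such as "xref" would be left alone. The `a and b or c` form in the filter is safe here because the replacement value (a Space element) is never false or nil.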

Then run pandoc with the `-L removeXSpans.lua` option, e.g.
pandoc -f docbook sourcefile -t markdown -o outfile.md -L removeXSpans.lua

Result:
# 1 Introduction

However, this will break any link to #x1-10001. If there are internal links in the doc (e.g. from the table of contents) that you need to preserve, you need a filter that produces instead:
# 1 Introduction {#x1-10001}

Perhaps this will work (it'd help to have a sample docbook source), saved as removeXSpans.lua and used as above:

function Header(hd)
  local id = ''
  hd.content = hd.content:walk {
    Span = function(el)
      if #el.content == 0 and el.identifier:match('^x%d') then
        id = el.identifier
        return pandoc.Space()
      end
    end
  }
  print(id) -- debugging leftover, safe to delete
  if id ~= '' then
    hd.identifier = id
    return hd
  end
end
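
The logic of that Header filter can be tried out without pandoc. Below is a rough sketch that uses plain Lua tables as stand-ins for pandoc elements; the function name `promote_span_id` and the field names `t` and `text` are a simplified mock invented for this example, not pandoc's API. Unlike `walk`, it only looks at the top-level inlines of the heading.

```lua
-- Mock of the Header filter's logic on plain Lua tables (not pandoc objects).
local function promote_span_id(header)
  local id = ''
  local new_content = {}
  for _, inline in ipairs(header.content) do
    local is_empty_x_span = inline.t == 'Span'
      and #inline.content == 0
      and inline.identifier:match('^x%d') ~= nil
    if is_empty_x_span then
      id = inline.identifier                           -- remember the anchor
      new_content[#new_content + 1] = { t = 'Space' }  -- span becomes a space
    else
      new_content[#new_content + 1] = inline
    end
  end
  header.content = new_content
  if id ~= '' then
    header.identifier = id  -- move the anchor onto the heading itself
  end
  return header
end

-- Mock of the heading "# 1[]{#x1-10001}Introduction" from the question.
local heading = {
  identifier = '',
  content = {
    { t = 'Str', text = '1' },
    { t = 'Span', identifier = 'x1-10001', content = {} },
    { t = 'Str', text = 'Introduction' },
  },
}

local result = promote_span_id(heading)
print(result.identifier)    -- x1-10001
print(result.content[2].t)  -- Space
```

As in the filter above, if a heading held several matching spans the last id would win.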

On Wednesday, April 5, 2023 at 11:38:07 PM UTC+1 hcf wrote:
I'm converting from DocBook to Markdown.

In DocBook there are xml:id attributes. When I convert to Markdown, these are rendered as

[]{#x1-10001}.


A Markdown heading looks like this when converting from DocBook.

# 1[]{#x1-10001}Introduction


Is there a way to turn this off?


best regards

hcf
