From: hcf
Newsgroups: gmane.text.pandoc
Subject: Re: XML-ID when converting to markdown
Date: Fri, 7 Apr 2023 06:16:08 -0700 (PDT)
Message-ID: <88b9fb33-d8cf-4221-af8d-22c26c3c5033n@googlegroups.com>
References: <941a4fdb-f161-42e5-856b-d98e88db882dn@googlegroups.com> <115179cd-21e7-4e29-aea0-add708149ce0n@googlegroups.com>
To: pandoc-discuss

Thanks! This works.

On Thursday, 6 April 2023 at 16:22:17 UTC+2, Julien Dutant wrote:

> Oops, remove the "print(id)" line in the filter script above; it was meant
> for debugging.
>
> On Thursday, April 6, 2023 at 3:20:48 PM UTC+1 Julien Dutant wrote:
>
>> This filter will remove all empty Spans whose id starts with x followed
>> by a digit (each one is replaced by a space). Save as "removeXSpans.lua":
>>
>>     function Span(el)
>>       return #el.content == 0 and el.identifier:match('^x%d') and pandoc.Space()
>>         or el
>>     end
>>
>> And run pandoc with the `-L removeXSpans.lua` option, e.g.
>>
>>     pandoc -f docbook sourcefile -t markdown -o outfile.md -L removeXSpans.lua
>>
>> Result:
>>
>>     # 1 Introduction
>>
>> However, this will break any link to #x1-10001. If there are internal
>> links in the doc (e.g. from the table of contents) that you need to
>> preserve, you need a filter that instead produces:
>>
>>     # 1 Introduction {#x1-10001}
>>
>> Perhaps this will work (it'd help to have a sample DocBook source), saved
>> as removeXSpans.lua and used as above:
>>
>>     function Header(hd)
>>       local id = ''
>>       hd.content = hd.content:walk {
>>         Span = function(el)
>>           if #el.content == 0 and el.identifier:match('^x%d') then
>>             id = el.identifier
>>             return pandoc.Space()
>>           end
>>         end
>>       }
>>       print(id)
>>       if id ~= '' then
>>         hd.identifier = id
>>         return hd
>>       end
>>     end
>>
>> On Wednesday, April 5, 2023 at 11:38:07 PM UTC+1 hcf wrote:
>>
>>> I'm converting from DocBook to Markdown.
>>>
>>> In DocBook there are xml:id attributes. When I convert to Markdown,
>>> these are rendered as
>>>
>>>     []{#x1-10001}.
>>>
>>> A Markdown heading looks like this when converting from DocBook:
>>>
>>>     # 1[]{#x1-10001}Introduction
>>>
>>> Is there a way to turn this off?
>>>
>>> best regards
>>>
>>> hcf
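The id test that both quoted filters rely on can be tried outside pandoc. The sketch below is plain Lua and is not part of the original thread: `is_generated_anchor` and `replacement_for` are hypothetical helper names, and every sample id other than x1-10001 is made up for illustration. It is meant to show that `^x%d` only matches ids that begin with x immediately followed by a digit, and how the `cond and a or b` idiom in the Span filter chooses its return value.

```lua
-- Hypothetical helper (not from the thread): mirrors the test used in the
-- filters, i.e. "no content, and an id that starts with x plus a digit".
local function is_generated_anchor(identifier, content_length)
  -- string.match returns the matched text, or nil when nothing matches
  return content_length == 0 and identifier:match('^x%d') ~= nil
end

-- Same shape as the Span filter's return line: `cond and a or b` yields `a`
-- when cond is truthy (and `a` itself is truthy), otherwise `b`.
-- Strings stand in for pandoc.Space() and the unchanged element.
local function replacement_for(identifier, content_length)
  return is_generated_anchor(identifier, content_length) and 'space' or 'unchanged'
end

print(replacement_for('x1-10001', 0))    -- id from the thread, empty span
print(replacement_for('x1-10001', 3))    -- same id, but the span has content
print(replacement_for('xref-intro', 0))  -- made-up id: x not followed by a digit
print(replacement_for('sec-x1', 0))      -- made-up id: x1 is not at the start
```

Only the first call selects the replacement. In the real filter the two outcomes are `pandoc.Space()` and the original element `el`, both of which are truthy values, which is why the `and ... or` idiom is safe there.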