From: Julien Dutant
Newsgroups: gmane.text.pandoc
Subject: Re: XML-ID when converting to markdown
Date: Thu, 6 Apr 2023 07:20:48 -0700 (PDT)
To: pandoc-discuss

This filter replaces every empty Span whose id starts with x followed by a digit (such as x1-10001) with a space, which removes the []{#...} markers. Save it as "removeXSpans.lua":

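function Span(el)
  -- An empty span with an id like x1-10001 becomes a plain space;
  -- every other span is returned unchanged.
  return #el.content == 0 and el.identifier:match('^x%d') and pandoc.Space()
    or el
end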
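
To see which ids the pattern '^x%d' in the filter above accepts, here is a small sketch in plain Lua (no pandoc needed). The helper name is_generated_id and the sample ids other than x1-10001 are made up for illustration; they are not part of the filter.

```lua
-- Hypothetical helper (not part of the filter): true when an identifier
-- begins with "x" immediately followed by a digit, as in the filter's test.
local function is_generated_id(id)
  return id:match('^x%d') ~= nil
end

-- The id from the original question matches.
assert(is_generated_id('x1-10001'))
-- "x" followed by a letter does not match.
assert(not is_generated_id('xref-intro'))
-- The pattern is anchored, so an "x1" later in the id does not count.
assert(not is_generated_id('sec-x1'))
-- Spans without an id (empty string) are left alone.
assert(not is_generated_id(''))
```

So a hand-written id that merely begins with "x" should survive, as long as no digit follows the x.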

Then run pandoc with the `-L removeXSpans.lua` option, e.g.
pandoc -f docbook sourcefile -t markdown -o outfile.md -L removeXSpans.lua

Result:
# 1 Introduction

However, this will break any link to #x1-10001. If there are internal links in the doc (e.g. from the table of
contents) that you need to preserve, you need a filter that instead produces:
# 1 Introduction {#x1-10001}

Perhaps this will work (it'd help to have a sample docbook source). Save it as removeXSpans.lua and use it as above:

function Header(hd)
  local id = ''
  -- Replace the empty x-span inside the heading with a space and remember its id.
  hd.content = hd.content:walk {
    Span = function(el)
      if #el.content == 0 and el.identifier:match('^x%d') then
        id = el.identifier
        return pandoc.Space()
      end
    end
  }
  -- If an id was found, attach it to the heading itself.
  if id ~= '' then
    hd.identifier = id
    return hd
  end
end

On Wednesday, April 5, 2023 at 11:38:07 PM UTC+1 hcf wrote:

> I'm converting from DocBook to Markdown.
>
> In DocBook there are xml:id tags. When I convert to markdown these are rendered as
>
> []{#x1-10001}.
>
> A markdown heading looks like this when converting from DocBook.
>
> # 1[]{#x1-10001}Introduction
>
> Is there a way to turn this off?
>
> best regards
>
> hcf
