From: Butch
Newsgroups: gmane.text.pandoc
Subject: Converting everything that’s inside a specific div (including other div) while excluding everything else
Date: Tue, 29 Sep 2020 13:45:27 -0700 (PDT)
To: pandoc-discuss

Hello,

I am trying to convert specific parts of an HTML file to Markdown. I want to convert everything that’s inside a specific div (including other divs) while excluding everything else. Is that possible?

Here is an example. I want to take this:

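<div class="show">
    <p>This is the outer text.</p>
    <div class="inner">
        <p>This is the inner text.</p>
    </div>
</div>
<div class="hide">
    <p>This is the hidden text.</p>
</div>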

And convert it so I have this:

::: {.show}
This is the outer text.

::: {.inner}
This is the inner text.
:::
:::

I.e., I want to convert everything that’s inside <div class="show"> (including other divs) and exclude everything else in the document.

If I use a filter like this:

function Div(el)
    if el.classes[1] == "show" then
        return el
    else
        return {}
    end
end

The resulting Markdown will be:

::: {.show}
This is the outer text.
:::

This is kind of expected, since the filter runs on the inner div too and discards it. So what can I do to include in the conversion not only <div class="show">, but also all the other divs inside it?

The actual HTML files I want to convert are very large, so I can’t list all the classes I want to include in (or exclude from) the conversion.

Thanks in advance.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/ee79a1ca-efb1-463c-ace9-5398c8e623e3n%40googlegroups.com.
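
A possible direction, offered as a sketch rather than a tested solution: the `Div` function in the message runs on every div in the document, nested ones included, so the inner div fails the class test and is replaced by an empty list. One way around that is to drop the per-element `Div` function and use a single `Pandoc` function that walks the block tree by hand, keeps every div carrying the `show` class untouched (children and all), and looks inside other divs only to search for more `show` divs. The helper names `has_class` and `collect` are made up for this illustration, and the sketch assumes the `show` divs sit at the top level or inside other divs; it does not look inside lists, block quotes, or tables.

```lua
-- Sketch of a pandoc Lua filter: keep only divs with class "show",
-- together with everything nested inside them.
-- `has_class` and `collect` are hypothetical helper names.

-- True if the element carries the given class in any position.
local function has_class(el, name)
  for _, class in ipairs(el.classes) do
    if class == name then
      return true
    end
  end
  return false
end

-- Walk a list of blocks. A "show" div is kept whole and not descended
-- into, so its nested divs survive. Other divs are only searched for
-- "show" divs further down. Non-div blocks are dropped.
local function collect(blocks, kept)
  for _, blk in ipairs(blocks) do
    if blk.t == "Div" then
      if has_class(blk, "show") then
        kept[#kept + 1] = blk
      else
        collect(blk.content, kept)
      end
    end
  end
  return kept
end

-- Called once for the whole document; replaces its blocks with the
-- collected "show" divs and leaves the metadata alone.
function Pandoc(doc)
  return pandoc.Pandoc(collect(doc.blocks, {}), doc.meta)
end
```

Saved as, say, `show-only.lua` (a hypothetical file name), it could be run with something like `pandoc input.html -f html -t markdown --lua-filter=show-only.lua -o output.md`. Unlike `el.classes[1] == "show"`, the `has_class` check also matches divs where `show` is not the first class.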