From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.text.pandoc/30490
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Kris Wilk <kris-AwXHIjbJCMCw5LPnMra/2Q@public.gmane.org>
Newsgroups: gmane.text.pandoc
Subject: RTF to Markdown questions
Date: Wed, 27 Apr 2022 11:02:20 -0700 (PDT)
Message-ID: <aecd40a2-09db-4e1b-96ad-752973375e0cn@googlegroups.com>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Mime-Version: 1.0
Content-Type: multipart/mixed; 
	boundary="----=_Part_1516_1231300688.1651082540493"
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
	logging-data="20022"; mail-complaints-to="usenet@ciao.gmane.io"
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Original-X-From: pandoc-discuss+bncBCC5D7WV5UIRBL4KU2JQMGQEFJE6O2I-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Wed Apr 27 20:02:26 2022
Return-path: <pandoc-discuss+bncBCC5D7WV5UIRBL4KU2JQMGQEFJE6O2I-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Envelope-to: gtp-pandoc-discuss@m.gmane-mx.org
Original-Received: from mail-oi1-f189.google.com ([209.85.167.189])
	by ciao.gmane.io with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128)
	(Exim 4.92)
	(envelope-from <pandoc-discuss+bncBCC5D7WV5UIRBL4KU2JQMGQEFJE6O2I-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>)
	id 1njlzd-00052c-Vq
	for gtp-pandoc-discuss@m.gmane-mx.org; Wed, 27 Apr 2022 20:02:26 +0200
Original-Received: by mail-oi1-f189.google.com with SMTP id o8-20020acad708000000b00322487ea641sf1273482oig.7
        for <gtp-pandoc-discuss@m.gmane-mx.org>; Wed, 27 Apr 2022 11:02:25 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=googlegroups.com; s=20210112;
        h=sender:date:from:to:message-id:subject:mime-version
         :x-original-sender:reply-to:precedence:mailing-list:list-id
         :list-post:list-help:list-archive:list-subscribe:list-unsubscribe;
        bh=BAq1Jocl+dd0dhlAMNMQXOEHbeXNAscNNbwtxyDG28w=;
        b=rYBlOM8QTJIcqWiQDjksJftdI0LqyWK/h44VF4A/C9cbs/SV/u+9UR668bKVL5/ptf
         L+YkmNzod5spqOT9yKNibScEh4fmISHaZ3NJSvPR0xJyNwKcm7SSxDH61CGy0fz3Z3xd
         x//mNPxYkve9k/43/0rU/riAh9sQ4t4275PK8cENdt/W1Y42HKAy9wwSY3yc7s6BdzA7
         3Iu1V2WO9IhRd2mZB+Tz/3zos8iwgKph/gTo68fBxvMsY8tkUB5Blx9ykS030DFOEulV
         pybskLAFIT0HSgZqBJltHKhbMebt5gjanz9T4BvpViXpsWwhCsnulGgY+3lqWodd/8QZ
         DO6w==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=reefnet.ca; s=google;
        h=date:from:to:message-id:subject:mime-version:x-original-sender
         :reply-to:precedence:mailing-list:list-id:list-post:list-help
         :list-archive:list-subscribe:list-unsubscribe;
        bh=BAq1Jocl+dd0dhlAMNMQXOEHbeXNAscNNbwtxyDG28w=;
        b=bieRt1Q1PTHfXRswjRScBTHQQcWL/GA7O8hqymuKoRB5ip/k63ToiOag+VvPj83QVg
         a/ns00vnCfQRjEj2nbmV5T/g8zAZIkcx8FIWdlp+33IWpBRcBXeVzYdi41lY21a2hsi8
         cVxmHGs40k1k20dkbVBMTN5g1D97z/Gm1NdKA=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=sender:x-gm-message-state:date:from:to:message-id:subject
         :mime-version:x-original-sender:reply-to:precedence:mailing-list
         :list-id:x-spam-checked-in-group:list-post:list-help:list-archive
         :list-subscribe:list-unsubscribe;
        bh=BAq1Jocl+dd0dhlAMNMQXOEHbeXNAscNNbwtxyDG28w=;
        b=PtxzCl+6PWnM8oOv69YO8RHsoC5iCt6GMIdm6TJMXIRlocqxKvSwKH3L3ktaaMEfW3
         3+L0JqhX4k5QfdLVU1SZFjG87LTv6NfqP2Esmh07Ty6WTfBxrEFktMnsmNGrYA2LpLE5
         q8/jr9Q8xvuj5OHK5+c2SKItmL4V8d3Lb23ZedseIzQ9WEPnGHb05z2c8VbiOfsj4Yga
         FjsyNuJAr6eIzhOFkwcZW9Mw0+z+uNsz74+IYOJfXuWWAl5fCrEoBgh6RbGBpPUV6DY1
         +JyHdkSr0Ov4ER1nLJ9EQdQ4X8P2tI/KZntDXBP9UAkSj1UdE27wuSrd5HI8FnKzRk7l
         2Esw==
Original-Sender: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
X-Gm-Message-State: AOAM5305TUvCovX9yABZoJkLXqH40mZwtgf4hQlG6J8nIuSFeIO2iizo
	D3/jBecUIGbQeytb7Q/UAlo=
X-Google-Smtp-Source: ABdhPJwshjL/SQz0NUugcALtfyP3aj08gtXJb+UvATJXiTvZaNFNkeoN6uYwlAkkYBVnFlqoAvN+WA==
X-Received: by 2002:a05:6808:20a8:b0:322:9f53:2003 with SMTP id s40-20020a05680820a800b003229f532003mr13517441oiw.243.1651082544929;
        Wed, 27 Apr 2022 11:02:24 -0700 (PDT)
X-BeenThere: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Original-Received: by 2002:a05:6870:a1a0:b0:d7:1d2b:ec1a with SMTP id
 a32-20020a056870a1a000b000d71d2bec1als6474713oaf.3.gmail; Wed, 27 Apr 2022
 11:02:21 -0700 (PDT)
X-Received: by 2002:a05:6870:42c7:b0:e9:11d2:b259 with SMTP id z7-20020a05687042c700b000e911d2b259mr10440667oah.272.1651082541107;
        Wed, 27 Apr 2022 11:02:21 -0700 (PDT)
X-Original-Sender: kris-AwXHIjbJCMCw5LPnMra/2Q@public.gmane.org
Precedence: list
Mailing-list: list pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org; contact pandoc-discuss+owners-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
List-ID: <pandoc-discuss.googlegroups.com>
X-Google-Group-Id: 1007024079513
List-Post: <https://groups.google.com/group/pandoc-discuss/post>, <mailto:pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
List-Help: <https://groups.google.com/support/>, <mailto:pandoc-discuss+help-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
List-Archive: <https://groups.google.com/group/pandoc-discuss
List-Subscribe: <https://groups.google.com/group/pandoc-discuss/subscribe>, <mailto:pandoc-discuss+subscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
List-Unsubscribe: <mailto:googlegroups-manage+1007024079513+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>,
 <https://groups.google.com/group/pandoc-discuss/subscribe>
Xref: news.gmane.io gmane.text.pandoc:30490
Archived-At: <http://permalink.gmane.org/gmane.text.pandoc/30490>

------=_Part_1516_1231300688.1651082540493
Content-Type: multipart/alternative; 
	boundary="----=_Part_1517_1879203557.1651082540493"

------=_Part_1517_1879203557.1651082540493
Content-Type: text/plain; charset="UTF-8"

Sorry if anyone gets this twice, had to correct my formatting...

I'm trying to use pandoc (for the first time) to convert some RTF files to 
markdown. My goal is to extract the text with **bold** and *italics* 
preserved and no other formatting.

Simply converting with "pandoc in.rtf -o out.md" produces a markdown file 
that's not quite what I need. For instance, here's a line from the output:

**[Scientific Name]{.underline}: ***Aplysia parvula *Morch, 1863

FIRST and foremost, pandoc tries to preserve the underlined text, which I 
don't want. Can this be disabled? I've tried the "bracketed_spans" and 
"native_spans" extensions, but pandoc still processes the underlines as:

**<u>Scientific Name</u>: ***Aplysia parvula *Morch, 1863
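
One possible way to drop the underlines, sketched here under assumptions rather than taken from the pandoc documentation: run the document through a small JSON filter that unwraps every Underline node in pandoc's JSON AST and keeps the text inside it. The function name `strip_underline` and the file name `strip_underline.py` are hypothetical, and the sketch assumes a pandoc version whose AST has an `Underline` inline type.

```python
import json
import sys


def strip_underline(node):
    """Return a copy of a pandoc JSON AST fragment without Underline wrappers."""
    if isinstance(node, list):
        out = []
        for item in node:
            if isinstance(item, dict) and item.get("t") == "Underline":
                # Splice the underlined inlines directly into the parent list.
                out.extend(strip_underline(item["c"]))
            else:
                out.append(strip_underline(item))
        return out
    if isinstance(node, dict):
        return {key: strip_underline(value) for key, value in node.items()}
    return node  # strings, numbers, None: nothing to unwrap


# pandoc passes the output format as the first argument when it runs a
# JSON filter, so the stdin/stdout wiring only runs in that situation.
if __name__ == "__main__" and len(sys.argv) > 1:
    json.dump(strip_underline(json.load(sys.stdin)), sys.stdout)
```

Assuming the file is saved as `strip_underline.py`, it could be tried with `pandoc in.rtf --filter ./strip_underline.py -o out.md`.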

SECOND, at least in VSCode's markdown preview, the bold and emphasis are 
not rendered correctly. I guess that's because the markers touch each 
other or have spaces inside them (or both?). It displays correctly if it's:

**Scientific Name:** *Aplysia parvula* Morch, 1863
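
The working line differs from the generated one only in where the spaces sit: a closing `**` or `*` that follows a space cannot close emphasis under CommonMark's flanking rules. A rough sketch of a JSON-filter step that moves leading and trailing spaces out of Strong and Emph nodes is shown below. The names `hoist_spaces`, `_trim`, `_is_space` and `_push` are hypothetical, and it is an assumption that the RTF reader leaves the stray space either as a `Space` node or inside a `Str`.

```python
import json
import sys


def _is_space(node):
    """True for Space/SoftBreak nodes and for Str nodes holding only whitespace."""
    kind = node.get("t")
    return kind in ("Space", "SoftBreak") or (kind == "Str" and not node["c"].strip())


def _trim(inlines):
    """Return (had_leading_space, core_inlines, had_trailing_space)."""
    core, lead, trail = list(inlines), False, False
    while core and _is_space(core[0]):
        core.pop(0)
        lead = True
    while core and _is_space(core[-1]):
        core.pop()
        trail = True
    if core and core[0].get("t") == "Str" and core[0]["c"][:1].isspace():
        core[0] = {"t": "Str", "c": core[0]["c"].lstrip()}
        lead = True
    if core and core[-1].get("t") == "Str" and core[-1]["c"][-1:].isspace():
        core[-1] = {"t": "Str", "c": core[-1]["c"].rstrip()}
        trail = True
    return lead, core, trail


def _push(out, node):
    """Append node, skipping a Space that would directly follow another Space."""
    is_plain_space = isinstance(node, dict) and node.get("t") == "Space"
    prev_is_space = bool(out) and isinstance(out[-1], dict) and out[-1].get("t") == "Space"
    if not (is_plain_space and prev_is_space):
        out.append(node)


def hoist_spaces(node):
    """Move spaces at the edges of Strong/Emph nodes to just outside them."""
    if isinstance(node, list):
        out = []
        for item in node:
            item = hoist_spaces(item)  # handle nested emphasis first
            if isinstance(item, dict) and item.get("t") in ("Strong", "Emph"):
                lead, core, trail = _trim(item["c"])
                if lead:
                    _push(out, {"t": "Space"})
                if core:  # drop emphasis that contained only whitespace
                    _push(out, {"t": item["t"], "c": core})
                if trail:
                    _push(out, {"t": "Space"})
            else:
                _push(out, item)
        return out
    if isinstance(node, dict):
        return {key: hoist_spaces(value) for key, value in node.items()}
    return node


# pandoc passes the output format as the first argument to a JSON filter.
if __name__ == "__main__" and len(sys.argv) > 1:
    json.dump(hoist_spaces(json.load(sys.stdin)), sys.stdout)
```

Because it is an ordinary filter, the same script could in principle be applied to each of the files in a shell loop.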

I realize that the text in the RTF might have the bold/italic tagged 
weirdly, but is there a way to deal with this, or am I just stuck? I have 
about 500 such files to process, so I'm looking for automated methods.

Thanks in advance for any help you can provide!

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/aecd40a2-09db-4e1b-96ad-752973375e0cn%40googlegroups.com.

------=_Part_1517_1879203557.1651082540493--

------=_Part_1516_1231300688.1651082540493--