From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.text.pandoc/32687
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: J <lixichen-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Newsgroups: gmane.text.pandoc
Subject: Re: markdown - pdf two way sync
Date: Tue, 23 May 2023 01:50:13 -0700 (PDT)
Message-ID: <7ede943a-4651-480d-9d87-45eb0cb1f19bn@googlegroups.com>
References: <aeb78b6e-ed06-4d86-ac4b-0a6d7385fbf7@googlegroups.com>
 <20160503183852.GA21146@protagoras.berkeley.edu>
 <d6c717a4-11c4-47f7-88b7-37ff0f9b394f@googlegroups.com>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Mime-Version: 1.0
Content-Type: multipart/mixed; 
	boundary="----=_Part_872_1270724188.1684831813613"
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
	logging-data="23598"; mail-complaints-to="usenet@ciao.gmane.io"
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Original-X-From: pandoc-discuss+bncBDMYDKFOZAOBBRX4WGRQMGQED7ICVMQ-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Tue May 23 10:50:18 2023
Return-path: <pandoc-discuss+bncBDMYDKFOZAOBBRX4WGRQMGQED7ICVMQ-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Envelope-to: gtp-pandoc-discuss@m.gmane-mx.org
Original-Received: from mail-oo1-f61.google.com ([209.85.161.61])
	by ciao.gmane.io with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128)
	(Exim 4.92)
	(envelope-from <pandoc-discuss+bncBDMYDKFOZAOBBRX4WGRQMGQED7ICVMQ-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>)
	id 1q1Nik-0005t1-25
	for gtp-pandoc-discuss@m.gmane-mx.org; Tue, 23 May 2023 10:50:18 +0200
Original-Received: by mail-oo1-f61.google.com with SMTP id 006d021491bc7-5525f2a1f0csf3371167eaf.0
        for <gtp-pandoc-discuss@m.gmane-mx.org>; Tue, 23 May 2023 01:50:17 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=googlegroups.com; s=20221208; t=1684831817; x=1687423817;
        h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post
         :list-id:mailing-list:precedence:reply-to:x-original-sender
         :mime-version:subject:references:in-reply-to:message-id:to:from:date
         :sender:from:to:cc:subject:date:message-id:reply-to;
        bh=hoHefVl/b75lC8UXL8owD7/vq3JDkwXiuqxJAsO4OTA=;
        b=j71bOc4X4TdJGESZjQJo1GhB+wM7p5P2jDM9Y4Cyx7gvzauxlZ98TDmKKR+O/8RJ9N
         43bsP2Zce/AzU6URBccY5KoQ2LOJwJ89KDvsemi/CW3pBhDbY5BqQ3GHM4HKmvfJQf7E
         16rtzlgwxSZf1QLY+Udr/uDCnGuCZkQ2Q1iiGy1iIpbyHF6u13qZ387HFfcK3vqroOhj
         lCvv8cBF529APx1smloAqYdYtoE2Yot9RAHoEFxHHrMwX0WDXWSkMBRvpeqQBD99n1Lx
         ZNot0wdh9NNHeeAcUcGUyRHCdDv6dQ2TGLDBZdcBf4N0ZhBoT/ri6D+jbO0sYf/dYPRu
         sZhQ==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20221208; t=1684831817; x=1687423817;
        h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post
         :list-id:mailing-list:precedence:reply-to:x-original-sender
         :mime-version:subject:references:in-reply-to:message-id:to:from:date
         :from:to:cc:subject:date:message-id:reply-to;
        bh=hoHefVl/b75lC8UXL8owD7/vq3JDkwXiuqxJAsO4OTA=;
        b=fhnMvQ3qSVbsmZQ1+j9Oh7QAa8Jxs5TjfIWxupNCsvTgnH/7JPeAnpHvwwC3icP0bY
         7Ta1ejz82vZz/TAVvECuq6P4i0LubXHmjJQ9pexhDGIFQN20tvIZ1ZmJKvt8TYm+ZeOo
         cA3YTjE4SMWkwQxIS0D9gdnsZvGBnFbdoynven23SUvOm+oPcHHDbBaQjRJy/RMXJGQS
         TkSPkP8aFF35eCdFGoRro4UlTZuvxGa2YuFDxZ9WKVe/OQB0L9QFHtHpbxh9yAwXry+X
         VIEnnegqH863bfGXvkXGSHaXJJjIjZyb8CSSSgfu3qQS7BIGF7YsrkknhgOrVgrLzUVT
         7kIg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20221208; t=1684831817; x=1687423817;
        h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post
         :x-spam-checked-in-group:list-id:mailing-list:precedence:reply-to
         :x-original-sender:mime-version:subject:references:in-reply-to
         :message-id:to:from:date:x-beenthere:x-gm-message-state:sender:from
         :to:cc:subject:date:message-id:reply-to;
        bh=hoHefVl/b75lC8UXL8owD7/vq3JDkwXiuqxJAsO4OTA=;
        b=bekBY7MwgD9uSN02QlSi6Fv1XUgeEAIwCncMjeQ4Sjit0PtKSkMMe1Q1ziLFNUIO9O
         9Y0cax/f7M7+0XTGBGkYAdF5xuHzJwOU+1UaI5w9DDz6GlkCVZk6hwvvrk9BOweBF1Wh
         Ua2v7mN6w4UrdftgHK9vtLYRwbdV7TSUMTW4ghj3yRESn9UBfyfTBdYgSQn8JOgsQDnj
         NSXuMl3varKvcDqT+WVWDr07N3MDX25oravCywFvjDWWIf4EHdpCbjghe7hP58C3I63j
         XjOPQX7ShzCRLX+V2dxcRXRGCKIWpOofV1PVPLtFMB/ueTsmii 
Original-Sender: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
X-Gm-Message-State: AC+VfDz1cDarDJQ1C1NZGoAGQgdLYRUHp5YkfEh1nS/rmdGoFE2K/qJn
	+vUP5QHyRC4DxXYNvOPsJs0=
X-Google-Smtp-Source: ACHHUZ5W+i38IPW4uClHuzVaMuEPdML/gh8cxYf6sDOitCvsalFGUU8wPEhIgxE176tL02g+a8l/DQ==
X-Received: by 2002:a05:6871:b25:b0:187:7f29:c1 with SMTP id fq37-20020a0568710b2500b001877f2900c1mr5893000oab.0.1684831816888;
        Tue, 23 May 2023 01:50:16 -0700 (PDT)
X-BeenThere: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Original-Received: by 2002:a05:6871:6a94:b0:19a:c3ee:b136 with SMTP id
 zf20-20020a0568716a9400b0019ac3eeb136ls25766oab.1.-pod-prod-02-us; Tue, 23
 May 2023 01:50:14 -0700 (PDT)
X-Received: by 2002:a05:6830:613:b0:6a6:8b7:d48 with SMTP id w19-20020a056830061300b006a608b70d48mr3172444oti.7.1684831814188;
        Tue, 23 May 2023 01:50:14 -0700 (PDT)
In-Reply-To: <d6c717a4-11c4-47f7-88b7-37ff0f9b394f-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
X-Original-Sender: lixichen-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
Precedence: list
Mailing-list: list pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org; contact pandoc-discuss+owners-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
List-ID: <pandoc-discuss.googlegroups.com>
X-Google-Group-Id: 1007024079513
List-Post: <https://groups.google.com/group/pandoc-discuss/post>, <mailto:pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
List-Help: <https://groups.google.com/support/>, <mailto:pandoc-discuss+help-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
List-Archive: <https://groups.google.com/group/pandoc-discuss>
List-Subscribe: <https://groups.google.com/group/pandoc-discuss/subscribe>, <mailto:pandoc-discuss+subscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
List-Unsubscribe: <mailto:googlegroups-manage+1007024079513+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>,
 <https://groups.google.com/group/pandoc-discuss/subscribe>
Xref: news.gmane.io gmane.text.pandoc:32687
Archived-At: <http://permalink.gmane.org/gmane.text.pandoc/32687>

------=_Part_872_1270724188.1684831813613
Content-Type: multipart/alternative; 
	boundary="----=_Part_873_1749358326.1684831813613"

------=_Part_873_1749358326.1684831813613
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Is it possible nowadays to sync between Markdown and PDF, or between
Markdown and docx? Many thanks!

On Saturday, May 14, 2016 at 11:50:25 PM UTC+8 BH wrote:

> On Tuesday, May 3, 2016 at 2:39:05 PM UTC-4, John MacFarlane wrote:
>>
>> Pandoc doesn't store any source-mapping information in
>> the AST.
>>
>
> This is, of course, right, but two-way sync (in both directions) is
> possible nonetheless (albeit in a way that's quite fragile). Here's the
> rough idea (both requiring that pdfsync be used when compiling the .pdf
> file):
>
> 1. For backward search (from .pdf to .md): have the .pdf viewer send the
> source (.tex) file and line number to a script that (a) reads in the
> relevant line from the .tex file, (b) extracts a reasonable-sized chunk
> of text, (c) searches for that text in the corresponding markdown file,
> and finally (d) sends the line number of that text to the text editor.
>
> 2. For forward search (from .md to .pdf) do the opposite: have the text
> editor send the source (.md) file and line number to a script that (a)
> reads in the relevant line from the .md file, (b) extracts a
> reasonable-sized chunk of text, (c) searches for that text in the
> corresponding .tex file, and finally (d) sends the line number of that
> text to the .pdf viewer.
>
> Clearly, a problem lies in step (b) in both cases: how do you pick the
> text to search for in the corresponding file? What I've done differs
> between (1) and (2).
>
> In (2), I find it easiest to take the whole line of markdown, strip off
> any initial markdown codes (such as those for enumerated lists), run it
> through pandoc (using exactly the same options I use to generate the
> .pdf in the first place) to convert to LaTeX, and then search for this
> in the .tex file.
>
> In (1), I find I need to use a different strategy, since .tex -> .md
> conversion in pandoc often fails to produce a match in the .md file.
> (This is partly because of the way I have extended markdown using some
> filters.) So here I try to locate a stretch of text in the .tex file
> that does not contain any LaTeX commands as follows: I try searching for
> the first occurrence of '\' in the relevant line, and if this occurs
> deep enough into the text, I grab text from the beginning of the line to
> that point. If not, I look for a stretch of text after that '\' that
> occurs between '{' and '}'. Usually this is good enough to find a unique
> match in the markdown file.
>
> As I said, this is fragile: it won't work in every case, but for me it
> works about 90-95% of the time, which is good enough for my purposes.
>
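For reference, the chunk-extraction heuristic described for backward
search in (1) above might be sketched in Python roughly as follows. This
is only an illustration under stated assumptions: the function names and
the 20-character threshold for "deep enough" are hypothetical and are not
taken from BH's actual script.

```python
def extract_chunk(tex_line, min_chars=20):
    """Pick a command-free stretch of text from one line of a .tex file.

    min_chars is an assumed threshold for "deep enough into the text".
    Returns None when no usable text is found.
    """
    line = tex_line.strip()
    slash = line.find("\\")
    if slash == -1:
        # No LaTeX command on this line: the whole line is usable.
        return line or None
    if slash >= min_chars:
        # The first command starts late enough: use the text before it.
        return line[:slash].strip()
    # Otherwise fall back to the first {...} group after the backslash.
    # Nested braces are not handled; this is as fragile as the original idea.
    open_brace = line.find("{", slash)
    close_brace = line.find("}", open_brace + 1) if open_brace != -1 else -1
    if open_brace != -1 and close_brace != -1:
        return line[open_brace + 1:close_brace].strip() or None
    return None


def find_unique_line(chunk, md_lines):
    """Return the 1-based line number of the only line containing chunk.

    Returns None if the chunk is missing or matches more than one line.
    """
    hits = [n for n, text in enumerate(md_lines, start=1) if chunk in text]
    return hits[0] if len(hits) == 1 else None
```

A script called by the .pdf viewer would pass the .tex line to
extract_chunk, look the result up with find_unique_line, and hand the
resulting line number to the text editor.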

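The forward-search approach in (2) above could be sketched along the same
lines. Again this is a hedged sketch, not BH's script: the helper names
and the marker-stripping regular expression are assumptions, and
pandoc_args stands in for whatever options are used to build the .pdf.
Only the first line of pandoc's output is searched for, since pandoc may
wrap long output across several lines.

```python
import re
import subprocess

# Assumed pattern for "initial markdown codes": bullets, numbered-list
# markers, heading hashes and blockquote signs at the start of a line.
LEADING_MARKUP = re.compile(r"^\s*(?:[-*+]|\d+[.)]|#+|>)\s+")


def strip_leading_markup(line):
    """Drop one initial list, heading, or blockquote marker, if present."""
    return LEADING_MARKUP.sub("", line, count=1).strip()


def md_to_latex(text, pandoc_args=()):
    """Convert a Markdown fragment to LaTeX by calling the pandoc CLI."""
    result = subprocess.run(
        ["pandoc", "-f", "markdown", "-t", "latex", *pandoc_args],
        input=text, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def forward_search(md_line, tex_lines, convert=md_to_latex):
    """Return the 1-based .tex line matching the given Markdown line.

    convert is injectable so the lookup can be tried without pandoc.
    Returns None when the converted text is not found.
    """
    fragment = convert(strip_leading_markup(md_line))
    if not fragment:
        return None
    needle = fragment.splitlines()[0]
    for number, text in enumerate(tex_lines, start=1):
        if needle in text:
            return number
    return None
```

The text editor would call forward_search with the current line and the
lines of the .tex file, then pass the returned line number on to the .pdf
viewer.
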
--=20
You received this message because you are subscribed to the Google Groups
"pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit
https://groups.google.com/d/msgid/pandoc-discuss/7ede943a-4651-480d-9d87-45eb0cb1f19bn%40googlegroups.com.

------=_Part_873_1749358326.1684831813613--

------=_Part_872_1270724188.1684831813613--