From: J
Newsgroups: gmane.text.pandoc
Subject: Re: markdown - pdf two way sync
Date: Tue, 23 May 2023 01:50:13 -0700 (PDT)
Message-ID: <7ede943a-4651-480d-9d87-45eb0cb1f19bn@googlegroups.com>
References: <20160503183852.GA21146@protagoras.berkeley.edu>
To: pandoc-discuss

Is it possible to sync between markdown and pdf, or between markdown and docx, nowadays? Many thanks!

On Saturday, May 14, 2016 at 11:50:25 PM UTC+8 BH wrote:
On Tuesday, May 3, 2016 at 2:39:05 PM UTC-4, John MacFarlane wrote:
> Pandoc doesn't store any source-mapping information in the AST.

This is, of course, right, but two-way sync (in both directions) is possible nonetheless (albeit in a way that's quite fragile). Here's the rough idea (both requiring that pdfsync be used when compiling the .pdf file):

1. For backward search (from .pdf to .md): have the .pdf viewer send the source (.tex) file and line number to a script that (a) reads in the relevant line from the .tex file, (b) extracts a reasonable-sized chunk of text, (c) searches for that text in the corresponding markdown file, and finally (d) sends the line number of that text to the text editor.

2. For forward search (from .md to .pdf) do the opposite: have the text editor send the source (.md) file and line number to a script that (a) reads in the relevant line from the .md file, (b) extracts a reasonable-sized chunk of text, (c) searches for that text in the corresponding .tex file, and finally (d) sends the line number of that text to the .pdf viewer.
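
Steps (a) to (c) have the same shape in both directions, so they can be sketched once. The Python below is only an illustration under assumptions: the names `find_line` and `sync_line` are hypothetical, matching is a plain whitespace-normalised substring test, and step (d), handing the line number to the editor or viewer, is left out because it depends on the tools in use.

```python
from typing import Callable, Optional, Sequence


def find_line(chunk: str, target_lines: Sequence[str]) -> Optional[int]:
    """Step (c): return the 1-based number of the first line containing chunk."""
    needle = " ".join(chunk.split())  # collapse runs of whitespace
    if not needle:
        return None
    for number, line in enumerate(target_lines, start=1):
        if needle in " ".join(line.split()):
            return number
    return None


def sync_line(
    source_lines: Sequence[str],
    line_number: int,
    target_lines: Sequence[str],
    extract_chunk: Callable[[str], str],
) -> Optional[int]:
    """Run steps (a)-(c) for one 1-based line number of the source file."""
    source_line = source_lines[line_number - 1]  # (a) read the relevant line
    chunk = extract_chunk(source_line)           # (b) pick the text to search for
    return find_line(chunk, target_lines)        # (c) locate it in the other file
```

The `extract_chunk` argument is where the two directions differ; everything else is shared.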

Clearly, a problem lies in step (b) in both cases: how do you pick the text to search for in the corresponding file? What I've done differs between (1) and (2).

In (2), I find it easiest to take the whole line of markdown, strip off any initial markdown codes (such as those for enumerated lists), run it through pandoc (using exactly the same options I use to generate the .pdf in the first place) to convert to LaTeX, and then search for this in the .tex file.
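
A minimal Python sketch of that forward-search chunking, assuming pandoc is on the PATH. The function names and the regular expression for leading markup are hypothetical, not the script described above, and `extra_args` stands in for whatever options are used to build the .pdf.

```python
import re
import subprocess

# Leading block markup: blockquote ">", heading "#", bullets, "1." or "1)" numbers.
_LEADING_MARKUP = re.compile(r"^\s*(?:>\s*)*(?:#{1,6}\s+|[-*+]\s+|\d+[.)]\s+)?")


def strip_leading_markup(line: str) -> str:
    """Drop list, heading and blockquote markers from the start of a line."""
    return _LEADING_MARKUP.sub("", line, count=1).strip()


def pandoc_to_latex(text: str, extra_args=()) -> str:
    """Convert a markdown fragment to LaTeX by calling the pandoc executable."""
    result = subprocess.run(
        ["pandoc", "--from", "markdown", "--to", "latex", *extra_args],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def forward_chunk(md_line: str, convert=pandoc_to_latex) -> str:
    """Return the LaTeX text to look for in the .tex file."""
    return convert(strip_leading_markup(md_line))
```

Because pandoc may wrap long output lines, the converted text can span several lines of the .tex file; searching for only its first line, or comparing with whitespace normalised, is one way around that.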

In (1), I find I need to use a different strategy, since .tex -> .md conversion in pandoc often fails to produce a match in the .md file. (This is partly because of the way I have extended markdown using some filters.) So here I try to locate a stretch of text in the .tex file that does not contain any LaTeX commands as follows: I try searching for the first occurrence of '\' in the relevant line, and if this occurs deep enough into the text, I grab text from the beginning of the line to that point. If not, I look for a stretch of text after that '\' that occurs between '{' and '}'. Usually this is good enough to find a unique match in the markdown file.
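
The backward-search heuristic could look roughly like this in Python. It is a sketch, not the original script: the name `backward_chunk` is hypothetical, and the 20-character threshold for "deep enough" is an assumption, since no figure is given above.

```python
def backward_chunk(tex_line: str, min_length: int = 20) -> str:
    """Pick a command-free stretch of text from one line of the .tex file.

    min_length is an assumed threshold for "deep enough into the text".
    """
    line = tex_line.strip()
    slash = line.find("\\")
    if slash == -1:
        return line  # no LaTeX command on this line: use all of it
    if slash >= min_length:
        return line[:slash].strip()  # enough plain text before the command
    # Otherwise fall back to the first {...} group after the backslash.
    open_brace = line.find("{", slash)
    close_brace = line.find("}", open_brace + 1) if open_brace != -1 else -1
    if open_brace != -1 and close_brace != -1:
        return line[open_brace + 1:close_brace].strip()
    return ""  # nothing usable on this line
```

Nested commands such as `\textbf{\emph{x}}` still leave a backslash in the result, which is one of the ways this approach stays fragile.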

As I said, this is fragile: it won't work in every case, but for me it works about 90-95% of the time, which is good enough for my purposes.
