From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.text.pandoc/14953
Path: news.gmane.org!not-for-mail
From: BH <bewihelm-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Newsgroups: gmane.text.pandoc
Subject: Re: markdown - pdf two way sync
Date: Sat, 14 May 2016 08:50:25 -0700 (PDT)
Message-ID: <d6c717a4-11c4-47f7-88b7-37ff0f9b394f@googlegroups.com>
References: <aeb78b6e-ed06-4d86-ac4b-0a6d7385fbf7@googlegroups.com>
 <20160503183852.GA21146@protagoras.berkeley.edu>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: multipart/mixed; 
	boundary="----=_Part_2797_624272913.1463241025545"
X-Trace: ger.gmane.org 1463241029 23142 80.91.229.3 (14 May 2016 15:50:29 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Sat, 14 May 2016 15:50:29 +0000 (UTC)
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Original-X-From: pandoc-discuss+bncBDYNHQGSUICRBQUS3W4QKGQE2ADELLI-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Sat May 14 17:50:29 2016
Return-path: <pandoc-discuss+bncBDYNHQGSUICRBQUS3W4QKGQE2ADELLI-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Envelope-to: gtp-pandoc-discuss@m.gmane.org
Original-Received: from mail-ob0-f185.google.com ([209.85.214.185])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <pandoc-discuss+bncBDYNHQGSUICRBQUS3W4QKGQE2ADELLI-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>)
	id 1b1bpk-0007GM-1F
	for gtp-pandoc-discuss@m.gmane.org; Sat, 14 May 2016 17:50:28 +0200
Original-Received: by mail-ob0-f185.google.com with SMTP id n10sf17660215obb.0
        for <gtp-pandoc-discuss@m.gmane.org>; Sat, 14 May 2016 08:50:27 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=googlegroups.com; s=20120806;
        h=sender:date:from:to:message-id:in-reply-to:references:subject
         :mime-version:x-original-sender:reply-to:precedence:mailing-list
         :list-id:x-spam-checked-in-group:list-post:list-help:list-archive
         :list-subscribe:list-unsubscribe;
        bh=/E9YR0BAyaeuj1co2q4MRUteXybHSRyDrrdk8yAuNcM=;
        b=viIhLa4Jyuwxl9z9tTprp8QXt/f491qA4ytWZarLbqPnYosrf+IZ/oMyn5PKy2NVF0
         YWJ6RtChnz8yplCeQy7XvZpO/Xf+FokBSCr5kqHF7A5Ju9QaQ5dPmaFP8gyQHBHkFrTU
         CbtwucUe7m+HIgdkDBSWp3zFgKFCJU+ruBVM16ORj+aqDWUHcM22AGDzPkyprDYNKoKq
         OEBBLUTbAKQeZUCTIfA0I7bwBPPHDeioCxVAE0RW3YUQeo0NEUVlsxPaMO0Q3CKpZOUK
         jalSC/ImYoA0rhYzWCv+IOu4sOlCoL53ddN36Wv21aHnHeHM/5CTWskzIpEFhW0+WGEz
         2nZA==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=date:from:to:message-id:in-reply-to:references:subject:mime-version
         :x-original-sender:reply-to:precedence:mailing-list:list-id
         :x-spam-checked-in-group:list-post:list-help:list-archive
         :list-subscribe:list-unsubscribe;
        bh=/E9YR0BAyaeuj1co2q4MRUteXybHSRyDrrdk8yAuNcM=;
        b=x9RqlA2pXvSolI9FkznHMotFBOamGqK4FacAddnempjcy8xrrYI8pFGcDo9FIIUv4R
         XS+FQk7JXhodcNWmgr/p1XdnVrJgq3OYXkSFpSSUFHjzGUt2UAX2AkFDfk/6W+54EV/c
         cm1UBGeZ99S33zJ/b9PNValelsSgoaL8fSx380SRlfARyT8XuEyT5sbT2M+0/lDyP77w
         DsZBWqeVj3uDiqP1GHOmxacocm9lVsR9KGDmWL+TON3SxSS7ABjDEOJDR0GG1PfoucOv
         Z+wkWZ3pho6R5kwRZm1OiieQjK7Xr0UrB02uM2APzDYWVQdrL3rCkeaAwbHAoLky5UC4
         X9rQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20130820;
        h=sender:x-gm-message-state:date:from:to:message-id:in-reply-to
         :references:subject:mime-version:x-original-sender:reply-to
         :precedence:mailing-list:list-id:x-spam-checked-in-group:list-post
         :list-help:list-archive:list-subscribe:list-unsubscribe;
        bh=/E9YR0BAyaeuj1co2q4MRUteXybHSRyDrrdk8yAuNcM=;
        b=l4C+SsaOXOymy/Jk9Jxbkd+9VJYU/KerJz+/4Wv4LMWBCjzRcSkX2Qs2/hKHhVW1eD
         RWlNAmTp8v7mQZboOqeR4oXI1kREJiZsde66wNiM4vj6mrnXR29P6RkGMdpEPujrkNmn
         QcRSacIFbAfIx7lLWolJJJ+4zXgC1QqUtFOKOyPfrmPNQpQ8ufaqcWDIHnkIG3UXvSg0
         S6V+7A6uOJCy7L6fD2layAeAahvSo7y3RPx0HI7ZlZ/S4Udn3Jip4gtf4MdRLog/m05u
         JVZWNF0IIHA0sjaj9n/oI4NEiP1IfX0MIHNmtY1LKa4dZ/jRruB059kINEBEXlZgtMPO
         OBUA==
Original-Sender: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
X-Gm-Message-State: AOPr4FUhMv4H+cb5lm99Gg6uASr2ucwmu9dea3bgX4bKDTns6y8kwT36jRBOaTQeihMLkQ==
X-Received: by 10.157.5.180 with SMTP id 49mr252948otd.10.1463241027075;
        Sat, 14 May 2016 08:50:27 -0700 (PDT)
X-BeenThere: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Original-Received: by 10.157.8.248 with SMTP id 111ls1252253otf.51.gmail; Sat, 14 May
 2016 08:50:26 -0700 (PDT)
X-Received: by 10.157.61.8 with SMTP id a8mr252803otc.7.1463241026174;
        Sat, 14 May 2016 08:50:26 -0700 (PDT)
In-Reply-To: <20160503183852.GA21146-nFAEphtLEs/fysO+viCLMa55KtNWUUjk@public.gmane.org>
X-Original-Sender: bewihelm-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
Precedence: list
Mailing-list: list pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org; contact pandoc-discuss+owners-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
List-ID: <pandoc-discuss.googlegroups.com>
X-Spam-Checked-In-Group: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
X-Google-Group-Id: 1007024079513
List-Post: <https://groups.google.com/group/pandoc-discuss/post>, <mailto:pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
List-Help: <https://groups.google.com/support/>, <mailto:pandoc-discuss+help-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
List-Archive: <https://groups.google.com/group/pandoc-discuss
List-Subscribe: <https://groups.google.com/group/pandoc-discuss/subscribe>, <mailto:pandoc-discuss+subscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
List-Unsubscribe: <mailto:googlegroups-manage+1007024079513+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>,
 <https://groups.google.com/group/pandoc-discuss/subscribe>
Xref: news.gmane.org gmane.text.pandoc:14953
Archived-At: <http://permalink.gmane.org/gmane.text.pandoc/14953>

------=_Part_2797_624272913.1463241025545
Content-Type: multipart/alternative; 
	boundary="----=_Part_2798_1589762242.1463241025545"

------=_Part_2798_1589762242.1463241025545
Content-Type: text/plain; charset=UTF-8

On Tuesday, May 3, 2016 at 2:39:05 PM UTC-4, John MacFarlane wrote:
>
> Pandoc doesn't store any source-mapping information in 
> the AST. 
>

This is, of course, right, but two-way sync is possible nonetheless (albeit 
in a way that's quite fragile). Here's the rough idea (both directions 
require that pdfsync be used when compiling the .pdf file):

1. For backward search (from .pdf to .md): have the .pdf viewer send the 
source (.tex) file and line number to a script that (a) reads in the 
relevant line from the .tex file, (b) extracts a reasonable-sized chunk of 
text, (c) searches for that text in the corresponding markdown file, and 
finally (d) sends the line number of that text to the text editor.

2. For forward search (from .md to .pdf) do the opposite: have the text 
editor send the source (.md) file and line number to a script that (a) 
reads in the relevant line from the .md file, (b) extracts a 
reasonable-sized chunk of text, (c) searches for that text in the 
corresponding .tex file, and finally (d) sends the line number of that text 
to the .pdf viewer.
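
Steps (a) and (c) are the same in both directions, so they can be sketched 
once. This is only a minimal illustration in Python, not the script I 
actually use; the function names read_line and find_line are made up for 
the example, and it assumes UTF-8 files and that the chunk sits on a single 
line of the target file.

```python
from typing import Optional


def read_line(path: str, lineno: int) -> str:
    """Step (a): return line `lineno` (1-based) of the file at `path`."""
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            if i == lineno:
                return line.rstrip("\n")
    raise IndexError(f"{path} has no line {lineno}")


def find_line(path: str, chunk: str) -> Optional[int]:
    """Step (c): return the 1-based number of the first line that
    contains `chunk`, or None if no line does."""
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            if chunk in line:
                return i
    return None
```

Step (d) depends entirely on the editor and viewer in use, so it is left 
out of the sketch.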

Clearly, the hard part is step (b) in both cases: how do you choose the 
text to search for in the corresponding file? What I've done differs 
between (1) and (2).

In (2), I find it easiest to take the whole line of markdown, strip off any 
initial markdown codes (such as those for enumerated lists), run it through 
pandoc (using exactly the same options I use to generate the .pdf in the 
first place) to convert to LaTeX, and then search for this in the .tex file.
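
A rough sketch of that forward-search step (b), again in Python and again 
only an illustration: the regular expression for "initial markdown codes" 
is my guess at a reasonable set (bullets, numbered items, ATX headers, 
block quotes), and the helper names are hypothetical. It assumes pandoc is 
on the PATH; extra_args stands for whatever options are used to build the 
.pdf.

```python
import re
import subprocess

# Leading list bullets, numbered-list markers, ATX header marks and
# block-quote marks, possibly nested (e.g. "> 1. text").
_LEADING_MARKUP = re.compile(r"^\s*(?:(?:[-*+]|\d+[.)]|#{1,6}|>)\s+)*")


def strip_leading_markup(md_line: str) -> str:
    """Drop initial markdown codes so only the line's text is converted."""
    return _LEADING_MARKUP.sub("", md_line, count=1).rstrip()


def markdown_to_latex(md_fragment: str, extra_args=()) -> str:
    """Convert a markdown fragment to LaTeX by piping it through pandoc."""
    result = subprocess.run(
        ["pandoc", "-f", "markdown", "-t", "latex", *extra_args],
        input=md_fragment,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def forward_search_chunk(md_line: str, extra_args=()) -> str:
    """Return the LaTeX text to look for in the .tex file."""
    return markdown_to_latex(strip_leading_markup(md_line), extra_args)
```

Note that pandoc may wrap long lines in its output, so the converted text 
may need to be shortened or have its whitespace normalized before it is 
searched for in the .tex file.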

In (1), I find I need to use a different strategy, since .tex -> .md 
conversion in pandoc often fails to produce a match in the .md file. (This 
is partly because of the way I have extended markdown using some filters.) 
So here I try to locate a stretch of text in the .tex file that does not 
contain any LaTeX commands as follows: I try searching for the first 
occurrence of '\' in the relevant line, and if this occurs deep enough into 
the text, I grab text from the beginning of the line to that point. If not, 
I look for a stretch of text after that '\' that occurs between '{' and 
'}'. Usually this is good enough to find a unique match in the markdown 
file.
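
The heuristic just described might look roughly like this in Python. It is 
a sketch, not my actual script: the name plain_chunk is made up, and the 
min_len threshold of 20 characters is an arbitrary stand-in for "deep 
enough".

```python
import re
from typing import Optional


def plain_chunk(tex_line: str, min_len: int = 20) -> Optional[str]:
    """Pick a command-free stretch of text from one line of the .tex file.

    min_len is a hypothetical value for "deep enough into the text".
    """
    line = tex_line.strip()
    pos = line.find("\\")
    if pos == -1:
        # No LaTeX command at all: the whole line is usable.
        return line
    if pos >= min_len:
        # Enough plain text before the first command.
        return line[:pos].rstrip()
    # Otherwise take the first brace group after that backslash that
    # contains no nested braces or commands.
    match = re.search(r"\{([^{}\\]+)\}", line[pos:])
    return match.group(1).strip() if match else None
```

For a line such as \section{Two-way sync} this falls through to the brace 
search and yields "Two-way sync"; a line with no usable stretch gives None, 
which is one of the cases where the sync simply fails.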

As I said, this is fragile: it won't work in every case, but for me it 
works about 90-95% of the time, which is good enough for my purposes.


------=_Part_2798_1589762242.1463241025545--
------=_Part_2797_624272913.1463241025545--