From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: Approach to converting large, custom, LaTeX document to restructured text
Date: Fri, 11 Sep 2020 08:07:24 -0700
To: Jeremy Conlin, pandoc-discuss

Sorry, the messages aren't always that helpful. But you can
try e.g. creating a document with the part up to, say, line
1800, then adding stuff gradually; this often tells you
what the problem is.

Jeremy Conlin writes:

> Thank you for your response, John.
>
> Upon closer inspection, I think my initial assumptions were incorrect. I
> thought pandoc had found a command/environment that it didn't understand,
> but now it seems more obscure.
>
> I ran pandoc with this command: "pandoc File.tex -t json --verbose" and
> got the following output:
>
> ```
> (lots of messages about Skipped and Parsing unescaped '&')
> [INFO] Skipped '\bottomrule' at line 1849 column 16
> [INFO] Skipped '\begin{tabular}' at line 1823 column 18
> [INFO] Skipped '\end{tabular}' at line 1850 column 16
> [INFO] Skipped '\subexperiment{SAP}' at line 1854 column 20
>
> Error at "source" (line 1855, column 12):
> unexpected [
> Additional details are found in the following paragraphs.
>            ^
> ```
> The caret should point to the d in details.
>
> So I'm not sure why pandoc found what it thought was an "unexpected [". I
> couldn't find a bracket in the preceding few dozen lines, but I did find
> one in the few lines afterwards. Does the message mean something obscure?
>
> Thanks for your help.
> Jeremy
>
> $ pandoc --version
> pandoc 2.10
> Compiled with pandoc-types 1.21, texmath 0.12.0.2, skylighting 0.8.5
> Default user data directory: /Users/jlconlin/.local/share/pandoc or
> /Users/jlconlin/.pandoc
> Copyright (C) 2006-2020 John MacFarlane
> Web: https://pandoc.org
> This is free software; see the source for copying conditions.
> There is no warranty, not even for merchantability or fitness
> for a particular purpose.
>
>
> On Thursday, September 10, 2020 at 6:50:28 PM UTC-6 John MacFarlane wrote:
>
>>
>> It really depends on the details of the document, but if
>> pandoc is struggling with certain commands and environments,
>> one approach is to define custom macros for those, which
>> convert them into something pandoc can handle.
>>
>> (In a few cases you might get away with just putting the .sty
>> file in the working directory, so pandoc tries to parse it,
>> but pandoc usually can't handle the lower-level tex definitions
>> style files have, so this usually doesn't work.)
>>
>> For example, if you have a foobar command, just
>> add this to your document
>>
>> \renewcommand{\foobar}[2]{limit yourself
>> here to stuff pandoc can handle}
>>
>> You can often get pretty far with this method.
>>
>> Jeremy Conlin writes:
>>
>> > I have a large (900 page) LaTeX document (broken up into several LaTeX
>> > files) that I want to convert into restructured text. I've already tried to
>> > use pandoc to convert some of the files and it has failed for a few
>> > reasons.
>> >
>> > I'm a new pandoc user, but I figure I'm going to have to write my own
>> > converter. Before I do, I wanted to ask this forum what the right way to
>> > approach the conversion is. I was planning on reading everything into Python,
>> > doing my own search/replace and then passing the result on to pandoc. I would
>> > then rinse/repeat until I have everything the way I want it.
>> >
>> > I know there are filters and such that I can write to customize things, but
>> > (as a beginner) I'm not sure if it would be easier to learn pandoc syntax
>> > and write my own filter, or just go at it in Python as I described above.
>> >
>> > I don't mind doing it either way; I think it might be a fun side project to
>> > do when I'm procrastinating doing what I really should be doing.
>> >
>> > Please advise on what is the right approach. I'm sure there are other
>> > approaches too that I'm not aware of. I'm open to suggestions.
>> >
>> > Thanks,
>> > Jeremy
>> >
>> > --
>> > You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
>> > To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discus...-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>> > To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/9c40cd2c-9874-446b-8772-c8a99e377acan%40googlegroups.com.
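
The suggestion at the top of this message (start from the part of the
document up to about line 1800, then add material gradually) could be
automated roughly as follows. This is only a sketch under some assumptions:
the helper names pandoc_ok and first_failing_chunk, and the start/step
parameters, are made up for illustration and are not part of pandoc; pandoc
is assumed to be on the PATH; and a prefix cut in the middle of an
environment may fail for reasons unrelated to the real problem, so the
result should be treated as a hint.

```python
import subprocess
from typing import Callable, List, Optional, Tuple


def pandoc_ok(latex: str) -> bool:
    """Return True if pandoc exits successfully on the given LaTeX text.

    Assumes a `pandoc` executable is available on the PATH.
    """
    result = subprocess.run(
        ["pandoc", "-f", "latex", "-t", "json"],
        input=latex,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def first_failing_chunk(
    lines: List[str],
    start: int,
    step: int,
    converts_ok: Callable[[str], bool] = pandoc_ok,
) -> Optional[Tuple[int, int]]:
    """Grow the document from `start` lines in `step`-line increments.

    Returns (last_good, first_bad): the largest line count tried that
    still converted and the first line count that failed. Returns None
    if the whole document converts.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    good = min(start, len(lines))
    if not converts_ok("".join(lines[:good])):
        raise ValueError("the starting prefix already fails; pick a smaller start")
    while good < len(lines):
        trial = min(good + step, len(lines))
        if not converts_ok("".join(lines[:trial])):
            return good, trial
        good = trial
    return None
```

For the file discussed above, one might read the lines with
open("File.tex").readlines() and call first_failing_chunk(lines, start=1800,
step=10), then repeat with a smaller step inside the reported range to
narrow it down further.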