From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: pandoc.markdown to epub conversion took just under 4 hours on an average linux laptop
Date: Wed, 28 Oct 2020 17:04:27 -0700
References: <22d3d478-357d-464c-b407-aefd2ed81dccn@googlegroups.com> <824220b2-6c2e-4c60-a935-e908f573a3d7n@googlegroups.com>
In-Reply-To: <824220b2-6c2e-4c60-a935-e908f573a3d7n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To: "cjns...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org", pandoc-discuss

As I mentioned, --trace is the way to get an internal snapshot of
parsing -- at least at the block level. It sounds as if that did tell
you where the parser is getting stuck (it would be AFTER the last
traced block).

Putting raw tex blocks inside ```{=latex} ... ``` (the raw attribute
syntax) will help the parser in tricky cases, so you might try that.

"cjns...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" writes:

> Sorry for the confusion.... copy-pasted the wrong pandoc command. The one I
> actually used for this particular run that "took seconds" was:
>
> pandoc -o epub/test.epub md/title.txt md/* --css=css/stylesheet.css
> --epub-embed-font=fonts/* --epub-cover-image=images/cover.png -f
> commonmark_x
>
> And yes I did see (same as the raw latex stuff) the content of the
> title.txt file verbatim in the output.
>
> So basically in my use case this run of pandoc did little more than the
> cat command and format the output as an EPUB file.
>
> I have tons of script/regex-generated HTML and LaTeX code in this
> source so it has to be pandoc.markdown input.
>
> The odd thing is that I have been doing this for ages (even Vol. I of this
> same book which is similar) and never had anything that took ages to
> compile.
>
> Otherwise with nightly and without the "-f commonmark" flag the situation
> is unchanged.
>
> Is there any way I could take a storage dump... backtrace... or something
> when I kill the hung job?
>
> Would some kind of filter that takes some kind of snapshot of the internal
> state of the process help?
>
> Thanks,
>
> CJ
>
> P.S. I apologize for the messy reports I have sent in lately but I'm having
> major problems with this particular google group. I had to switch to google
> chrome (a mess on linux. I normally use firefox) in order to be able to
> post. And the posts I tried to send from my mail client never made it to
> the group. I think I mentioned that this is not caused by my local setup
> since I used someone else's account/machine and it still didn't go through.
> Any chance someone might look into this at some point?
>
> On Tuesday, October 27, 2020 at 8:29:03 PM UTC-4 John MacFarlane wrote:
>
>> "cjns...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" writes:
>>
>> > With the nightly version (2.11.0.4)
>> >
>> > /tmp/pandoc -o epub/test.epub md/title.txt md2/ch*.md
>> > --css=css/stylesheet.css --epub-embed-font=fonts/*
>> > --epub-cover-image=images/cover.png
>> >
>> > the conversion took seconds.
>> >
>> > But pandoc complains that,
>> >
>> > [WARNING] This document format requires a nonempty <title> element.
>> > Defaulting to 'title' as the title.
>> > To specify a title, use 'title' in metadata or --metadata title="...".
>> >
>> > And the epubcheck reports the following errors, probably related to the above
>> > warning:
>> >
>> > ERROR(RSC-005): epub/test.epub/EPUB/content.opf(9,14): Error while parsing
>> > file: element "metadata" incomplete; missing required element "dc:title"
>> > ERROR(RSC-005): epub/test.epub/EPUB/nav.xhtml(11,134): Error while parsing
>> > file: Anchors within nav elements must contain text
>> >
>> > Check finished with errors
>> > Messages: 0 fatal / 2 errors / 0 warnings / 0 info
>> >
>> > epubcheck completed
>> >
>> > The title.txt file contains:
>> >
>> > % URBAIN DUBOIS
>> > % La cuisine classique — Volume II
>>
>> Weird. This SHOULD work. Are you seeing anything
>> of this in the resulting epub? (I.e. did it get parsed,
>> but not as metadata? If so, maybe you need a blank line
>> at the end of title.txt.) (Also, I assume your input
>> format is pandoc markdown? commonmark_x doesn't include
>> an extension for this kind of title.)
>>
>> > When I take a look at the output everything looks good except that the raw
>> > latex bits are now included verbatim as if they were part of the text/data.
>>
>> They shouldn't be -- again, is pandoc markdown your input format?
>> Maybe a sample of how these occur in the markdown file?
>>
>> >
>
> -- 
> You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/824220b2-6c2e-4c60-a935-e908f573a3d7n%40googlegroups.com.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/m28sbpucc4.fsf%40MacBook-Pro.hsd1.ca.comcast.net.
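
For sources where the LaTeX is script-generated, the raw attribute suggestion in the reply above could be applied mechanically before running pandoc. The following is only a minimal sketch: the helper name `fence_raw_latex` and the regular expression are hypothetical and not part of pandoc. It assumes each LaTeX environment starts at the beginning of a line, is not nested inside another environment of the same name, and that the input has no fenced blocks yet (running it twice would wrap the same environment twice).

```python
import re

# Three backticks, built programmatically so this listing stays easy to embed.
FENCE = "`" * 3

# Hypothetical pattern: an environment whose \begin{...} and matching
# \end{...} each start at the beginning of a line. Same-name nesting is
# NOT handled (assumption stated in the text above).
ENV_RE = re.compile(
    r"^\\begin\{(?P<name>[A-Za-z*]+)\}.*?^\\end\{(?P=name)\}[^\n]*$",
    re.DOTALL | re.MULTILINE,
)


def fence_raw_latex(markdown: str) -> str:
    """Wrap each matched LaTeX environment in a {=latex} raw block."""

    def wrap(match: re.Match) -> str:
        # The extra newlines keep the fenced block separated from the
        # surrounding text by blank lines.
        return "\n" + FENCE + "{=latex}\n" + match.group(0) + "\n" + FENCE + "\n"

    return ENV_RE.sub(wrap, markdown)


if __name__ == "__main__":
    sample = "Intro\n\n\\begin{center}\nX\n\\end{center}\n\nOutro\n"
    print(fence_raw_latex(sample))
```

The rewritten files can then be passed to pandoc as before, with --trace added to see whether parsing still stalls after the same block.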