From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: pandoc.markdown to epub conversion took just under 4 hours on an average linux laptop
Date: Mon, 26 Oct 2020 14:15:40 -0700
To: Chris Jones, pandoc-discuss

There are a few things that can trigger pathological behavior in the
markdown parser. One way to find out what is causing it is to divide and
conquer: convert shorter and shorter segments of your document until you
find where things get slow. Another possibility is to use --trace, which
gives very verbose output that will let you determine where excessive
backtracking is occurring.

If you don't need all pandoc extensions, and you're using a recent pandoc,
you might try `-f commonmark_x`, which uses the efficient commonmark parser
extended with many (but not all) pandoc extensions. I would expect this to
be much faster.

Chris Jones writes:

> Six files... ~274,000 words. A pandoc conversion to EPUB last night took
> almost 4 hours. Comparable conversions on the same hardware take at most a
> couple of minutes.
>
> How can I investigate & hopefully optimize?
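
The divide-and-conquer approach mentioned above could be scripted roughly as
follows. This is only a sketch under some assumptions: the helper names
(`split_chunks`, `run_pandoc`, `time_chunks`) are made up for illustration,
`pandoc` is assumed to be on the PATH, and splitting at blank lines is a
simplification that may cut through constructs such as fenced code blocks
that contain blank lines. The HTML writer is used only to keep the output
cheap to discard, on the assumption that the slowness is in the reader.

```python
import subprocess
import time


def split_chunks(text, n):
    """Split text at blank lines into at most n chunks of whole paragraphs."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    if not paragraphs:
        return []
    size = -(-len(paragraphs) // n)  # ceiling division
    return ["\n\n".join(paragraphs[i:i + size])
            for i in range(0, len(paragraphs), size)]


def run_pandoc(chunk, reader="markdown"):
    """Convert one chunk with pandoc (read from stdin), discarding the output."""
    subprocess.run(["pandoc", "-f", reader, "-t", "html"],
                   input=chunk.encode("utf-8"),
                   stdout=subprocess.DEVNULL, check=True)


def time_chunks(chunks, convert=run_pandoc):
    """Time convert() on each chunk; return (seconds, index) pairs, slowest first."""
    timings = []
    for i, chunk in enumerate(chunks):
        start = time.perf_counter()
        convert(chunk)
        timings.append((time.perf_counter() - start, i))
    return sorted(timings, reverse=True)
```

One would read a source file, call `time_chunks(split_chunks(text, 20))`, and
then repeat the split on whichever chunk comes out slowest. Passing
`convert=lambda c: run_pandoc(c, reader="commonmark_x")` gives a quick
comparison against the commonmark-based reader.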