From: Bastien DUMONT
Newsgroups: gmane.text.pandoc
Subject: Re: Error caused by document length
Date: Mon, 27 Feb 2023 14:54:45 +0000
To: 'Peter Vedal Utnes' via pandoc-discuss

Maybe you could restore the paragraphs you replaced with "test" one by one,
converting the document after each one, until Bibi throws an error. Then you
can remove the sentences of the offending paragraph one by one until the
document is read again without error. That way you can isolate at least one
of the sentences that cause the error.

On Monday 27 February 2023 at 06:33:28AM, 'Peter Vedal Utnes' via pandoc-discuss wrote:
> I just did some further testing, and replaced the sections that I would
> otherwise have removed with as many words and paragraphs, but no signs, only
> "test test test" etc. The document then works. So I was wrong about the length:
> It must be some character or symbol producing the error (only with pandoc, not
> other EPUB converters). Any idea how to further isolate it, or how to
> circumvent with a pandoc command or template?
>
> Thanks for the help so far, Bernardo.
>
> On Monday, 27 February 2023 at 15:23:57 UTC+1, Peter Vedal Utnes wrote:
>
>     I am not sure what you mean by normalize in this context.
>     I'll elaborate in
>     case this is what you mean: In the interest of removing variables that
>     might interfere with troubleshooting, I have copied the text from research
>     papers (not just one, but a few), pasted it in Notepad, copied and pasted
>     it back into a new Word file (this is more thorough than "clear
>     formatting"), ran this "pure" file through pandoc and I get the error. If I
>     then randomly shorten the file, the error disappears. This is not the case
>     for my "test" file, but only for research papers, which is baffling. I can
>     only assume that pandoc responds to something like a character or in-text
>     references in particular contexts, or as was my original hypothesis, the
>     number of lines or columns in the EPUB.
>
>     On Monday, 27 February 2023 at 15:17:10 UTC+1, bernardov...@gmail.com wrote:
>
>         Have you tried editing the original research paper in some minor way
>         (adding or removing a couple of characters) and then running it? This
>         is a completely wild guess, but maybe the text in the file is getting
>         normalized upon editing it, whereas the original research paper still
>         contains the unedited, unnormalized text.
>
>         On Mon, Feb 27, 2023 at 10:48 AM 'Peter Vedal Utnes' via pandoc-discuss
>         wrote:
>
>             I thank you for the suggestion. It is proving somewhat hard to
>             (dis)confirm. I have made a test file with just the word "test"
>             pasted over and over again, with and without various formatting and
>             with the same length or longer as the proper papers. This file
>             consistently works. But when I attempt to do it with a regular
>             research paper, it only works if I shorten it. Curiously, I can
>             remove either half of the main text, or indeed sections here and
>             there, randomly, and it works, but not with all of them present. I
>             have combed it for special characters or tags, but cannot find
>             any.
>
>             On Monday, 27 February 2023 at 13:49:58 UTC+1, Bernardo C. D. A.
>             Vasconcelos wrote:
>
>                 I do not know the answer to this problem in particular, but
>                 perhaps it is worth checking the main document and the
>                 bibliography for invisible control characters (e.g. `\X{A0}`).
>                 They tend to cause all sorts of strange problems that result in
>                 random error messages.
>
>                 On Monday, February 27, 2023 at 8:16:20 AM UTC-3 Peter Vedal
>                 Utnes wrote:
>
>                     We have a workflow in Open Journal Systems where we use
>                     Pandoc to convert Word documents to EPUB, and then display
>                     them with an embedded EPUB app (Bibi).
>
>                     Our resulting EPUBs work fine with both debuggers and
>                     viewers like calibre. They work in Bibi, but only when they
>                     are reduced to a certain length. Whenever the files exceed
>                     approx. 100 lines or 600 words, Bibi reports:
>
>                     TypeError: Cannot read properties of undefined (reading
>                     'getAttribute')
>
>                     Meanwhile, the same documents work when converted to EPUB
>                     using other converters, or when I reduce the length
>                     (length, not size in bytes; I've tried with graphics,
>                     still works). It suddenly works when I reduce the length by
>                     removing pure paragraph text, even though all the formatted
>                     elements (abstract, references, etc.) are the same.
>
>                     I recognize that this problem is very specific to the
>                     interrelation pandoc <-> Bibi, but I'd be grateful for
>                     general troubleshooting suggestions.
>
>                     Thanks in advance,
>
>                     Peter
-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/Y/zENVDPPqb4eHUo%40localhost.
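
The two troubleshooting ideas in this thread (looking for invisible characters, and restoring paragraphs one by one until the error reappears) could be sketched roughly as below. This is only an illustration, assuming Python 3 and that the document text has already been extracted to plain strings. The names `suspicious_chars`, `first_offender` and the `fails` callback are hypothetical and not part of pandoc or Bibi. How `fails` decides that a conversion is broken (for example, by converting with pandoc and opening the result in Bibi by hand) is left to the reader.

```python
import unicodedata
from typing import Callable, List, Optional, Sequence, Tuple


def suspicious_chars(text: str) -> List[Tuple[int, str, str]]:
    """Return (offset, code point, name) for invisible or control characters.

    Flags the Unicode categories Cc (control), Cf (format, e.g. zero-width
    and bidi marks), Co (private use) and Zs other than the plain space
    (e.g. U+00A0 NO-BREAK SPACE). Tab, LF, CR and space are ignored.
    """
    hits = []
    for offset, ch in enumerate(text):
        if ch in "\t\n\r ":
            continue
        if unicodedata.category(ch) in ("Cc", "Cf", "Co", "Zs"):
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((offset, f"U+{ord(ch):04X}", name))
    return hits


def first_offender(
    items: Sequence[str],
    fails: Callable[[Sequence[str]], bool],
) -> Optional[int]:
    """Restore items one by one; return the index whose addition first fails.

    `fails` is a hypothetical callback: it receives the items restored so
    far and returns True if the resulting document triggers the error.
    Returns None if the error never appears.
    """
    for count in range(1, len(items) + 1):
        if fails(items[:count]):
            return count - 1
    return None
```

Once a paragraph has been singled out, the same `first_offender` call can be repeated on its sentences, and `suspicious_chars` can be run on the sentence that remains to see whether it contains anything unusual.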