From mboxrd@z Thu Jan 1 00:00:00 1970
From: "'William Lupton' via pandoc-discuss"
Newsgroups: gmane.text.pandoc
Subject: Re: Error caused by document length
Date: Mon, 27 Feb 2023 17:10:18 +0000
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="0000000000004e20cc05f5b18ea3"
Xref: news.gmane.io gmane.text.pandoc:32249

--0000000000004e20cc05f5b18ea3
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

Maybe this is too obvious a comment, but it couldn't be the dashes, could
it? Both of your sentences below contain an en dash (U+2013, which is
E2 80 93 in UTF-8). Try replacing them with hyphens?
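Before editing by hand, it may help to list every non-ASCII character in the suspect text, since dashes, curly quotes and no-break spaces are easy to miss by eye. The following is a minimal sketch, not part of anyone's workflow in this thread: the function names and the replacement table are assumptions chosen for illustration, and only the Python standard library is used.

```python
import unicodedata

# Illustrative replacement table (an assumption; extend it as needed).
ASCII_FALLBACKS = {
    "\u2013": "-",   # EN DASH
    "\u2014": "-",   # EM DASH
    "\u201c": '"',   # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',   # RIGHT DOUBLE QUOTATION MARK
    "\u2018": "'",   # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK
    "\u00a0": " ",   # NO-BREAK SPACE
    "\u202f": " ",   # NARROW NO-BREAK SPACE
}


def report_non_ascii(text):
    """Return (offset, character, Unicode name) for every non-ASCII character."""
    return [
        (offset, ch, unicodedata.name(ch, "UNKNOWN"))
        for offset, ch in enumerate(text)
        if ord(ch) > 127
    ]


def to_ascii_punctuation(text):
    """Replace the characters listed in ASCII_FALLBACKS with ASCII equivalents."""
    return text.translate(str.maketrans(ASCII_FALLBACKS))


# Fragment of the first sentence quoted in this thread.
sample = "made freely available for reading \u2013 not open access"
for offset, ch, name in report_non_ascii(sample):
    print(f"{offset}: U+{ord(ch):04X} {name}")
print(to_ascii_punctuation(sample))
```

If the converted EPUB works after `to_ascii_punctuation`, restoring one character class at a time should show which character the viewer objects to.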
1) Over years I have experienced much Bronze in the form of articles in
toll access (TA) journals that have been made freely available for reading –
not open access, but “Free access” as some publishers call it.

2) One thing is to help editors to become aware of the issue, another is to
find practical solutions for them to transition their scholarly content to
OA – the rest of their content is really not of interest to us.

-->

1) Over years I have experienced much Bronze in the form of articles in
toll access (TA) journals that have been made freely available for reading -
not open access, but “Free access” as some publishers call it.

2) One thing is to help editors to become aware of the issue, another is to
find practical solutions for them to transition their scholarly content to
OA - the rest of their content is really not of interest to us.

On Mon, 27 Feb 2023 at 16:49, 'Peter Vedal Utnes' via pandoc-discuss
<pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org> wrote:

> When I convert and try to publish a document containing only the offending
> sentences, it does indeed fail, Bastien, even when the document is
> otherwise empty. It is hard to see what might be causing this. I will have
> to continue the elimination down to the word, but I've been at this for
> nine hours and it is getting late. I will do that tomorrow. Meanwhile,
> thanks for the help, all of you.
>
> On Monday, 27 February 2023 at 17:41:58 UTC+1, Bastien DUMONT wrote:
>
> >> If you narrow down the document to the offending sentences (or only one
> >> of them), does Bibi fail to read the resulting EPUB? Such minimal source
> >> and EPUB documents would be easier to inspect, and the latter could even
> >> be included in a bug report for Bibi.
> >>
> >> On Monday 27 February 2023 at 08:22:34AM, 'Peter Vedal Utnes' via
> >> pandoc-discuss wrote:
> >> > I have now done the elimination process suggested by Bastien. Starting
> >> > from the working file (the EPUB of the research paper in which I had
> >> > swapped paragraphs 2-10 for "test test test"), I restored the original
> >> > paragraphs from the paper. It worked until I tried to restore a
> >> > sentence in the middle of paragraph 3, going from the top, or of
> >> > paragraph 6, going from the bottom. When I insert the next sentence at
> >> > either end, the conversion no longer produces an EPUB that the Bibi
> >> > viewer can read. There do not seem to be any Unicode characters that
> >> > might interfere. I have run the checker you suggested, John, and it
> >> > does report errors (metadata not filled in and a missing end tag), but
> >> > fixing these does not seem to help.
> >> >
> >> > Here are the seemingly innocuous sentences that fail from above and
> >> > below, respectively: 1) Over years I have experienced much Bronze in
> >> > the form of articles in toll access (TA) journals that have been made
> >> > freely available for reading – not open access, but “Free access” as
> >> > some publishers call it. 2) One thing is to help editors to become
> >> > aware of the issue, another is to find practical solutions for them to
> >> > transition their scholarly content to OA – the rest of their content
> >> > is really not of interest to us.
> >> >
> >> > There seem to be issues with a few other sentences in those 3
> >> > paragraphs too, but I can't see a pattern.
> >> > Here is the article in question, though it is only the PDF galley; my
> >> > EPUB testing is on a private server:
> >> > https://septentrio.uit.no/index.php/nopos/article/view/6665
> >> >
> >> > On Monday, 27 February 2023 at 17:08:31 UTC+1, John MacFarlane wrote:
> >> >
> >> > You could try running epubcheck on the epub produced by pandoc, to see
> >> > if it points to anything.
> >> >
> >> > > On Feb 27, 2023, at 6:33 AM, 'Peter Vedal Utnes' via pandoc-discuss
> >> > > <pandoc-...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org> wrote:
> >> > >
> >> > > I just did some further testing, and replaced the sections that I
> >> > > would otherwise have removed with as many words and paragraphs, but
> >> > > no signs, only "test test test" etc. The document then works. So I
> >> > > was wrong about the length: it must be some character or symbol
> >> > > producing the error (only with pandoc, not other EPUB converters).
> >> > > Any idea how to isolate it further, or how to circumvent it with a
> >> > > pandoc command or template?
> >> > >
> >> > > Thanks for the help so far, Bernardo.
> >> > >
> >> > > On Monday, 27 February 2023 at 15:23:57 UTC+1, Peter Vedal Utnes
> >> > > wrote:
> >> > > I am not sure what you mean by normalize in this context. I'll
> >> > > elaborate in case this is what you mean: in the interest of removing
> >> > > variables that might interfere with troubleshooting, I have copied
> >> > > the text from research papers (not just one, but a few), pasted it
> >> > > into Notepad, copied and pasted it back into a new Word file (this is
> >> > > more thorough than "clear formatting"), and run this "pure" file
> >> > > through pandoc, and I still get the error. If I then randomly shorten
> >> > > the file, the error disappears. This is not the case for my "test"
> >> > > file, but only for research papers, which is baffling. I can only
> >> > > assume that pandoc responds to something like a character or in-text
> >> > > references in particular contexts, or, as was my original hypothesis,
> >> > > the number of lines or columns in the EPUB.
> >> > >
> >> > > On Monday, 27 February 2023 at 15:17:10 UTC+1,
> >> > > bernardov...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> >> > > Have you tried editing the original research paper in some minor way
> >> > > (adding or removing a couple of characters) and then running it? This
> >> > > is a completely wild guess, but maybe the text in the file gets
> >> > > normalized when it is edited, whereas the original research paper
> >> > > still contains the unedited, unnormalized text.
> >> > >
> >> > > On Mon, Feb 27, 2023 at 10:48 AM 'Peter Vedal Utnes' via
> >> > > pandoc-discuss <pandoc-...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
> >> > > wrote:
> >> > > I thank you for the suggestion. It is proving somewhat hard to
> >> > > (dis)confirm. I have made a test file with just the word "test"
> >> > > pasted over and over again, with and without various formatting, and
> >> > > of the same length as the proper papers or longer. This file
> >> > > consistently works. But when I attempt to do it with a regular
> >> > > research paper, it only works if I shorten it. Curiously, I can
> >> > > remove either half of the main text, or indeed sections here and
> >> > > there, randomly, and it works, but not with all of them present. I
> >> > > have combed it for special characters or tags, but cannot find any.
> >> > >
> >> > > On Monday, 27 February 2023 at 13:49:58 UTC+1, Bernardo C. D. A.
> >> > > Vasconcelos wrote:
> >> > > I do not know the answer to this problem in particular, but perhaps
> >> > > it is worth checking the main document and the bibliography for
> >> > > invisible control characters (e.g. `\X{A0}`). They tend to cause all
> >> > > sorts of strange problems that result in random error messages.
> >> > >
> >> > > On Monday, February 27, 2023 at 8:16:20 AM UTC-3 Peter Vedal Utnes
> >> > > wrote:
> >> > > We have a workflow in Open Journal Systems where we use Pandoc to
> >> > > convert Word documents to EPUB, and then display them with an
> >> > > embedded EPUB app (Bibi).
> >> > >
> >> > > Our resulting EPUBs work fine with both debuggers and viewers like
> >> > > calibre. They work in Bibi, but only when they are reduced to a
> >> > > certain length. Whenever the files exceed approx. 100 lines or 600
> >> > > words, Bibi claims:
> >> > >
> >> > > TypeError: Cannot read properties of undefined (reading 'getAttribute')
> >> > >
> >> > > Meanwhile, the same documents work when converted to EPUB using other
> >> > > converters, or when I reduce the length (length, not size in bytes:
> >> > > I've tried with graphics, and it still works). It suddenly works when
> >> > > I reduce the length by removing pure paragraph text, even though all
> >> > > the formatted elements (abstract, references, etc.) are the same.
> >> > >
> >> > > I recognize that this problem is very specific to the interrelation
> >> > > pandoc <-> Bibi, but I'd be grateful for general troubleshooting
> >> > > suggestions.
> >> > >
> >> > > Thanks in advance,
> >> > >
> >> > > Peter
> >> > >
> >> > > --
> >> > > You received this message because you are subscribed to a topic in
> >> > > the Google Groups "pandoc-discuss" group.
> >> > > To unsubscribe from this topic, visit
> >> > > [1]https://groups.google.com/d/topic/pandoc-discuss/hPUa1uWGS_k/unsubscribe.
> >> > > To unsubscribe from this group and all its topics, send an email to
> >> > > pandoc-discus...-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> >> > > To view this discussion on the web visit
> >> > > [2]https://groups.google.com/d/msgid/pandoc-discuss/4bd152b5-32f7-4f4c-9a9b-0d20afebea84n%40googlegroups.com.
> >> >
> >> > References:
> >> >
> >> > [1] https://groups.google.com/d/topic/pandoc-discuss/hPUa1uWGS_k/unsubscribe
> >> > [2] https://groups.google.com/d/msgid/pandoc-discuss/4bd152b5-32f7-4f4c-9a9b-0d20afebea84n%40googlegroups.com
> >> > [3] https://groups.google.com/d/msgid/pandoc-discuss/bc147d77-69c9-4e5d-82a6-e149f662a823n%40googlegroups.com
> >> > [4] mailto:pandoc-discus...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> >> > [5] https://groups.google.com/d/msgid/pandoc-discuss/20942a45-0995-4a50-888a-cf25e9895920n%40googlegroups.com?utm_medium=email&utm_source=footer
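Bastien's suggestion earlier in the thread, to narrow the document down to a single offending sentence and inspect the resulting EPUB, could be scripted roughly as follows. This is a sketch under stated assumptions: Peter's real input is a Word file, whereas Markdown is used here only to keep the example self-contained, and the file names `minimal.md` and `minimal.epub` as well as all function names are hypothetical. The pandoc call relies only on `-o` (output file, with the EPUB format inferred from the `.epub` extension) and `-M` (set a metadata field).

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional

# One of the two sentences reported in this thread, with its original dash.
SENTENCE = (
    "One thing is to help editors to become aware of the issue, another is to "
    "find practical solutions for them to transition their scholarly content "
    "to OA \u2013 the rest of their content is really not of interest to us."
)


def write_minimal_source(directory: Path, text: str = SENTENCE) -> Path:
    """Write a one-paragraph Markdown file (hypothetical name minimal.md)."""
    source = directory / "minimal.md"
    source.write_text(text + "\n", encoding="utf-8")
    return source


def pandoc_command(source: Path, target: Path) -> List[str]:
    """Build the pandoc call; a title is set so the EPUB metadata is not empty."""
    return ["pandoc", str(source), "-M", "title=Minimal test", "-o", str(target)]


def build_epub(directory: Path) -> Optional[Path]:
    """Run pandoc if it is installed; return the EPUB path, or None if it is missing."""
    if shutil.which("pandoc") is None:
        return None
    source = write_minimal_source(directory)
    target = directory / "minimal.epub"
    subprocess.run(pandoc_command(source, target), check=True)
    return target
```

Calling `build_epub(Path("."))` would leave `minimal.md` and `minimal.epub` in the current directory; the EPUB can then be loaded into Bibi or attached to a bug report, as Bastien proposed.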
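The manual elimination described in this thread (restore or remove a piece, convert, check in Bibi) can be mechanised as a one-at-a-time reduction. Peter's observation that removing either half of the text makes the file work suggests that more than one element may need to be present for the failure, which plain halving would miss; dropping items individually and keeping only the removals that preserve the failure copes with that case. The sketch below is an assumption-laden illustration: `minimise` and the `fails` predicate are hypothetical names, and the predicate stands in for "convert with pandoc and see whether Bibi throws", which the thread performs by hand.

```python
from typing import Callable, List, Sequence


def minimise(items: Sequence[str], fails: Callable[[List[str]], bool]) -> List[str]:
    """Drop items one at a time, keeping a removal only if the failure persists."""
    current = list(items)
    if not fails(current):
        raise ValueError("the full input does not fail; nothing to minimise")
    index = 0
    while index < len(current):
        candidate = current[:index] + current[index + 1:]
        if fails(candidate):
            # The item is irrelevant: drop it; the same index now holds the next item.
            current = candidate
        else:
            # The item is needed to reproduce the failure: keep it and move on.
            index += 1
    return current


# Hypothetical stand-in for the real check: fail only when both "A" and "B" are present.
def fails_when_both_present(paragraphs: List[str]) -> bool:
    return "A" in paragraphs and "B" in paragraphs


print(minimise(["x", "A", "y", "z", "B", "w"], fails_when_both_present))
```

The same function works at any granularity: feed it paragraphs first, then the sentences of the surviving paragraphs, then words or characters.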
--0000000000004e20cc05f5b18ea3--