I am not sure what you mean by normalize in this context. I'll elaborate in case this is what you mean:

In the interest of removing variables that might interfere with troubleshooting, I have copied the text from research papers (not just one, but a few), pasted it into Notepad, then copied and pasted it back into a new Word file (this is more thorough than "clear formatting"). I ran this "pure" file through pandoc, and I still get the error. If I then randomly shorten the file, the error disappears. This is not the case for my "test" file, only for research papers, which is baffling. I can only assume that pandoc responds to something like a particular character or in-text reference in certain contexts or, as was my original hypothesis, to the number of lines or columns in the EPUB.

On Monday, February 27, 2023 at 15:17:10 UTC+1, bernardov...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:

> Have you tried editing the original research paper in some minor way
> (adding or removing a couple of characters) and then running it? This is a
> completely wild guess, but maybe the text in the file gets normalized
> when you edit it, whereas the original research paper still contains the
> unedited, unnormalized text.
>
> On Mon, Feb 27, 2023 at 10:48 AM 'Peter Vedal Utnes' via pandoc-discuss <pandoc-...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org> wrote:
>
>> Thank you for the suggestion. It is proving somewhat hard to
>> (dis)confirm. I have made a test file with just the word "test" pasted
>> over and over again, with and without various formatting, and of the
>> same length as the proper papers or longer. This file consistently works.
>> But when I try a regular research paper, it only works if I shorten it.
>> Curiously, I can remove either half of the main text, or indeed sections
>> here and there at random, and it works, but not with all of them present.
>> I have combed it for special characters or tags, but cannot find any.
>>
>> On Monday, February 27, 2023 at 13:49:58 UTC+1, Bernardo C. D. A. Vasconcelos wrote:
>>
>>> I do not know the answer to this problem in particular, but perhaps it
>>> is worth checking the main document *and* the bibliography for
>>> invisible control characters (e.g. `\X{A0}`). They tend to cause all
>>> sorts of strange problems that result in random error messages.
>>>
>>> On Monday, February 27, 2023 at 8:16:20 AM UTC-3 Peter Vedal Utnes wrote:
>>>
>>>> We have a workflow in Open Journal Systems where we use Pandoc to
>>>> convert Word documents to EPUB, and then display them with an embedded
>>>> EPUB app (Bibi).
>>>>
>>>> Our resulting EPUBs work fine with both debuggers and viewers like
>>>> calibre. They work in Bibi, but only when they are reduced to a certain
>>>> length. Whenever the files exceed approximately 100 lines or 600 words,
>>>> Bibi reports:
>>>>
>>>> TypeError: Cannot read properties of undefined (reading 'getAttribute')
>>>>
>>>> Meanwhile, the same documents work when converted to EPUB using other
>>>> converters, or when I reduce the length (length, not size in bytes; I've
>>>> tried with graphics, and it still works). It suddenly works when I
>>>> reduce the length by removing pure paragraph text, even though all the
>>>> formatted elements (abstract, references, etc.) are the same.
>>>>
>>>> I recognize that this problem is very specific to the interaction
>>>> pandoc <-> Bibi, but I'd be grateful for general troubleshooting
>>>> suggestions.
>>>>
>>>> Thanks in advance,
>>>>
>>>> Peter
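For anyone who wants to automate the check for invisible characters suggested in this thread, here is a minimal sketch in Python. It is only an illustration, not something anyone in the thread actually ran: the function names `find_invisible_chars` and `scan_epub` are made up for this example, and the choice of Unicode categories to flag (control, format, and non-ASCII space characters) is an assumption about what counts as "invisible". It treats the EPUB as an ordinary ZIP archive and assumes the content files are UTF-8.

```python
import unicodedata
import zipfile


def find_invisible_chars(text):
    """Return (offset, codepoint, name) for characters that are easy to miss by eye."""
    hits = []
    for offset, ch in enumerate(text):
        if ch in "\n\r\t ":
            continue  # ordinary whitespace is expected
        category = unicodedata.category(ch)
        # Cc = control characters, Cf = format characters (zero-width space,
        # soft hyphen, ...), Zs = space separators other than the plain space
        # (for example U+00A0, the no-break space).
        if category in ("Cc", "Cf", "Zs"):
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((offset, f"U+{ord(ch):04X}", name))
    return hits


def scan_epub(path):
    """Map each text-like file inside the EPUB to its list of suspicious characters."""
    results = {}
    with zipfile.ZipFile(path) as epub:
        for name in epub.namelist():
            # Assumption: only these extensions hold text worth scanning.
            if name.endswith((".xhtml", ".html", ".opf", ".ncx")):
                text = epub.read(name).decode("utf-8", errors="replace")
                hits = find_invisible_chars(text)
                if hits:
                    results[name] = hits
    return results
```

Calling `scan_epub("paper.epub")` (the file name is a placeholder) on a failing and a working EPUB and comparing the results may narrow things down. No-break spaces are common in text exported from Word, so a hit is a candidate to investigate, not proof of the cause.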