When I convert and try to publish a document containing only the offending sentences, it does indeed fail, Bastien, even though the document is otherwise empty. It is hard to see what might be causing this. I will have to continue the elimination down to the word level, but I've been at this for nine hours and it is getting late, so I will do that tomorrow. Meanwhile, thanks for the help, all of you.
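
For the word-level elimination, one option is to list every character in a suspect sentence that falls outside plain ASCII, together with its code point and Unicode name. The following is a minimal Python sketch, not part of the workflow described in this thread: the function name `unusual_chars` is made up for illustration, and the sample string is the first sentence quoted further down, with its non-ASCII characters written as escapes.

```python
import unicodedata


def unusual_chars(text):
    """Return (index, char, code point, name) for each non-ASCII or control character."""
    found = []
    for index, char in enumerate(text):
        category = unicodedata.category(char)
        # Flag anything outside 7-bit ASCII, plus control (Cc) and format (Cf)
        # characters other than ordinary newlines, carriage returns and tabs.
        if ord(char) > 127 or (category in ("Cc", "Cf") and char not in "\n\r\t"):
            name = unicodedata.name(char, "<unnamed>")
            found.append((index, char, f"U+{ord(char):04X}", name))
    return found


# Sample input: the first sentence as it appears in this thread.
sentence = (
    "Over years I have experienced much Bronze in the form of articles in "
    "toll access (TA) journals that have been made freely available for "
    "reading \u2013 not open access, but \u201cFree access\u201d as some "
    "publishers call it."
)

for index, char, code_point, name in unusual_chars(sentence):
    print(index, repr(char), code_point, name)
```

For the sentence as it appears here, this reports an en dash (U+2013) and a pair of curly quotation marks (U+201C, U+201D). Whether any of these is related to the Bibi failure is not established; the listing only narrows down what to test next.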


On Monday, 27 February 2023 at 17:41:58 UTC+1, Bastien DUMONT wrote:
If you narrow down the document to the offending sentences (or only one of them), does bibi fail to read the resulting EPUB? Such minimal source and EPUB documents would be easier to inspect, and the latter could even be included in a bug report for bibi.
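
To inspect such a minimal EPUB, it may help to remember that an EPUB is a ZIP archive, so its XHTML and package files can be checked for XML well-formedness directly. The sketch below is only an illustration: `check_epub_xml` is a hypothetical helper, it uses only the Python standard library, and it is no substitute for epubcheck. Note that Python's XML parser also reports named HTML entities such as `&nbsp;` as errors, which may or may not matter to a given viewer.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def check_epub_xml(epub, suffixes=(".xhtml", ".html", ".opf", ".ncx")):
    """Map each XML-like member of the archive to None (parsed) or an error message."""
    results = {}
    with zipfile.ZipFile(epub) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(suffixes):
                continue
            try:
                ET.fromstring(archive.read(name))
                results[name] = None
            except ET.ParseError as error:
                results[name] = str(error)
    return results


# Demonstration on a tiny in-memory archive (not a valid EPUB, just two files).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("ok.xhtml", "<html><body><p>fine</p></body></html>")
    demo.writestr("broken.xhtml", "<html><body><p>missing end tag</body></html>")
buffer.seek(0)

for name, error in check_epub_xml(buffer).items():
    print(name, "OK" if error is None else f"ERROR: {error}")
```

With a real file one would call `check_epub_xml("minimal.epub")`, where the file name is a placeholder.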

On Monday 27 February 2023 at 08:22:34AM, 'Peter Vedal Utnes' via pandoc-discuss wrote:
> I have now done the elimination process suggested by Bastien. Starting from the
> working file, which was the EPUB of the research paper where I had swapped
> paragraphs 2-10 with "test test test", I gradually restored the original
> paragraphs from the paper. It worked until I tried to restore a sentence in the
> middle of paragraph 3, going from above, or paragraph 6, going from below. When
> I insert the next sentence at either end, the document fails to convert (in a
> manner readable by the Bibi EPUB viewer). There do not seem to be any Unicode
> characters that might interfere. I have run the debugger you suggested, John,
> and there are indeed errors (metadata not filled in and a missing end tag), but
> fixing these does not seem to help.
>
> Here are the seemingly innocuous sentences that fail from above and below,
> respectively: 1)  Over years I have experienced much Bronze in the form of
> articles in toll access (TA) journals that have been made freely available for
> reading – not open access, but “Free access” as some publishers call it. 2) One
> thing is to help editors to become aware of the issue, another is to find
> practical solutions for them to transition their scholarly content to OA – the
> rest of their content is really not of interest to us.
>
> There seem to be issues with a few other sentences in those 3 paragraphs too, but
> I can't see a pattern.
> Here is the article in question, though it is only the PDF galley; my EPUB
> testing is on a private server:
> https://septentrio.uit.no/index.php/nopos/article/view/6665
>
>
>
> On Monday, 27 February 2023 at 17:08:31 UTC+1, John MacFarlane wrote:
>
> You could try running epubcheck on the epub produced by pandoc, to see if
> it points to anything.
>
>
> > On Feb 27, 2023, at 6:33 AM, 'Peter Vedal Utnes' via pandoc-discuss <
> pandoc-...@googlegroups.com> wrote:
> >
> > I just did some further testing, and replaced the sections that I would
> > otherwise have removed with as many words and paragraphs, but no special
> > characters, only "test test test" etc. The document then works. So I was wrong
> > about the length: it must be some character or symbol producing the error
> > (only with pandoc, not other EPUB converters). Any idea how to further isolate
> > it, or how to work around it with a pandoc command or template?
> >
> > Thanks for the help so far, Bernardo.
> >
> >
> >
> > On Monday, 27 February 2023 at 15:23:57 UTC+1, Peter Vedal Utnes wrote:
> > I am not sure what you mean by normalize in this context. I'll elaborate
> > in case this is what you mean: in the interest of removing variables that
> > might interfere with troubleshooting, I have copied the text from research
> > papers (not just one, but a few), pasted it into Notepad, copied and pasted
> > it back into a new Word file (this is more thorough than "clear
> > formatting"), and run this "pure" file through pandoc, and I still get the
> > error. If I then randomly shorten the file, the error disappears. This is not
> > the case for my "test" file, but only for research papers, which is baffling.
> > I can only assume that pandoc responds to something like a character or
> > in-text references in particular contexts, or, as was my original hypothesis,
> > the number of lines or columns in the EPUB.
> >
> > On Monday, 27 February 2023 at 15:17:10 UTC+1, bernardov...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> > Have you tried editing the original research paper in some minor way
> > (adding or removing a couple of characters) and then running it? This is a
> > completely wild guess, but maybe the text in the files is getting normalized
> > when you edit them, whereas the original research paper still contains the
> > unedited, unnormalized text.
> >
> > On Mon, Feb 27, 2023 at 10:48 AM 'Peter Vedal Utnes' via pandoc-discuss <
> pandoc-...@googlegroups.com> wrote:
> > I thank you for the suggestion. It is proving somewhat hard to
> > (dis)confirm. I have made a test file with just the word "test" pasted over
> > and over again, with and without various formatting, and of the same length
> > as the proper papers or longer. This file consistently works. But when I
> > attempt to do it with a regular research paper, it only works if I shorten
> > it. Curiously, I can remove either half of the main text, or indeed
> > sections here and there, randomly, and it works, but not with all of them
> > present. I have combed it for special characters or tags, but cannot find
> > any.
> >
> > On Monday, 27 February 2023 at 13:49:58 UTC+1, Bernardo C. D. A. Vasconcelos wrote:
> > I do not know the answer to this problem in particular, but perhaps it is
> worth checking the main document and the bibliography for invisible control
> characters (e.g. `\X{A0}`). They tend to cause all sorts of strange
> problems that result in random error msgs.
> >
> > On Monday, February 27, 2023 at 8:16:20 AM UTC-3 Peter Vedal Utnes wrote:
> > We have a workflow in Open Journal Systems where we use Pandoc to convert
> > Word documents to EPUB, and then display them with an embedded EPUB app
> > (Bibi).
> >
> > Our resulting EPUBs work fine with both debuggers and viewers like
> calibre. They work in Bibi, but only when they are reduced to a certain
> length. Whenever the files exceed approx 100 lines or 600 words, Bibi
> claims:
> >
> > TypeError: Cannot read properties of undefined (reading 'getAttribute')
> >
> > Meanwhile, the same documents work when converted to EPUB using other
> > converters, or when I reduce the length (length, not size in bytes; I've
> > tried with graphics, and it still works). It suddenly works when I reduce the
> > length by removing pure paragraph text, even though all the formatted
> > elements (abstract, references, etc.) are the same.
> >
> > I recognize that this problem is very specific to the interrelation
> pandoc <-> Bibi, but I'd be grateful for general troubleshooting
> suggestions.
> >
> > Thanks in advance,
> >
> > Peter
> >
> >
> > --
> > You received this message because you are subscribed to a topic in the
> Google Groups "pandoc-discuss" group.
> > To unsubscribe from this topic, visit [1]https://groups.google.com/d/
> topic/pandoc-discuss/hPUa1uWGS_k/unsubscribe.
> > To unsubscribe from this group and all its topics, send an email to
> pandoc-discus...@googlegroups.com.
> > To view this discussion on the web visit [2]https://groups.google.com/d/
> msgid/pandoc-discuss/
> 4bd152b5-32f7-4f4c-9a9b-0d20afebea84n%40googlegroups.com.
> >
>
>
>
> References:
>
> [1] https://groups.google.com/d/topic/pandoc-discuss/hPUa1uWGS_k/unsubscribe
> [2] https://groups.google.com/d/msgid/pandoc-discuss/4bd152b5-32f7-4f4c-9a9b-0d20afebea84n%40googlegroups.com
> [3] https://groups.google.com/d/msgid/pandoc-discuss/bc147d77-69c9-4e5d-82a6-e149f662a823n%40googlegroups.com
> [4] mailto:pandoc-discus...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
> [5] https://groups.google.com/d/msgid/pandoc-discuss/20942a45-0995-4a50-888a-cf25e9895920n%40googlegroups.com?utm_medium=email&utm_source=footer
