public inbox archive for pandoc-discuss@googlegroups.com
From: Kolen Cheung <christian.kolen-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: HTML → EPUB: Either "Out of memory" or "openBinaryFile: invalid argument (Invalid argument)"
Date: Wed, 22 Apr 2020 15:17:54 -0700 (PDT)
Message-ID: <60dc6b96-7284-47e3-bbb2-938857c61dd5@googlegroups.com>
In-Reply-To: <026f695e-0849-4c01-969b-0c2ccbeb31b9-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>


Your pandoc version is too old. Try to reproduce the problem with the latest 
release: https://github.com/jgm/pandoc/releases/latest There are various ways 
to install it. For example, you can unpack pandoc-2.9.2.1-linux-amd64.tar.gz 
and put pandoc and pandoc-citeproc somewhere in your PATH, such as 
~/.local/bin.
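
As a rough sketch of the tarball route (not an official installer): the 
function names below are made up for illustration, and the download URL 
pattern and the single top-level directory inside the archive are 
assumptions based on the release named in this message. It also assumes 
curl and tar are available.

```shell
#!/bin/sh
# Hypothetical helper: print the assumed download URL for a given release.
# $1 is a release version such as 2.9.2.1.
pandoc_tarball_url() {
    printf 'https://github.com/jgm/pandoc/releases/download/%s/pandoc-%s-linux-amd64.tar.gz\n' "$1" "$1"
}

# Hypothetical helper: download a release tarball and unpack it under a prefix.
# $1 is the version, $2 is the install prefix (defaults to ~/.local).
install_pandoc_tarball() {
    version=$1
    prefix=${2:-"$HOME/.local"}
    tmp=$(mktemp -d) || return 1
    curl -fsSL -o "$tmp/pandoc.tar.gz" "$(pandoc_tarball_url "$version")" || return 1
    mkdir -p "$prefix" || return 1
    # Assumption: the archive has one top-level directory containing bin/.
    # Stripping it makes the binaries land in $prefix/bin.
    tar -xzf "$tmp/pandoc.tar.gz" --strip-components 1 -C "$prefix" || return 1
    rm -rf "$tmp"
}

# Example call (commented out so that sourcing this file downloads nothing):
# install_pandoc_tarball 2.9.2.1
```

After running it, check that ~/.local/bin comes before the distro's 
directories in your PATH, otherwise the old packaged pandoc is still the 
one that runs.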

(To go one step further, you can download the latest nightly build from the 
project's GitHub Actions to make sure the problem has not already been fixed.)

In general, before reporting a problem you want to make sure it has not 
already been fixed, and for that you need the latest version. Unfortunately 
this is often an issue on distros with a package manager, because people 
tend to use the packaged version, which is frequently too old, especially 
on Ubuntu.
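
A minimal sketch of that check, assuming a sort(1) that supports -V 
(version sort, as in GNU coreutils). The helper names and the 2.9.2.1 
threshold are only illustrative; use whatever the current release is.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on "sort -V" to order dotted version strings.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Print the version of the pandoc found on PATH; fail if there is none.
# The first line of "pandoc --version" has the form "pandoc 2.9.2.1".
installed_pandoc_version() {
    command -v pandoc >/dev/null 2>&1 || return 1
    pandoc --version | head -n 1 | awk '{print $2}'
}

# Illustrative threshold: the release named in this message.
wanted=2.9.2.1
if current=$(installed_pandoc_version); then
    if version_at_least "$current" "$wanted"; then
        echo "pandoc $current is recent enough to reproduce with"
    else
        echo "pandoc $current is older than $wanted; upgrade before reporting"
    fi
else
    echo "pandoc not found on PATH"
fi
```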

On Wednesday, April 22, 2020 at 2:59:38 PM UTC-7, Heck Lennon wrote:
>
> pandoc 2.5.2 on Ubuntu 19.10.
>
> Turns out I had to use "-t epub" instead of "-t epub3":
>
> pandoc -f html -t epub -o output.epub input.html
>
> Thank you.
>
> On Wednesday, April 22, 2020 at 17:58:39 UTC+2, John MacFarlane wrote:
>>
>>
>> What pandoc version are you running on the linux box? 
>> This works fine for me. 
>>
>>
>> Heck Lennon <frdt...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes: 
>>
>> > Since I had a Linux host available, I worked around that issue with 
>> > Windows and shell expansion. 
>> > 
>> > pandoc -f html -t epub3 -o output.epub input.html 
>> > 
>> > 
>> > pandoc ran successfully (no error message), but the EPUB can't be 
>> > opened in a Windows GUI application that supports EPUB files 
>> > ("Error loading file.epub"). Likewise, I can't open the file after 
>> > changing its extension from EPUB to ZIP. 
>> > 
>> > Here's the input files (HTML + PNGs): 
>> > 
>> > https://we.tl/t-5EeGXML1rb 
>> > 
>> > Do I need extra options in the command line? 
>> > 
>> > On Wednesday, April 22, 2020 at 11:55:49 UTC+2, Heck Lennon wrote: 
>> >> 
>> >> Thanks, everyone, for the info! 
>> >> 
>> >> On Wednesday, April 22, 2020 at 01:25:21 UTC+2, Kolen Cheung wrote: 
>> >>> 
>> >>> A side note: since your goal is to convert from PDF to ePub, you will 
>> >>> probably get better results using other tools. E.g. I know PDF can be 
>> >>> converted to docx, and then docx to ePub. There may be a tool that can 
>> >>> convert it directly, too. Essentially, you'd want to choose the tools 
>> >>> that preserve the most information. Since pandoc focuses mainly on 
>> >>> the structure of the document, much other information would be lost. 
>> >>> The choice of tool also depends on which ones you're comfortable with. 
>> >>> E.g. the PDF to docx conversion I mentioned can probably be done with 
>> >>> Adobe Acrobat and MS Word, but they are proprietary and difficult to 
>> >>> run from the command line. 
>> >>> 
>> >>> In your case, since a tool has already converted them to HTML, the 
>> >>> HTML to ePub step may be done better by some other engine (since the 
>> >>> two formats are closely related). Maybe you can try Calibre, which 
>> >>> also has a CLI. 
>> >> 
>> >> 

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/60dc6b96-7284-47e3-bbb2-938857c61dd5%40googlegroups.com.


