From: Heck Lennon
Newsgroups: gmane.text.pandoc
To: pandoc-discuss
Subject: Re: HTML → EPUB: Either "Out of memory" or "openBinaryFile: invalid argument (Invalid argument)"
Date: Wed, 22 Apr 2020 14:59:38 -0700 (PDT)
Message-ID: <026f695e-0849-4c01-969b-0c2ccbeb31b9@googlegroups.com>

pandoc 2.5.2 on Ubuntu 19.10.

Turns out I had to use "-t epub" instead of "-t epub3":

pandoc -f html -t epub -o output.epub input.html

Thank you.
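
For anyone hitting the same "can't open it, even as a ZIP" symptom: an EPUB is a ZIP container, so a quick first check is whether the output file starts with the ZIP local-file-header signature. The following is only a minimal sketch, not a full validity check; `is_zip` is a hypothetical helper name, and it assumes `head`, `od`, and `tr` are available. An archive that passes can still be broken further in.

```shell
# Hypothetical helper (not part of pandoc): succeeds if the file begins
# with the ZIP local-file-header signature, bytes 50 4b 03 04 ("PK\003\004").
is_zip() {
  # Read the first 4 bytes and render them as a plain hex string.
  sig=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$sig" = "504b0304" ]
}
```

Usage, after the pandoc run above: `is_zip output.epub && echo "starts like a ZIP"`.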

On Wednesday, 22 April 2020 at 17:58:39 UTC+2, John MacFarlane wrote:

What pandoc version are you running on the linux box?
This works fine for me.


Heck Lennon <frdt...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Since I had a Linux host available, I went around that issue with Windows
> and shell expansion.
>
> pandoc -f html -t epub3 -o output.epub input.html
>
>
> pandoc ran successfully (no error message), but the EPUB can't be opened in
> a Windows GUI application that supports EPUB files ("Error loading
> file.epub"). Likewise, I can't open the file after changing its extension
> from EPUB to ZIP.
>
> Here are the input files (HTML + PNGs):
>
> https://we.tl/t-5EeGXML1rb
>
> Do I need extra options in the command line?
>
> On Wednesday, 22 April 2020 at 11:55:49 UTC+2, Heck Lennon wrote:
>>
>> Thanks, everyone, for the info!
>>
>> On Wednesday, 22 April 2020 at 01:25:21 UTC+2, Kolen Cheung wrote:
>>>
>>> A side note: since your goal is to convert from PDF to ePub, you will probably
>>> get better results using other tools. E.g., I know it can be converted
>>> to docx, and then from docx to ePub. There may be a tool that can help you
>>> convert that directly, too. Essentially, you’d want
>>> to choose a tool that preserves the most information. And since pandoc focuses mainly on
>>> the structure of the document, much other information would be lost. The
>>> choice of tool also depends on which ones you’re comfortable with; e.g., the
>>> PDF-to-docx conversion I mentioned can probably be done by Adobe Acrobat and MS Word.
>>> But they are proprietary and difficult to run from the command line.
>>>
>>> In your case, since you have a tool that has already preconverted them to HTML,
>>> HTML to ePub can be done better by some other engines (since the two are
>>> closely related). Maybe you can try Calibre, which also has a CLI.
>>
>>
>
> --
> You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/b3218bbb-9846-4e52-b201-7e4a1b8b09d6%40googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/026f695e-0849-4c01-969b-0c2ccbeb31b9%40googlegroups.com.