public inbox archive for pandoc-discuss@googlegroups.com
From: John MacFarlane <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org>
To: Heck Lennon <frdtheman-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	pandoc-discuss
	<pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: HTML → EPUB: Either "Out of memory" or "openBinaryFile: invalid argument (Invalid argument)"
Date: Tue, 21 Apr 2020 11:21:22 -0700
Message-ID: <m2368wog2l.fsf@johnmacfarlane.net>
In-Reply-To: <f11a136c-0f32-4a59-b7cf-4aab865e1d68-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>


Yes, pandoc can use stdin (see the manual), but only when
files aren't explicitly specified on the command line.  You
can't use stdin AND name input files, as in your batch file.
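
As a rough illustration of the stdin route, here is a minimal Python
sketch, not a tested recipe. It assumes the HTML files are UTF-8 and that
joining them with blank lines is acceptable for this conversion. The helper
names (read_concatenated, pandoc_stdin_command, convert_via_stdin) are
made up for this example; only the -f, -t and -o options come from the
commands quoted below.

```python
import subprocess


def read_concatenated(paths, encoding="utf-8"):
    """Return the contents of all files, joined with blank lines."""
    parts = []
    for path in paths:
        with open(path, encoding=encoding) as handle:
            parts.append(handle.read())
    return "\n\n".join(parts)


def pandoc_stdin_command(output):
    """Build a pandoc command with NO input file names, so pandoc reads stdin."""
    return ["pandoc", "-f", "html", "-t", "epub3", "-o", output]


def convert_via_stdin(paths, output="output.epub"):
    """Feed the concatenated HTML to pandoc on stdin (assumes pandoc is on PATH)."""
    html = read_concatenated(paths)
    subprocess.run(pandoc_stdin_command(output),
                   input=html.encode("utf-8"), check=True)
```

The key point is that the output name goes after -o and no input files
appear on the command line; the document itself arrives on stdin.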

I have no idea why you wouldn't be getting shell expansion
of *.html; maybe someone who uses Windows could comment.
Are there perhaps special characters or spaces in your
.html file names?  Try

pandoc "*.html"
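
For context: cmd.exe on Windows generally passes wildcards such as *.html
to the program unexpanded, which would be consistent with pandoc seeing the
literal "*.html". One possible workaround is to expand the pattern before
calling pandoc. The following Python sketch does that; the helper names
(build_pandoc_command, convert) are hypothetical, and it assumes pandoc is
on PATH.

```python
import glob
import subprocess


def build_pandoc_command(pattern, output):
    """Expand the wildcard ourselves and return the pandoc argument list."""
    files = sorted(glob.glob(pattern))  # lexicographic order
    if not files:
        raise FileNotFoundError(f"no files match {pattern!r}")
    return ["pandoc", "-o", output] + files


def convert(pattern="*.html", output="output.epub"):
    """Run pandoc on every file matching the pattern."""
    subprocess.run(build_pandoc_command(pattern, output), check=True)
```

Note that sorted() orders names lexicographically, so page10.html would
come before page2.html unless the page numbers are zero-padded.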

Heck Lennon <frdtheman-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Per this thread…
> https://groups.google.com/d/msg/pandoc-discuss/eMfCGU3Gn8E/bEhyLpUYBAAJ
> … I named the batch file pandoc.cmd, and re-ran the command thusly:
>
> echo output.epub | pandoc *.html -
>
> It runs for a few minutes, and ends with displaying some HTML… but no .epub 
> can be found.
>
> I assume I'm not using the command correctly. Can pandoc use the standard 
> input?
>
>
> On Tuesday, 21 April 2020 at 12:10:30 UTC+2, Heck Lennon wrote:
>>
>> It's Windows (7, 32 bits) and pandoc 2.9.2.1.
>>
>> On Tuesday, 21 April 2020 at 07:40:45 UTC+2, John MacFarlane wrote:
>>>
>>>
>>> That's extremely strange.  Your shell should be expanding the * 
>>> in *.html before it even gets to pandoc.  So if pandoc can see 
>>> the *, your shell hasn't done what it's supposed to. 
>>>
>>> What OS are you using, and what version of pandoc? 
>>>
>>> Heck Lennon <frdt...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes: 
>>>
>>> > Hello 
>>> > 
>>> > 
>>> > On Windows (7, 32 bits), I'm trying to convert a ~450 page PDF into EPUB.
>>> > 
>>> > 
>>> > 1. I used "mutool draw" to convert the PDF into a single, ~10 MB HTML file, then ran:
>>> > 
>>> > 
>>> > pandoc -f html -t epub3 -o output.epub input.html
>>> > 
>>> > (~10 min wait on my sluggish computer)
>>> > 
>>> > Result: "Out of memory".
>>> > 
>>> > 
>>> > 2. Next, I reran "mutool draw" to convert the PDF with one HTML file per page:
>>> > 
>>> > 
>>> > pandoc -o output.epub *.html
>>> > 
>>> > pandoc: *.html: openBinaryFile: invalid argument (Invalid argument)
>>> > 
>>> > 
>>> > 3. Finally, I tried using pandoc to concatenate all the HTML files, but still got an
>>> > "openBinaryFile: invalid argument (Invalid argument)" error:
>>> > 
>>> > 
>>> > pandoc *.html > full.html 
>>> > 
>>> > pandoc: *.html: openBinaryFile: invalid argument (Invalid argument) 
>>> > 
>>> > 
>>> > What do you suggest I try? 
>>> > 
>>> > 
>>> > Thank you. 
>>> > 
>>> > -- 
>>> > You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
>>> > To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-...-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>>> > To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/cfd086c1-9fe5-41bd-b735-3cd8db7579d9%40googlegroups.com.
>>>
>>>
>>
>



