public inbox archive for pandoc-discuss@googlegroups.com
* How do I use media bags?
@ 2018-05-01  0:53 Paul
       [not found] ` <9bd0cbe0-31dd-45bd-be19-6d4fcc49bbec-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Paul @ 2018-05-01  0:53 UTC (permalink / raw)
  To: pandoc-discuss



Hello!

I just tried to use the Haskell API side of Pandoc, but I ran into lots of 
"could not fetch resource" warnings. Is there anything that would help me 
extract the media from the downloaded .docx files and embed them in my 
epubs?
The (I believe) relevant parts of my code are here: 
https://gist.github.com/MagnificentPako/22df4be40251a07a4f5d0dc4fafc5d46

(For some reason I had to re-implement Semigroup for both Pandoc and Meta 
because my GHC would complain that they are missing it otherwise.)

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/9bd0cbe0-31dd-45bd-be19-6d4fcc49bbec%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: How do I use media bags?
       [not found] ` <9bd0cbe0-31dd-45bd-be19-6d4fcc49bbec-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2018-05-01  1:53   ` John MacFarlane
       [not found]     ` <m236zc87lb.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
  2018-05-01 15:12   ` John MacFarlane
  1 sibling, 1 reply; 9+ messages in thread
From: John MacFarlane @ 2018-05-01  1:53 UTC (permalink / raw)
  To: Paul, pandoc-discuss

Paul <freack1208-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Hello!
>
> I just tried to use the Haskell API side of Pandoc, but I ran into lots of 
> "could not fetch resource" warnings. Is there anything that would help me 
> extract the media from the downloaded .docx files and embed them in my 
> epubs?
> The (I believe) relevant parts of my code are here: 
> https://gist.github.com/MagnificentPako/22df4be40251a07a4f5d0dc4fafc5d46
>
> (For some reason I had to re-implement Semigroup for both Pandoc and Meta 
> because my GHC would complain that they are missing it otherwise.)

Are you using the most recent version of pandoc-types?



* Re: How do I use media bags?
       [not found]     ` <m236zc87lb.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
@ 2018-05-01 12:16       ` Paul
       [not found]         ` <b1d8abc3-8255-40ff-a1ba-075dceace535-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Paul @ 2018-05-01 12:16 UTC (permalink / raw)
  To: pandoc-discuss



No, I'm using pandoc-types 1.17.3.1 and pandoc 2.0.6. I'm not sure if I can 
change that though, unless it's possible to enforce newer versions in Nix.


* Re: How do I use media bags?
       [not found]         ` <b1d8abc3-8255-40ff-a1ba-075dceace535-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2018-05-01 12:44           ` Ivan Lazar Miljenovic
       [not found]             ` <CA+u6gbxA-M2c-jz7AhWbCg_FgUPDdXOG1-mqz8kDQk20Oh0-Lg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Ivan Lazar Miljenovic @ 2018-05-01 12:44 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 1 May 2018 at 22:16, Paul <freack1208-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> No, I'm using pandoc-types 1.17.3.1 and pandoc 2.0.6. I'm not sure if I can
> change that though, unless it's possible to enforce newer versions in Nix.

NixOS 18.03 has pandoc-2.1.2 (but still pandoc-types 1.17.3.1).

The unstable channel also provides pandoc-types 1.17.4.2, though not
as the default (it is exposed as pandoc-types_1_17_4_2; there is also
pandoc_2_1_3, which *may* be built against this version of pandoc-types).

You can also always use package overrides or overlays to add newer
versions.

-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
http://IvanMiljenovic.wordpress.com



* Re: How do I use media bags?
       [not found] ` <9bd0cbe0-31dd-45bd-be19-6d4fcc49bbec-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2018-05-01  1:53   ` John MacFarlane
@ 2018-05-01 15:12   ` John MacFarlane
       [not found]     ` <m2r2mv76ln.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
  1 sibling, 1 reply; 9+ messages in thread
From: John MacFarlane @ 2018-05-01 15:12 UTC (permalink / raw)
  To: Paul, pandoc-discuss


The more recent pandoc-types should eliminate the need for
the orphan Semigroup instance.

You shouldn't need fillMediaBag if you're reading docx, because
the docx reader should populate the media bag automatically using
images from the docx container itself.

However, there's an issue with the way your code is structured.
Each time you call runIO, it reinitializes the media bag, so
you lose the media collected so far.  What you should do is put the
whole conversion pipeline (reading the docx files and writing the
epubs) under one runIO.
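
A minimal sketch of that structure, assuming the pandoc 2.x API
(readDocx, writeEPUB3, runIO and handleError as exported from
Text.Pandoc). The file names are hypothetical, and the reader and
writer options are just the defaults; this is not the code from the
gist.

```haskell
import qualified Data.ByteString.Lazy as BL
import Text.Pandoc

-- Hypothetical input files, for illustration only.
inputs :: [FilePath]
inputs = ["chapter1.docx", "chapter2.docx"]

main :: IO ()
main = do
  docxFiles <- mapM BL.readFile inputs
  -- One runIO around the whole pipeline: the media bag filled by
  -- readDocx is still part of the state when writeEPUB3 runs.
  result <- runIO $ do
    docs <- mapM (readDocx def) docxFiles
    writeEPUB3 def (mconcat docs)
  -- handleError exits with a message on Left, returns the value on Right.
  epub <- handleError result
  BL.writeFile "book.epub" epub
```

The point of the design is that reading and writing share one
PandocIO computation, so nothing gathered by the reader is thrown
away before the writer needs it.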


* Re: How do I use media bags?
       [not found]     ` <m2r2mv76ln.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
@ 2018-05-01 15:53       ` Paul
       [not found]         ` <8bd93e89-5f06-4c39-9e68-1191cbe6df60-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Paul @ 2018-05-01 15:53 UTC (permalink / raw)
  To: pandoc-discuss



Thanks for pointing that out! It almost works now. The epub is 
generated and it contains images, but some images show up in the wrong 
places (no extra images are added, but the wrong image is used). 
Are there collisions between the images? If so, is that 
something one can fix?

My current code: 
https://gist.github.com/MagnificentPako/b72c3347b3e6bc2474c26ee91b8384ae


* Re: How do I use media bags?
       [not found]             ` <CA+u6gbxA-M2c-jz7AhWbCg_FgUPDdXOG1-mqz8kDQk20Oh0-Lg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-05-01 16:15               ` Paul
       [not found]                 ` <2408d89d-5ad5-430d-8b01-7c24e2167773-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Paul @ 2018-05-01 16:15 UTC (permalink / raw)
  To: pandoc-discuss



I tried using an overlay in my shell.nix but that doesn't do anything. I'm 
still getting the old pandoc and pandoc-types :/ 

Unless I'm doing something wrong: 
https://gist.github.com/MagnificentPako/e7a9bcd3bb9dd452087b90efc3d9861d


* Re: How do I use media bags?
       [not found]         ` <8bd93e89-5f06-4c39-9e68-1191cbe6df60-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2018-05-01 17:42           ` John MacFarlane
  0 siblings, 0 replies; 9+ messages in thread
From: John MacFarlane @ 2018-05-01 17:42 UTC (permalink / raw)
  To: Paul, pandoc-discuss

Paul <freack1208-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Thanks for pointing that out! Now I got it to almost work. The epub is 
> being generated and it contains images, but it seems to like using images 
> where they shouldn't be (not like randomly adding images, but using the 
> wrong image). Are there collisions between the images? If so, is that 
> something one can fix?

The mediabag code was designed with the idea that you'd be extracting
media from one file and doing something with it.  Thus, insertMedia
just uses Data.Map.insert with the normalized file path. If two docx
files have different images with the same filename, and you read
both of them in, you'll get a collision in the media bag
and unwanted results.
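
The collision can be reproduced in a simplified model. The sketch
below is not pandoc's MediaBag; it only assumes, as described above,
that entries are stored with Data.Map.insert under their file path.
The Bag type, the file names and the contents are made up for
illustration.

```haskell
import qualified Data.Map as Map

-- A simplified stand-in for the media bag: file path -> contents.
type Bag = Map.Map FilePath String

-- Images from two hypothetical docx files; both name their first
-- image "media/image1.png".
fromFirst, fromSecond :: [(FilePath, String)]
fromFirst  = [("media/image1.png", "cover of book A")]
fromSecond = [("media/image1.png", "cover of book B")]

-- Insert every entry into the bag, as a reader would while parsing.
addAll :: [(FilePath, String)] -> Bag -> Bag
addAll entries bag = foldr (uncurry Map.insert) bag entries

-- Both files are read into the same bag, first A and then B.
shared :: Bag
shared = addAll fromSecond (addAll fromFirst Map.empty)

main :: IO ()
main = do
  print (Map.size shared)                       -- prints 1
  print (Map.lookup "media/image1.png" shared)  -- prints Just "cover of book B"
```

Only one entry survives, so every reference to media/image1.png,
including the one from the first file, resolves to the second
file's image.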

Perhaps this is something that could be improved.  For example,
insertMedia could be modified to return a filename, perhaps
modified if there's already an item with that name.  This would
require some modifications across several readers.
@jkr @mpickering @tarleb - any thoughts?

In the meantime, your best bet would be to postprocess the Pandoc
structure after each docx is read using a filter, changing the
image names systematically in both the Image elements and the
MediaBag, to avoid collisions.
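
The renaming idea can be sketched in a simplified model. The types
below are stand-ins, not pandoc's: a document is reduced to the list
of image paths it references, and the media bag to a Data.Map. The
names Doc, retag and merge and the "doc1"/"doc2" tags are
hypothetical. With the real API the same renaming would have to be
applied to the Image elements and to the MediaBag entries, and the
exact calls depend on the pandoc version.

```haskell
import qualified Data.Map as Map

-- Simplified stand-ins, not pandoc's types.
type Bag = Map.Map FilePath String

data Doc = Doc { images :: [FilePath], media :: Bag }
  deriving (Eq, Show)

-- Prefix a path with a per-document tag.
rename :: String -> FilePath -> FilePath
rename tag path = tag ++ "/" ++ path

-- Apply the same renaming to the image references and to the bag
-- keys, so every reference still points at its own image.
retag :: String -> Doc -> Doc
retag tag (Doc imgs bag) =
  Doc (map (rename tag) imgs) (Map.mapKeys (rename tag) bag)

-- Merge two documents: concatenate references, union the bags.
merge :: Doc -> Doc -> Doc
merge (Doc i1 b1) (Doc i2 b2) = Doc (i1 ++ i2) (Map.union b1 b2)

docA, docB :: Doc
docA = Doc ["media/image1.png"]
           (Map.fromList [("media/image1.png", "cover of book A")])
docB = Doc ["media/image1.png"]
           (Map.fromList [("media/image1.png", "cover of book B")])

-- Retag each document before merging, so the keys no longer clash.
merged :: Doc
merged = merge (retag "doc1" docA) (retag "doc2" docB)

main :: IO ()
main = mapM_ (\p -> print (p, Map.lookup p (media merged))) (images merged)
```

Because each document gets its own prefix before the bags are
combined, both images survive and each reference resolves to the
image from its own file.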



* Re: How do I use media bags?
       [not found]                 ` <2408d89d-5ad5-430d-8b01-7c24e2167773-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2018-05-01 22:06                   ` Ivan Lazar Miljenovic
  0 siblings, 0 replies; 9+ messages in thread
From: Ivan Lazar Miljenovic @ 2018-05-01 22:06 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 2 May 2018 at 02:15, Paul <freack1208-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> I tried using an overlay in my shell.nix but that won't do anything. I'm
> still getting the old pandoc and pandoc-types :/
>
> Unless I'm doing something wrong:
> https://gist.github.com/MagnificentPako/e7a9bcd3bb9dd452087b90efc3d9861d

AFAIK, overlays have to go in your ~/.config/nixpkgs configuration,
not a shell.nix (I could be wrong, I'm still poking around trying to
work out how to do stuff with nixpkgs).

