* How do I use media bags?
From: Paul @ 2018-05-01 0:53 UTC
To: pandoc-discuss

Hello!

I just tried to use the Haskell API side of Pandoc, but I ran into lots of
"could not fetch resource" warnings. Is there anything that would help me
extract the media from the downloaded .docx files and embed them in my
epubs?
The (I believe) relevant parts of my code are here:
https://gist.github.com/MagnificentPako/22df4be40251a07a4f5d0dc4fafc5d46

(For some reason I had to re-implement Semigroup for both Pandoc and Meta
because my GHC would complain that they are missing it otherwise.)

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/9bd0cbe0-31dd-45bd-be19-6d4fcc49bbec%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

* Re: How do I use media bags?
From: John MacFarlane @ 2018-05-01 1:53 UTC
To: Paul, pandoc-discuss

Paul writes:

> Hello!
>
> I just tried to use the Haskell API side of Pandoc, but I ran into lots of
> "could not fetch resource" warnings. Is there anything that would help me
> extract the media from the downloaded .docx files and embed them in my
> epubs?
> The (I believe) relevant parts of my code are here:
> https://gist.github.com/MagnificentPako/22df4be40251a07a4f5d0dc4fafc5d46
>
> (For some reason I had to re-implement Semigroup for both Pandoc and Meta
> because my GHC would complain that they are missing it otherwise.)

Are you using the most recent version of pandoc-types?

* Re: How do I use media bags?
From: Paul @ 2018-05-01 12:16 UTC
To: pandoc-discuss

No, I'm using pandoc-types 1.17.3.1 and pandoc 2.0.6. I'm not sure if I can
change that though, unless it's possible to enforce newer versions in Nix.

On Tuesday, 1 May 2018 at 03:52:59 UTC+2, John MacFarlane wrote:
>
> Are you using the most recent version of pandoc-types?

* Re: How do I use media bags?
From: Ivan Lazar Miljenovic @ 2018-05-01 12:44 UTC
To: pandoc-discuss

On 1 May 2018 at 22:16, Paul wrote:
> No, I'm using pandoc-types 1.17.3.1 and pandoc 2.0.6. I'm not sure if I can
> change that though, unless it's possible to enforce newer versions in Nix.

NixOS 18.03 has pandoc-2.1.2 (but still pandoc-types 1.17.3.1).

If you use the unstable channel you have pandoc-types 1.17.4.2, albeit not
by default (as pandoc-types_1_17_4_2; there's pandoc_2_1_3 that *may* be
built using this version of pandoc-types).

You can also always use package overrides or overlays to add new
versions.

--
Ivan Lazar Miljenovic
http://IvanMiljenovic.wordpress.com

* Re: How do I use media bags?
From: Paul @ 2018-05-01 16:15 UTC
To: pandoc-discuss

I tried using an overlay in my shell.nix, but that doesn't do anything. I'm
still getting the old pandoc and pandoc-types :/

Unless I'm doing something wrong:
https://gist.github.com/MagnificentPako/e7a9bcd3bb9dd452087b90efc3d9861d

On Tuesday, 1 May 2018 at 14:44:58 UTC+2, Ivan Miljenovic wrote:
>
> NixOS 18.03 has pandoc-2.1.2 (but still pandoc-types 1.17.3.1).
>
> If you use the unstable channel you have pandoc-types 1.17.4.2, albeit not
> by default (as pandoc-types_1_17_4_2; there's pandoc_2_1_3 that *may* be
> built using this version of pandoc-types).
>
> You can also always use package overrides or overlays to add new
> versions.

* Re: How do I use media bags?
From: Ivan Lazar Miljenovic @ 2018-05-01 22:06 UTC
To: pandoc-discuss

On 2 May 2018 at 02:15, Paul wrote:
> I tried using an overlay in my shell.nix, but that doesn't do anything. I'm
> still getting the old pandoc and pandoc-types :/
>
> Unless I'm doing something wrong:
> https://gist.github.com/MagnificentPako/e7a9bcd3bb9dd452087b90efc3d9861d

AFAIK, overlays have to go in your ~/.config/nixpkgs configuration, not
in a shell.nix (I could be wrong; I'm still poking around trying to work
out how to do stuff with nixpkgs).

--
Ivan Lazar Miljenovic
http://IvanMiljenovic.wordpress.com

* Re: How do I use media bags?
From: John MacFarlane @ 2018-05-01 15:12 UTC
To: Paul, pandoc-discuss

The more recent pandoc-types should eliminate the need for
the orphan Semigroup instance.

You shouldn't need fillMediaBag if you're reading docx, because
the docx reader should populate the media bag automatically using
images from the docx container itself.

However, there's an issue with the way your code is structured.
Each time you call runIO, it reinitializes the media bag, so
you lose information. What you should do is put the whole
conversion pipeline (reading the docx files and writing the epubs)
under one runIO.

Paul writes:

> Hello!
>
> I just tried to use the Haskell API side of Pandoc, but I ran into lots of
> "could not fetch resource" warnings. Is there anything that would help me
> extract the media from the downloaded .docx files and embed them in my
> epubs?
> The (I believe) relevant parts of my code are here:
> https://gist.github.com/MagnificentPako/22df4be40251a07a4f5d0dc4fafc5d46
>
> (For some reason I had to re-implement Semigroup for both Pandoc and Meta
> because my GHC would complain that they are missing it otherwise.)

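The advice above about keeping everything under one runIO can be sketched roughly as follows. This is only an illustration, not Paul's actual code: the function name `convert` and the file names are made up for the example, and it assumes a pandoc-types recent enough that `Pandoc` already has its Semigroup/Monoid instances, so that `mconcat` works without orphan instances.

```haskell
import qualified Data.ByteString.Lazy as BL
import Text.Pandoc

-- Hypothetical helper: read several docx files and write one epub.
-- Reading and writing happen inside the same runIO call, so the media bag
-- filled by readDocx is still available when writeEPUB3 looks up images.
convert :: [FilePath] -> FilePath -> IO ()
convert inputs output = do
  blobs  <- mapM BL.readFile inputs
  result <- runIO $ do
    docs <- mapM (readDocx def) blobs   -- each read adds images to the media bag
    writeEPUB3 def (mconcat docs)       -- the writer fetches them from the same bag
  epub <- handleError result            -- report a PandocError and exit, or continue
  BL.writeFile output epub

main :: IO ()
main = convert ["first.docx", "second.docx"] "book.epub"  -- placeholder file names
```

Splitting this into one runIO for reading and another for writing would hand the writer an empty media bag, which is consistent with the "could not fetch resource" warnings described in the original question.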
* Re: How do I use media bags?
From: Paul @ 2018-05-01 15:53 UTC
To: pandoc-discuss

Thanks for pointing that out! I've now almost got it working. The epub is
being generated and it contains images, but in some places it uses the
wrong image (it isn't adding images at random, it's substituting a
different one). Are there collisions between the images? If so, is that
something one can fix?

My current code:
https://gist.github.com/MagnificentPako/b72c3347b3e6bc2474c26ee91b8384ae

On Tuesday, 1 May 2018 at 17:11:25 UTC+2, John MacFarlane wrote:
>
> The more recent pandoc-types should eliminate the need for
> the orphan Semigroup instance.
>
> You shouldn't need fillMediaBag if you're reading docx, because
> the docx reader should populate the media bag automatically using
> images from the docx container itself.
>
> However, there's an issue with the way your code is structured.
> Each time you call runIO, it reinitializes the media bag, so
> you lose information. What you should do is put the whole
> conversion pipeline (reading the docx files and writing the epubs)
> under one runIO.

* Re: How do I use media bags?
From: John MacFarlane @ 2018-05-01 17:42 UTC
To: Paul, pandoc-discuss

Paul writes:

> Thanks for pointing that out! I've now almost got it working. The epub is
> being generated and it contains images, but in some places it uses the
> wrong image (it isn't adding images at random, it's substituting a
> different one). Are there collisions between the images? If so, is that
> something one can fix?

The mediabag code was designed with the idea that you'd be
extracting media from one file and doing something with it.
Thus, insertMedia just uses Data.Map.insert with the normalized
file path. If two docx files have different images with the
same filename, and you read both of them in, you'll get a
collision in the media bag and unwanted results.

Perhaps this is something that could be improved. For example,
insertMedia could be modified to return a filename, perhaps
modified if there's already an item with that name. This would
require some modifications across several readers.
@jkr @mpickering @tarleb - any thoughts?

In the meantime, your best bet would be to postprocess the
Pandoc structure after each docx is read using a filter,
changing the image names systematically in both the Image
elements and the MediaBag, to avoid collisions.

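One way the suggested postprocessing could look is sketched below. It is an untested illustration under several assumptions: it targets the pandoc 3.x API (`mediaItems`, Text-based image targets), which differs from the pandoc 2.0/2.1 versions discussed in this thread; the names `readDocxPrefixed` and `retarget`, the `docN/` prefix scheme, and the file names are all invented for the example.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Maybe (isJust)
import qualified Data.Text as T
import Text.Pandoc
import Text.Pandoc.MediaBag (MediaBag, insertMedia, lookupMedia, mediaItems)
import Text.Pandoc.Walk (walk)

-- Hypothetical helper: read one docx, then move its images under a
-- per-document prefix in both the media bag and the Image elements.
readDocxPrefixed :: PandocMonad m => FilePath -> BL.ByteString -> m Pandoc
readDocxPrefixed prefix bytes = do
  saved <- getMediaBag             -- media collected from earlier documents
  setMediaBag mempty               -- start empty so only this docx's images show up
  doc   <- readDocx def bytes
  fresh <- getMediaBag             -- exactly the images of this docx
  let rename path = prefix ++ "/" ++ path
      merged = foldr (\(p, mime, bs) -> insertMedia (rename p) (Just mime) bs)
                     saved (mediaItems fresh)
  setMediaBag merged               -- earlier media plus the renamed new media
  pure (walk (retarget fresh rename) doc)

-- Rewrite only image targets that refer to an entry of the given media bag;
-- remote URLs and other targets are left untouched.
retarget :: MediaBag -> (FilePath -> FilePath) -> Inline -> Inline
retarget bag rename (Image attr alt (src, title))
  | isJust (lookupMedia (T.unpack src) bag)
  = Image attr alt (T.pack (rename (T.unpack src)), title)
retarget _ _ inline = inline

main :: IO ()
main = do
  blobs <- mapM BL.readFile ["first.docx", "second.docx"]  -- placeholder names
  epub  <- runIOorExplode $ do
    docs <- sequence [ readDocxPrefixed ("doc" ++ show i) blob
                     | (i, blob) <- zip [1 :: Int ..] blobs ]
    writeEPUB3 def (mconcat docs)
  BL.writeFile "combined.epub" epub
```

Resetting the media bag before each read is a design choice made for this sketch: it makes it easy to tell which entries belong to the document just read, so two files that both contain an image with the same path end up under different prefixes instead of overwriting each other.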
end of thread, newest message: 2018-05-01 22:06 UTC

Thread overview: 9+ messages
2018-05-01  0:53 How do I use media bags? Paul
2018-05-01  1:53 ` John MacFarlane
2018-05-01 12:16   ` Paul
2018-05-01 12:44     ` Ivan Lazar Miljenovic
2018-05-01 16:15       ` Paul
2018-05-01 22:06         ` Ivan Lazar Miljenovic
2018-05-01 15:12 ` John MacFarlane
2018-05-01 15:53   ` Paul
2018-05-01 17:42     ` John MacFarlane