* Read + convert (content from) Google Doc with public URL?
From: Martin Post @ 2020-09-22 14:51 UTC
To: pandoc-discuss

I may be barking up the wrong tree here, but I am sure that it will tell me so. :)

Pandoc can convert HTML documents from public URLs.

(pandoc -f html -t markdown https://www.fsf.org -o result.md)

Is it possible (without too many extra steps/tools…) to have Pandoc read and convert the content of a Google Docs document with a public (viewing) URL?

I.e., if I have a Google doc containing only a few lines of Markdown, could Pandoc read and parse these as such, ignoring the document's styles? I tried and only got a zsh shell error (“no matches found”).

I assume this would require getting the text-only content from such a doc using Google’s APIs, but I’m no coder and wouldn’t know where to start here.

Thank you.

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/c20c4344-fd78-4d47-8700-7573f159df2bn%40googlegroups.com.
* Re: Read + convert (content from) Google Doc with public URL?
From: Joseph Reagle @ 2020-09-22 16:06 UTC
To: pandoc-discuss

On 9/22/20 10:51 AM, Martin Post wrote:
> Is it possible (without too many extra steps/tools…) to have Pandoc read and convert the content of a Google Docs document with a public (viewing) URL?

Not that I know of. I think you'd want to download it as a docx and then convert from there.

> I.e., if I have a Google doc containing only a few lines of Markdown, could Pandoc read and parse these as such, ignoring the document's styles? I tried and only got a zsh shell error (“no matches found”).

Ignoring GDocs styles and simply working on raw text (which happens to be markdown) is a different question, and probably addressed by something like this:

```
lynx -dump https://docs.google.com/document/... | pandoc
```
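The raw-text idea above can also be sketched without a text browser. The following is a minimal sketch under an assumption that is not confirmed anywhere in this thread: that a publicly shared Google Doc can be fetched as plain text from an `export?format=txt` URL built from its document ID. The helper names `gdoc_export_url` and `gdoc_to_html` are hypothetical, and `curl` and `pandoc` are assumed to be installed.

```shell
#!/bin/sh
# Sketch only: the export endpoint is an assumption, not something
# confirmed in this thread.

# gdoc_export_url (hypothetical helper): turn a Google Docs sharing URL
# into a plain-text export URL. Returns 1 if no document ID is found.
gdoc_export_url() {
    id=$(printf '%s\n' "$1" |
        sed -n 's#^https://docs\.google\.com/document/d/\([^/?]*\).*#\1#p')
    [ -n "$id" ] || return 1
    printf 'https://docs.google.com/document/d/%s/export?format=txt\n' "$id"
}

# gdoc_to_html (hypothetical helper): fetch the text and let Pandoc parse
# it as Markdown, writing HTML to standard output.
gdoc_to_html() {
    url=$(gdoc_export_url "$1") || return 1
    curl -sL "$url" | pandoc -f markdown -t html
}
```

A call might look like `gdoc_to_html 'https://docs.google.com/document/d/DOC_ID/edit?usp=sharing' > result.html`, where `DOC_ID` is a placeholder. The single quotes matter in zsh: an unquoted `?` in a URL is treated as a filename pattern, and when nothing matches, zsh stops with “no matches found” before the command ever runs.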
* Re: Read + convert (content from) Google Doc with public URL?
From: BPJ @ 2020-09-23 18:21 UTC
To: pandoc-discuss

You have to download it, but you may have better luck by downloading as HTML (which gives you a zip file which you must unpack) than as DOCX since much formatting is lost going from DOCX.

--
Better --help|less than helpless
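The download-as-HTML route described above could be scripted roughly as follows. This is only a sketch: the helper names `first_html` and `gdoc_zip_to_md` are hypothetical, the zip file is assumed to have been downloaded by hand, and `unzip` and `pandoc` are assumed to be installed.

```shell
#!/bin/sh
# Sketch only: helper and file names are hypothetical.

# first_html (hypothetical helper): print the path of the first .html
# file found below a directory (sorted by name).
first_html() {
    find "$1" -type f -name '*.html' | sort | head -n 1
}

# gdoc_zip_to_md (hypothetical helper): unpack a zipped HTML download
# into a temporary directory and convert the HTML file to Markdown.
gdoc_zip_to_md() {
    zip_file=$1
    out_file=$2
    work_dir=$(mktemp -d) || return 1
    unzip -q -o "$zip_file" -d "$work_dir" || return 1
    html_file=$(first_html "$work_dir")
    [ -n "$html_file" ] || return 1
    pandoc -f html -t markdown "$html_file" -o "$out_file"
}
```

Usage would be along the lines of `gdoc_zip_to_md MyDoc.zip MyDoc.md`. The temporary directory is left in place so that any images unpacked next to the HTML file remain available; remove it by hand when it is no longer needed.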
* Re: Read + convert (content from) Google Doc with public URL?
From: Martin Post @ 2020-09-24 8:07 UTC
To: pandoc-discuss

Well, I’m looking for something that will allow authors to collaborate on a text over the web using basic Markdown and let me pick up the result for conversion in Pandoc, preferably without physically downloading single files. Basically a poor man’s single source publishing solution. Google Docs has comments and versions, and many people know it – but there may be better, more simple alternatives, e.g. Etherpad-style editors.

It also seems that it’s possible to extract plain text from a Google Doc for further processing:
https://developers.google.com/docs/api/samples/extract-text
* Re: Read + convert (content from) Google Doc with public URL?
From: Albert Krewinkel @ 2020-09-24 9:55 UTC
To: pandoc-discuss

Martin Post writes:

> Well, I’m looking for something that will allow authors to collaborate on a
> text over the web using basic Markdown and let me pick up the result for
> conversion in Pandoc, preferably without physically downloading single
> files. Basically a poor man’s single source publishing solution. Google
> Docs has comments and versions, and many people know it – but there may be
> better, more simple alternatives, e.g. Etherpad-style editors.

I like to use [HackMD] for collaboration. There is also an open source version called [CodiMD].

More complex to set up, but also really nice: GitLab with CI pipelines for automatic doc generation, combined with [Netlify CMS] for easy metadata handling.

[HackMD]: https://hackmd.io
[CodiMD]: https://github.com/hackmdio/codimd
[Netlify CMS]: https://www.netlifycms.org/

--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124
* Re: Read + convert (content from) Google Doc with public URL?
From: mb21 @ 2020-09-24 14:15 UTC
To: pandoc-discuss

there is also https://stackedit.io ...
* Re: Read + convert (content from) Google Doc with public URL?
From: David Denton @ 2020-09-24 14:35 UTC
To: pandoc-discuss

HackMD is very good.

David
* Re: Read + convert (content from) Google Doc with public URL?
From: BPJ @ 2020-09-24 18:38 UTC
To: pandoc-discuss

I found this blog post. Hopefully the info is not outdated.

https://www.techrepublic.com/google-amp/article/how-to-collaborate-with-markdown-in-google-docs-and-google-drive/

https://v.gd/UbmlUY

You could probably cobble together a script which converts DOCX <-> Markdown with Pandoc and uses the Google Docs API to up-/download. I found this:

https://developers.google.com/docs/api

https://developers.google.com/docs/api/quickstart/python

--
Better --help|less than helpless
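The local half of such a script, the DOCX <-> Markdown conversion with Pandoc, might be sketched like this. It deliberately leaves out the Google Docs API upload/download step, which needs credentials and is not shown here. The helper names `swap_ext`, `docx_to_md` and `md_to_docx` are hypothetical, and `pandoc` is assumed to be installed.

```shell
#!/bin/sh
# Sketch only: covers the local Pandoc conversions, not the Google Docs
# API transfer. Helper names are hypothetical.

# swap_ext (hypothetical helper): replace a file's extension, e.g.
# swap_ext draft.docx md -> draft.md
swap_ext() {
    printf '%s.%s\n' "${1%.*}" "$2"
}

# docx_to_md: convert a downloaded DOCX file to Markdown next to it.
docx_to_md() {
    pandoc -f docx -t markdown "$1" -o "$(swap_ext "$1" md)"
}

# md_to_docx: convert edited Markdown back to DOCX, e.g. for re-upload.
md_to_docx() {
    pandoc -f markdown -t docx "$1" -o "$(swap_ext "$1" docx)"
}
```

With these defined, `docx_to_md draft.docx` would write `draft.md`, and `md_to_docx draft.md` would write `draft.docx` again. Note that `swap_ext` assumes the file name itself contains a dot; a dot in a directory name with an extensionless file would confuse it.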
* Re: Read + convert (content from) Google Doc with public URL?
From: Martin Post @ 2020-10-02 13:42 UTC
To: pandoc-discuss

Sorry for not contributing to this thread (which I started) for a week; I was a bit under the weather.

Great and helpful contributions – thank you, everyone!

So, in summary:

- Direct “Pandoc scraping” from a Google Doc URL doesn’t work.
- I installed and tried Lynx with -dump, but again, no dice. The problem seems to be that Lynx “sees” all the editor JS code and stumbles before parsing the visible content.
- HackMD is great. I had bookmarked this ages ago and forgotten about it. Will look into it again as an alternative to the Google Docs workflow. @ Albert: I may get back to you about the suggested Netlify workflow.
- StackEdit - not quite what I was looking for, but obviously an excellent (web) app.
- @ BP: Great catch; this article and the two linked Google Docs add-ons are what I was looking for. It’s not fully “automatic” (Pandoc reading & converting Google Docs content directly), but close enough. Contributors can use comments, versioning etc., and I can still get Markdown-formatted content straight from the browser without downloading.

Thanks again.