* Read + convert (content from) Google Doc with public URL?
From: Martin Post @ 2020-09-22 14:51 UTC (permalink / raw)
To: pandoc-discuss
I may be barking up the wrong tree here, but I am sure that it will tell me
so. :)
Pandoc can convert HTML documents from public URLs. (pandoc -f html -t
markdown https://www.fsf.org -o result.md)
Is it possible (without too many extra steps/tools…) to have Pandoc read
and convert the content of a Google Docs document with a public (viewing)
URL?
I.e., if I have a Google Doc containing only a few lines of Markdown, could
Pandoc read and parse these as such, ignoring the document’s styles? I tried
and only got a zsh shell error (“no matches found”).
I assume this would require getting the text-only content from such a doc
using Google’s APIs, but I’m no coder and wouldn’t know where to start here.
Thank you.
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/c20c4344-fd78-4d47-8700-7573f159df2bn%40googlegroups.com.
* Re: Read + convert (content from) Google Doc with public URL?
From: Joseph Reagle @ 2020-09-22 16:06 UTC (permalink / raw)
To: pandoc-discuss
On 9/22/20 10:51 AM, Martin Post wrote:
> Is it possible (without too many extra steps/tools…) to have Pandoc read and convert the content of a Google Docs document with a public (viewing) URL?
Not that I know of. I think you'd want to download it as a docx and then convert from there.
> I.e., if I have a Google Doc containing only a few lines of Markdown, could Pandoc read and parse these as such, ignoring the document’s styles? I tried and only got a zsh shell error (“no matches found”).
Ignoring GDocs styles and simply working on the raw text (which happens to be Markdown) is a different question, and is probably addressed by something like this:
```
lynx -dump https://docs.google.com/document/... | pandoc
```
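A minimal sketch of the same raw-text route in Python, assuming the document is shared so that anyone with the link can view it and that Google's `export?format=txt` URL pattern works for it (an assumption, not something confirmed in this thread). The helper names `export_url` and `convert_public_doc` are hypothetical. Passing the arguments to Pandoc as a list also sidesteps the shell: in zsh, an unquoted `?` in a URL is treated as a glob pattern, which is the usual cause of “no matches found”.

```python
import subprocess
import urllib.request


def export_url(doc_id: str, fmt: str = "txt") -> str:
    """Build the export URL for a Google Doc ID (assumed URL pattern)."""
    return f"https://docs.google.com/document/d/{doc_id}/export?format={fmt}"


def convert_public_doc(doc_id: str, to: str = "html") -> str:
    """Fetch the doc as plain text and let Pandoc parse it as Markdown."""
    with urllib.request.urlopen(export_url(doc_id)) as response:
        # "utf-8-sig" drops a leading byte-order mark if the export has one.
        markdown = response.read().decode("utf-8-sig")
    result = subprocess.run(
        ["pandoc", "-f", "markdown", "-t", to],
        input=markdown,
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout
```

Calling `convert_public_doc("YOUR_DOC_ID")` needs network access and a `pandoc` binary on the PATH; `YOUR_DOC_ID` stands for the identifier between `/d/` and `/edit` in the document URL.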
* Re: Read + convert (content from) Google Doc with public URL?
From: BPJ @ 2020-09-23 18:21 UTC (permalink / raw)
To: pandoc-discuss
You have to download it, but you may have better luck downloading it as
HTML (which gives you a zip file that you must unpack) than as DOCX, since
much formatting is lost when converting from DOCX.
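The unpacking step can be scripted. A minimal sketch, assuming the zip file downloaded from Google Docs contains one `.html` file (an assumption about the archive layout); `html_from_zip` and `zip_to_markdown` are hypothetical helper names:

```python
import io
import subprocess
import zipfile


def html_from_zip(zip_bytes: bytes) -> str:
    """Return the text of the first .html member of a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".html"):
                return archive.read(name).decode("utf-8")
    raise ValueError("no .html file found in archive")


def zip_to_markdown(zip_path: str) -> str:
    """Convert the HTML inside a downloaded zip to Markdown with Pandoc."""
    with open(zip_path, "rb") as handle:
        html = html_from_zip(handle.read())
    result = subprocess.run(
        ["pandoc", "-f", "html", "-t", "markdown"],
        input=html,
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout
```

This sketch only carries over the text; images that sit next to the HTML file in the archive are not extracted.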
--
Better --help|less than helpless
* Re: Read + convert (content from) Google Doc with public URL?
From: Martin Post @ 2020-09-24 8:07 UTC (permalink / raw)
To: pandoc-discuss
Well, I’m looking for something that will allow authors to collaborate on a
text over the web using basic Markdown and let me pick up the result for
conversion in Pandoc, preferably without physically downloading single
files. Basically a poor man’s single source publishing solution. Google
Docs has comments and versions, and many people know it – but there may be
better, simpler alternatives, e.g. Etherpad-style editors.
It also seems that it’s possible to extract plain text from a Google Doc
for further processing:
https://developers.google.com/docs/api/samples/extract-text
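The linked sample walks the JSON that the Docs API returns for a document. A minimal sketch of that traversal, assuming the response layout described in the API reference (`body.content`, then `paragraph.elements[].textRun.content`, with nested content inside tables and tables of contents). `read_structural_elements` is an illustrative name, the `document` dictionary is a hand-written stand-in for a real response, and fetching the JSON (which needs API credentials) is left out:

```python
def read_structural_elements(elements: list) -> str:
    """Concatenate the text runs found in a list of structural elements."""
    parts = []
    for element in elements:
        if "paragraph" in element:
            for item in element["paragraph"].get("elements", []):
                parts.append(item.get("textRun", {}).get("content", ""))
        elif "table" in element:
            # Table cells hold their own lists of structural elements.
            for row in element["table"].get("tableRows", []):
                for cell in row.get("tableCells", []):
                    parts.append(read_structural_elements(cell.get("content", [])))
        elif "tableOfContents" in element:
            toc = element["tableOfContents"].get("content", [])
            parts.append(read_structural_elements(toc))
    return "".join(parts)


# Hand-written stand-in for an API response, for illustration only.
document = {"body": {"content": [
    {"sectionBreak": {}},
    {"paragraph": {"elements": [{"textRun": {"content": "# Title\n"}}]}},
    {"paragraph": {"elements": [{"textRun": {"content": "Some *Markdown*.\n"}}]}},
]}}
markdown = read_structural_elements(document["body"]["content"])
```

The resulting string can then be handed to Pandoc as Markdown input.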
* Re: Read + convert (content from) Google Doc with public URL?
From: Albert Krewinkel @ 2020-09-24 9:55 UTC (permalink / raw)
To: pandoc-discuss
Martin Post writes:
> Well, I’m looking for something that will allow authors to collaborate on a
> text over the web using basic Markdown and let me pick up the result for
> conversion in Pandoc, preferably without physically downloading single
> files. Basically a poor man’s single source publishing solution. Google
> Docs has comments and versions, and many people know it – but there may be
> better, simpler alternatives, e.g. Etherpad-style editors.
I like to use [HackMD] for collaboration. There is also an open source
version called [CodiMD].
More complex to set up, but also really nice: GitLab with CI pipelines
for automatic doc generation, combined with [Netlify CMS] for easy
metadata handling.
[HackMD]: https://hackmd.io
[CodiMD]: https://github.com/hackmdio/codimd
[Netlify CMS]: https://www.netlifycms.org/
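
For the CI route, the pipeline job usually just runs a build script on every push. A minimal sketch of such a script, assuming the Markdown sources sit in a `docs/` folder and the HTML output goes to `public/` (both folder names and the helpers `pandoc_commands` and `build` are hypothetical; the CI configuration itself is not shown):

```python
import subprocess
from pathlib import Path


def pandoc_commands(sources, out_dir):
    """Return one Pandoc command line per Markdown source file."""
    commands = []
    for source in sorted(Path(s) for s in sources):
        target = Path(out_dir) / (source.stem + ".html")
        commands.append(
            ["pandoc", "--standalone", source.as_posix(), "-o", target.as_posix()]
        )
    return commands


def build(src_dir="docs", out_dir="public"):
    """Convert every .md file in src_dir; meant to be called from a CI job."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for command in pandoc_commands(Path(src_dir).glob("*.md"), out_dir):
        subprocess.run(command, check=True)
```

Pandoc infers the input and output formats from the file extensions here, so no `-f` or `-t` flags are needed.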
--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124
* Re: Read + convert (content from) Google Doc with public URL?
From: mb21 @ 2020-09-24 14:15 UTC (permalink / raw)
To: pandoc-discuss
There is also https://stackedit.io.
* Re: Read + convert (content from) Google Doc with public URL?
From: David Denton @ 2020-09-24 14:35 UTC (permalink / raw)
To: pandoc-discuss
HackMD is very good.
David
* Re: Read + convert (content from) Google Doc with public URL?
From: BPJ @ 2020-09-24 18:38 UTC (permalink / raw)
To: pandoc-discuss
I found this blog post. Hopefully the info is not outdated.
https://www.techrepublic.com/google-amp/article/how-to-collaborate-with-markdown-in-google-docs-and-google-drive/
https://v.gd/UbmlUY
You could probably cobble together a script that converts DOCX <->
Markdown with Pandoc and uses the Google Docs API to upload and download.
I found this:
https://developers.google.com/docs/api
https://developers.google.com/docs/api/quickstart/python
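The Pandoc half of such a script is small. A minimal sketch, assuming the DOCX file has already been downloaded (the upload and download calls against the Google APIs are deliberately left out); `FORMATS`, `pandoc_command` and `convert` are hypothetical names:

```python
import subprocess
from pathlib import Path

# Pandoc format names keyed by file extension, for the two directions only.
FORMATS = {".docx": "docx", ".md": "markdown"}


def pandoc_command(source: str, target: str) -> list:
    """Build the Pandoc call for a DOCX -> Markdown or Markdown -> DOCX run."""
    src_fmt = FORMATS[Path(source).suffix.lower()]
    dst_fmt = FORMATS[Path(target).suffix.lower()]
    if src_fmt == dst_fmt:
        raise ValueError("source and target have the same format")
    return ["pandoc", "-f", src_fmt, "-t", dst_fmt, source, "-o", target]


def convert(source: str, target: str) -> None:
    """Run the conversion; requires a pandoc binary on the PATH."""
    subprocess.run(pandoc_command(source, target), check=True)
```

For example, `convert("draft.docx", "draft.md")` would turn a downloaded file into Markdown, and `convert("draft.md", "draft.docx")` would go the other way before an upload.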
--
Better --help|less than helpless
* Re: Read + convert (content from) Google Doc with public URL?
From: Martin Post @ 2020-10-02 13:42 UTC (permalink / raw)
To: pandoc-discuss
Sorry for not contributing to this thread (which I started) for a week; I
was a bit under the weather.
Great and helpful contributions – thank you, everyone! So, in summary:
- Direct “Pandoc scraping” from a Google Doc URL doesn’t work.
- I installed and tried Lynx with -dump, but again, no dice. The problem
seems to be that Lynx “sees” all the editor JS code and stumbles before
parsing the visible content.
- HackMD is great. I had bookmarked it ages ago and forgotten about it.
I will look into it again as an alternative to the Google Docs workflow.
@Albert: I may get back to you about the suggested Netlify workflow.
- StackEdit: not quite what I was looking for, but obviously an excellent
(web) app.
- @BP: Great catch; this article and the two linked Google Docs add-ons
are what I was looking for. It’s not fully “automatic” (Pandoc reading and
converting Google Docs content directly), but close enough. Contributors
can use comments, versioning, etc., and I can still get Markdown-formatted
content straight from the browser without downloading.
Thanks again.