public inbox archive for pandoc-discuss@googlegroups.com
* Read + convert (content from) Google Doc with public URL?
@ 2020-09-22 14:51 Martin Post
       [not found] ` <c20c4344-fd78-4d47-8700-7573f159df2bn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Martin Post @ 2020-09-22 14:51 UTC (permalink / raw)
  To: pandoc-discuss



I may be barking up the wrong tree here, but I am sure that it will tell me 
so. :)

Pandoc can convert HTML documents from public URLs:

    pandoc -f html -t markdown https://www.fsf.org -o result.md

Is it possible (without too many extra steps/tools…) to have Pandoc read 
and convert the content of a Google Docs document with a public (viewing) 
URL?

I.e., if I have a Google doc containing only a few lines of Markdown, could
Pandoc read and parse these as such, ignoring the document’s styles? I tried
and only got a zsh shell error (“no matches found”).

I assume this would require getting the text-only content from such a doc 
using Google’s APIs, but I’m no coder and wouldn’t know where to start here.

Thank you.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/c20c4344-fd78-4d47-8700-7573f159df2bn%40googlegroups.com.


* Re: Read + convert (content from) Google Doc with public URL?
       [not found] ` <c20c4344-fd78-4d47-8700-7573f159df2bn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2020-09-22 16:06   ` Joseph Reagle
       [not found]     ` <ad7c70b6-e850-2e91-84da-74dae93a35e2-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Joseph Reagle @ 2020-09-22 16:06 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw



On 9/22/20 10:51 AM, Martin Post wrote:
> Is it possible (without too many extra steps/tools…) to have Pandoc read and convert the content of a Google Docs document with a public (viewing) URL?

Not that I know of. I think you'd want to download it as a docx and then convert from there.

> I.e., if I have a Google doc containing only a few lines of Markdown, could Pandoc read and parse these as such, ignoring the document’s styles? I tried and only got a zsh shell error (“no matches found”).

Ignoring GDocs styles and simply working on raw text (which happens to be markdown) is a different question, and probably addressed by something like this:

```
lynx -dump https://docs.google.com/document/... | pandoc
```
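
A possible alternative to the lynx pipeline, sketched in Python under stated assumptions. Link-shared Google Docs have long been downloadable as plain text from an `/export?format=txt` URL, but that URL pattern is not mentioned in this thread and may change, so treat it (and the placeholder document ID) as an assumption. Fetching the URL from Python also keeps it away from the shell: an unquoted `?` in a URL is a glob character in zsh, which is a likely cause of the "no matches found" error.

```python
import subprocess
import urllib.request


def export_url(doc_id: str, fmt: str = "txt") -> str:
    """Build the (assumed) export URL for a link-shared Google Doc."""
    return f"https://docs.google.com/document/d/{doc_id}/export?format={fmt}"


def gdoc_markdown_to_html(doc_id: str) -> str:
    """Fetch the document as plain text and let pandoc parse it as Markdown."""
    with urllib.request.urlopen(export_url(doc_id)) as response:
        # utf-8-sig drops a leading byte-order mark if the export has one.
        text = response.read().decode("utf-8-sig")
    # Arguments are passed as a list, so no shell ever expands the text or URL.
    result = subprocess.run(
        ["pandoc", "-f", "markdown", "-t", "html"],
        input=text, capture_output=True, encoding="utf-8", check=True,
    )
    return result.stdout
```

Calling `gdoc_markdown_to_html("YOUR_DOC_ID")` with a real document ID would return the converted HTML, provided the document is shared for viewing by anyone with the link and pandoc is on the PATH.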


* Re: Read + convert (content from) Google Doc with public URL?
       [not found]     ` <ad7c70b6-e850-2e91-84da-74dae93a35e2-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>
@ 2020-09-23 18:21       ` BPJ
  2020-09-24  8:07         ` Martin Post
  0 siblings, 1 reply; 9+ messages in thread
From: BPJ @ 2020-09-23 18:21 UTC (permalink / raw)
  To: pandoc-discuss


You have to download it, but you may have better luck by downloading as
HTML (which gives you a zip file which you must unpack) than as DOCX since
much formatting is lost going from DOCX.
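
The unpack-and-convert step could be sketched roughly like this in Python. The file names are hypothetical, and the assumption that the zip holds exactly one `.html` page next to an image folder is mine, not something confirmed in this thread.

```python
import subprocess
import zipfile
from pathlib import Path


def pick_html(names: list[str]) -> str:
    """Return the first archive member whose name ends in .html."""
    for name in names:
        if name.lower().endswith(".html"):
            return name
    raise ValueError("no .html file found in archive")


def convert_zipped_html(zip_path: str, out_dir: str, out_file: str) -> None:
    """Unpack the download and convert its HTML page to Markdown with pandoc."""
    with zipfile.ZipFile(zip_path) as archive:
        html_name = pick_html(archive.namelist())
        archive.extractall(out_dir)  # images are unpacked alongside the page
    subprocess.run(
        ["pandoc", "-f", "html", "-t", "markdown",
         str(Path(out_dir) / html_name), "-o", out_file],
        check=True,  # raise if pandoc exits with an error
    )
```

For example, `convert_zipped_html("MyDoc.zip", "unpacked", "MyDoc.md")` would leave the Markdown in `MyDoc.md` and the images under `unpacked/`.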

-- 
Better --help|less than helpless


* Re: Read + convert (content from) Google Doc with public URL?
  2020-09-23 18:21       ` BPJ
@ 2020-09-24  8:07         ` Martin Post
       [not found]           ` <bab82269-d7e3-4854-b4e5-7d5dc214d8d9n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Martin Post @ 2020-09-24  8:07 UTC (permalink / raw)
  To: pandoc-discuss



Well, I’m looking for something that will allow authors to collaborate on a
text over the web using basic Markdown and let me pick up the result for
conversion in Pandoc, preferably without manually downloading individual
files. Basically a poor man’s single-source publishing solution. Google
Docs has comments and versions, and many people know it – but there may be
better, simpler alternatives, e.g. Etherpad-style editors.

It also seems that it’s possible to extract plain text from a Google Doc 
for further processing:
https://developers.google.com/docs/api/samples/extract-text
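
As far as I understand the linked sample, it walks the document structure returned by the API and collects the text runs. Below is a minimal Python sketch of that idea, assuming the response has the shape `body.content[].paragraph.elements[].textRun.content`. It skips tables and other non-paragraph elements, and the `sample` dictionary is a hand-written stand-in, not real API output.

```python
def extract_text(document: dict) -> str:
    """Concatenate the text runs of every paragraph in a Docs API document."""
    pieces = []
    for element in document.get("body", {}).get("content", []):
        paragraph = element.get("paragraph")
        if paragraph is None:
            continue  # tables, section breaks etc. are skipped in this sketch
        for part in paragraph.get("elements", []):
            text_run = part.get("textRun")
            if text_run is not None:
                pieces.append(text_run.get("content", ""))
    return "".join(pieces)


# Hand-written stand-in for an API response, for illustration only.
sample = {
    "body": {
        "content": [
            {"sectionBreak": {}},
            {"paragraph": {"elements": [{"textRun": {"content": "# Title\n"}}]}},
            {"paragraph": {"elements": [
                {"textRun": {"content": "Some *Markdown* "}},
                {"textRun": {"content": "text.\n"}},
            ]}},
        ]
    }
}
print(extract_text(sample), end="")
```

The resulting string could then be handed to pandoc as Markdown input. Fetching the real document requires API credentials, which this sketch leaves out.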




* Re: Read + convert (content from) Google Doc with public URL?
       [not found]           ` <bab82269-d7e3-4854-b4e5-7d5dc214d8d9n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2020-09-24  9:55             ` Albert Krewinkel
       [not found]               ` <87y2kzjy84.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
  2020-09-24 18:38             ` BPJ
  1 sibling, 1 reply; 9+ messages in thread
From: Albert Krewinkel @ 2020-09-24  9:55 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Martin Post writes:

> Well, I’m looking for something that will allow authors to collaborate on a
> text over the web using basic Markdown and let me pick up the result for
> conversion in Pandoc, preferably without physically downloading single
> files. Basically a poor man’s single source publishing solution. Google
> Docs has comments and versions, and many people know it – but there may be
> better, more simple alternatives, e.g. Etherpad-style editors.

I like to use [HackMD] for collaboration. There is also an open source
version called [CodiMD].

More complex to set up, but also really nice: GitLab with CI pipelines
for automatic doc generation, combined with [Netlify CMS] for easy
metadata handling.

[HackMD]: https://hackmd.io
[CodiMD]: https://github.com/hackmdio/codimd
[Netlify CMS]: https://www.netlifycms.org/

--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124


* Re: Read + convert (content from) Google Doc with public URL?
       [not found]               ` <87y2kzjy84.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
@ 2020-09-24 14:15                 ` mb21
       [not found]                   ` <4c4a7da8-176b-469a-8c3c-b45d7b94f08an-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: mb21 @ 2020-09-24 14:15 UTC (permalink / raw)
  To: pandoc-discuss



There is also https://stackedit.io.


* Re: Read + convert (content from) Google Doc with public URL?
       [not found]                   ` <4c4a7da8-176b-469a-8c3c-b45d7b94f08an-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2020-09-24 14:35                     ` David Denton
  0 siblings, 0 replies; 9+ messages in thread
From: David Denton @ 2020-09-24 14:35 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


HackMD is very good.

David


* Re: Read + convert (content from) Google Doc with public URL?
       [not found]           ` <bab82269-d7e3-4854-b4e5-7d5dc214d8d9n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2020-09-24  9:55             ` Albert Krewinkel
@ 2020-09-24 18:38             ` BPJ
       [not found]               ` <CADAJKhDju=hgJOiLZrxT8d_0i=c77NBsFRQXCDGtBZaHzUwiAQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 9+ messages in thread
From: BPJ @ 2020-09-24 18:38 UTC (permalink / raw)
  To: pandoc-discuss


I found this blog post. Hopefully the info is not outdated.

https://www.techrepublic.com/google-amp/article/how-to-collaborate-with-markdown-in-google-docs-and-google-drive/

https://v.gd/UbmlUY

You could probably cobble together a script which converts DOCX <->
Markdown with Pandoc and uses the Google Docs API to upload and download
the files. I found this:

https://developers.google.com/docs/api

https://developers.google.com/docs/api/quickstart/python
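
The Pandoc half of such a script might look roughly like the Python sketch below. The upload/download half is left out because it needs API credentials set up as in the quickstart, and the file names in the usage note are hypothetical.

```python
import subprocess


def pandoc_command(source: str, target: str,
                   from_format: str, to_format: str) -> list[str]:
    """Build the argument list for one pandoc conversion."""
    return ["pandoc", "-f", from_format, "-t", to_format, source, "-o", target]


def docx_to_markdown(source: str, target: str) -> None:
    """Convert a downloaded DOCX file to Markdown."""
    subprocess.run(pandoc_command(source, target, "docx", "markdown"), check=True)


def markdown_to_docx(source: str, target: str) -> None:
    """Convert edited Markdown back to DOCX for upload."""
    subprocess.run(pandoc_command(source, target, "markdown", "docx"), check=True)
```

For instance, `docx_to_markdown("draft.docx", "draft.md")` followed later by `markdown_to_docx("draft.md", "draft.docx")` would cover the round trip, with some loss of formatting to be expected in each direction.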

-- 
Better --help|less than helpless


* Re: Read + convert (content from) Google Doc with public URL?
       [not found]               ` <CADAJKhDju=hgJOiLZrxT8d_0i=c77NBsFRQXCDGtBZaHzUwiAQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2020-10-02 13:42                 ` Martin Post
  0 siblings, 0 replies; 9+ messages in thread
From: Martin Post @ 2020-10-02 13:42 UTC (permalink / raw)
  To: pandoc-discuss



Sorry for not contributing to this thread (which I started) for a week; I 
was a bit under the weather.

Great and helpful contributions – thank you, everyone!  So, in summary:

- Direct “Pandoc scraping” from a Google Doc URL doesn’t work.
- I installed Lynx and tried it with -dump, but again, no dice. The problem
seems to be that Lynx “sees” all the editor JS code and stumbles before
parsing the visible content.
- HackMD is great. I had bookmarked it ages ago and forgotten about it.
I will look into it again as an alternative to the Google Docs workflow.
@Albert: I may get back to you about the suggested Netlify workflow.
- StackEdit: not quite what I was looking for, but obviously an excellent
(web) app.
- @BPJ: Great catch; the article and the two Google Docs add-ons it links to
are what I was looking for. It’s not fully “automatic” (Pandoc reading &
converting Google Docs content directly), but close enough. Contributors
can use comments, versioning etc., and I can still get Markdown-formatted
content straight from the browser without downloading.

Thanks again.


