public inbox archive for pandoc-discuss@googlegroups.com
* Plain text as input source?
@ 2020-10-04 10:27 Igor M.
  0 siblings, 0 replies; 10+ messages in thread
From: Igor M. @ 2020-10-04 10:27 UTC (permalink / raw)
  To: pandoc-discuss



Hello,

Is it possible to use plain text format for Pandoc conversion? It looks 
like it is not.


* Re: Plain text as input source?
       [not found]                 ` <m2lfglsv4x.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>
@ 2020-10-04 18:54                   ` EBkysko
  0 siblings, 0 replies; 10+ messages in thread
From: EBkysko @ 2020-10-04 18:54 UTC (permalink / raw)
  To: pandoc-discuss



Accessing directly the raw contents would be the point of the 
Identity/Plain reader ('Null' not appropriate after all), without the need 
of code/rawblock or html comment hacks... but since the hacks exist, such a 
reader is not so pressing.

Custom reader: Yes, that was the theoretical idea, and parsing to feed the 
constructors (especially without lpeg!) would be the non-trivial part... as 
can be seen in the effort you and others have put in the Readers!
(at least we'd have some pointers with `lunamark`, and up to a point 
`commonmark.js`!)

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/df734fb0-b615-4925-adf6-40a5e22e67fen%40googlegroups.com.


* Re: Plain text as input source?
       [not found] ` <CAB667SRpXR8oEb1A0xkNCoQybg3Q+fkK8YZZM0BcOBoyt10qtw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2020-10-04 12:02   ` Albert Krewinkel
  2020-10-04 12:06   ` Albert Krewinkel
@ 2020-10-04 18:48   ` Marc Chantreux
  2 siblings, 0 replies; 10+ messages in thread
From: Marc Chantreux @ 2020-10-04 18:48 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

hello Igor,

> Is it possible to use plain text as the input source for Pandoc
> conversion? It looks like it isn't.

i would like to mention that groff (w/o any roff instruction)
does a good job on this:

* indent with tabs
* paragraph with empty lines
* more empty lines for more spaces
* (i probably miss things there)

i wrote an example here https://github.com/eiro/roff-experience/tree/master/writer-guide

i started using troff because of its tty-formatted output (something pandoc
doesn't target AFAIK), but the most useful outputs are also available: html,
pdf, ps.




* Re: Plain text as input source?
       [not found]             ` <33e61065-ecf7-413c-a6fc-871cd4a1e431n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2020-10-04 18:15               ` EBkysko
@ 2020-10-04 18:28               ` John MacFarlane
       [not found]                 ` <m2lfglsv4x.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>
  1 sibling, 1 reply; 10+ messages in thread
From: John MacFarlane @ 2020-10-04 18:28 UTC (permalink / raw)
  To: EBkysko, pandoc-discuss


For the "custom lua reader" application you describe, the
most useful thing would be access to the raw contents of
the original file or files.

This could be achieved by a reader (whether it be called `plain`
or something else) that simply reads the entire input stream
and puts it into a CodeBlock (or perhaps RawBlock (Format "plain")?)
where its contents would be available for parsing.
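
As a rough illustration of what such a reader would emit (written as a
standalone Python helper, not as an actual pandoc reader), the sketch below
wraps the unparsed input in a single CodeBlock using pandoc's JSON AST
format. The names `wrap_as_codeblock` and `to_json_text` and the
`pandoc-api-version` value are assumptions made for this example; the version
has to match the installed pandoc.

```python
import json

# Assumed value; it must be compatible with the installed pandoc-types version.
API_VERSION = [1, 22]


def wrap_as_codeblock(text, api_version=API_VERSION):
    """Return a pandoc JSON document holding `text` in one CodeBlock."""
    attr = ["", [], []]  # empty identifier, no classes, no key-value pairs
    return {
        "pandoc-api-version": api_version,
        "meta": {},
        "blocks": [{"t": "CodeBlock", "c": [attr, text]}],
    }


def to_json_text(text):
    """Serialize the wrapped document so it can be piped to `pandoc -f json`."""
    return json.dumps(wrap_as_codeblock(text))
```

The output of `to_json_text` could then be fed to `pandoc -f json` together
with a filter that parses the CodeBlock contents.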

With this setup, you could create a lua filter that does the
parsing and AST generation, using the pandoc AST-generating code
that is already exposed to filters.

If we wanted to go in this direction (supporting custom lua
readers as filters), it might be interesting to include lpeg
with our built-in lua.

EBkysko <ebkysko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> If there's ever such a reader, I hope there will be the option of having 
> all the text in a single string in a `Para/Str`, without any other elements 
> like `Space` or `Softbreak`. This would be a "pure" plain text reader, or a 
> "Null" reader.
>
> Since the subject has reappeared, I'll repeat a few things I've said in 
> issue #6393.
>
> * Could be useful for preprocessing with the embedded Lua, and then read 
> back in the AST with `pandoc.read` (or just return as Plain Text).
>
> * This could offer a way to introduce a "custom (lua) reader", just as we 
> have a "custom (lua) writer". (Not trivial to create a parser once read, 
> but the symmetry is pleasing :) )
>
> * I vaguely remember tarleb/AK wishing eventually to have `pandoc` behave 
> like `lualatex`, in having a way to use the embedded Lua purely as is 
> (heavily paraphrasing, must find source); this wouldn't exactly fit the 
> bill, but still have some of its attributes, as in:
>
> text -> [Null reader] -> [Lua filter, return single Para/Str] -> [Plain 
> writer] -> modified text
>
> * As mb21 suggested, there could be an option to break into `Para`s on 
> newlines or double newlines, but that can be done from within a lua filter.
>
> Anyway, this isn't essential, and can be simulated as already noted:
>
> - As I said in #6393, this can be simulated by bracketing with html 
> comments, but the text must not have html comments itself.
>
> - As you said in #2705 (which I didn't know about when commenting in 
> #6393), this can also be simulated by bracketing with code block fences; 
> but one must take care these fences are longer than any other fences in the 
> text (and the solution is to just bracket the text between very long fences 
> that one wouldn't use in any text).
>
> -- 
> To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/33e61065-ecf7-413c-a6fc-871cd4a1e431n%40googlegroups.com.



* Re: Plain text as input source?
       [not found]             ` <33e61065-ecf7-413c-a6fc-871cd4a1e431n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2020-10-04 18:15               ` EBkysko
  2020-10-04 18:28               ` John MacFarlane
  1 sibling, 0 replies; 10+ messages in thread
From: EBkysko @ 2020-10-04 18:15 UTC (permalink / raw)
  To: pandoc-discuss



(well, ok, my 3rd point above isn't quite thought out and a bit crazy... 
but if the initial text is itself a lua file... and the lua filter 
processes and executes the instructions, and the output is the result... 
discard the idea at will! I was just trying to play with the idea.. sorry 
for the noise.)

-- 
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/418b249d-6933-410d-aada-e228152450d9n%40googlegroups.com.


* Re: Plain text as input source?
       [not found]         ` <m2wo06rpfr.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>
@ 2020-10-04 17:54           ` EBkysko
       [not found]             ` <33e61065-ecf7-413c-a6fc-871cd4a1e431n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: EBkysko @ 2020-10-04 17:54 UTC (permalink / raw)
  To: pandoc-discuss




If there's ever such a reader, I hope there will be the option of having 
all the text in a single string in a `Para/Str`, without any other elements 
like `Space` or `Softbreak`. This would be a "pure" plain text reader, or a 
"Null" reader.

Since the subject has reappeared, I'll repeat a few things I've said in 
issue #6393.

* Could be useful for preprocessing with the embedded Lua, and then read 
back in the AST with `pandoc.read` (or just return as Plain Text).

* This could offer a way to introduce a "custom (lua) reader", just as we 
have a "custom (lua) writer". (Not trivial to create a parser once read, 
but the symmetry is pleasing :) )

* I vaguely remember tarleb/AK wishing eventually to have `pandoc` behave 
like `lualatex`, in having a way to use the embedded Lua purely as is 
(heavily paraphrasing, must find source); this wouldn't exactly fit the 
bill, but still have some of its attributes, as in:

text -> [Null reader] -> [Lua filter, return single Para/Str] -> [Plain 
writer] -> modified text

* As mb21 suggested, there could be an option to break into `Para`s on 
newlines or double newlines, but that can be done from within a lua filter.
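
The text -> [Null reader] -> [filter] -> [Plain writer] chain from the third
point can be simulated outside pandoc. The Python sketch below is only a
stand-in for that idea: `null_reader`, `apply_to_str`, `plain_writer` and
`pipeline` are hypothetical names, and the real plain writer does more
(wrapping, escaping) than concatenating strings.

```python
def null_reader(text):
    """Stand-in for the proposed reader: the whole input as one Para/Str."""
    return [{"t": "Para", "c": [{"t": "Str", "c": text}]}]


def apply_to_str(blocks, fn):
    """Stand-in for a filter: rewrite every Str, keep everything else."""
    out = []
    for block in blocks:
        inlines = [
            {"t": "Str", "c": fn(il["c"])} if il["t"] == "Str" else il
            for il in block["c"]
        ]
        out.append({"t": block["t"], "c": inlines})
    return out


def plain_writer(blocks):
    """Stand-in for the plain writer: concatenate the Str contents."""
    return "\n\n".join(
        "".join(il["c"] for il in block["c"] if il["t"] == "Str")
        for block in blocks
    )


def pipeline(text, fn):
    """text -> null reader -> filter -> plain writer -> modified text."""
    return plain_writer(apply_to_str(null_reader(text), fn))
```

For example, `pipeline(text, str.upper)` returns the input upper-cased with
its original spacing and line breaks untouched, since nothing was ever split
into `Space` or `SoftBreak`.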

Anyway, this isn't essential, and can be simulated as already noted:

- As I said in #6393, this can be simulated by bracketing with html 
comments, but the text must not have html comments itself.

- As you said in #2705 (which I didn't know about when commenting in 
#6393), this can also be simulated by bracketing with code block fences; 
but one must take care these fences are longer than any other fences in the 
text (and the solution is to just bracket the text between very long fences 
that one wouldn't use in any text).
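
Instead of choosing a very long fence by hand, the fence length can also be
computed from the text. A minimal Python sketch, assuming backtick fences
only; `fence_for` and `bracket_with_fences` are names made up for this
example:

```python
import re


def fence_for(text, minimum=3):
    """Return a backtick fence longer than any backtick run in `text`."""
    longest = max((len(run) for run in re.findall(r"`+", text)), default=0)
    return "`" * max(minimum, longest + 1)


def bracket_with_fences(text):
    """Wrap `text` in a fenced code block that its own fences cannot close."""
    fence = fence_for(text)
    body = text if text.endswith("\n") else text + "\n"
    return f"{fence}\n{body}{fence}\n"
```

Measuring every backtick run (not only those at the start of a line) is
deliberately conservative: the resulting fence may be longer than strictly
needed, but never too short.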

-- 
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/33e61065-ecf7-413c-a6fc-871cd4a1e431n%40googlegroups.com.


* Re: Plain text as input source?
       [not found]     ` <87a6x2p55a.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
@ 2020-10-04 15:17       ` John MacFarlane
       [not found]         ` <m2wo06rpfr.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: John MacFarlane @ 2020-10-04 15:17 UTC (permalink / raw)
  To: Albert Krewinkel, pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Is this what you're looking for:  producing a pandoc AST with
just Str, Space, SoftBreak elements, divided into Para by blank
lines?
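
To make the question concrete, here is a small Python sketch of that
splitting, producing blocks in the shape of pandoc's JSON AST. It is not
pandoc code; `plain_to_blocks` is a hypothetical name, and treating any run
of whitespace as a single `Space` is an assumption of this sketch.

```python
import re


def plain_to_blocks(text):
    """Split text into Para blocks made of Str, Space and SoftBreak."""
    blocks = []
    # One or more blank (or whitespace-only) lines separate paragraphs.
    for chunk in re.split(r"\n\s*\n", text.strip()):
        if not chunk:
            continue
        inlines = []
        for i, line in enumerate(chunk.split("\n")):
            if i > 0:
                # A single newline inside a paragraph becomes a SoftBreak.
                inlines.append({"t": "SoftBreak"})
            for j, word in enumerate(line.split()):
                if j > 0:
                    inlines.append({"t": "Space"})
                inlines.append({"t": "Str", "c": word})
        blocks.append({"t": "Para", "c": inlines})
    return blocks
```

Note that this throws away the original spacing, which is one reason a
single-string variant might be preferred for some uses.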

As noted, that would be easy to add, but I'm never quite sure
what would be most useful for a "plain text reader."


Albert Krewinkel <albert+pandoc-9EawChwDxG8hFhg+JK9F0w@public.gmane.org> writes:

> Igor Maslennikov writes:
>
>> Is it possible to use plain text as the input source for Pandoc
>> conversion? It looks like it isn't.
>
> Correction, this is actually a better resource:
> https://github.com/jgm/pandoc/issues/6393
>
>
> -- 
> Albert Krewinkel
> GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124
>
> -- 
> To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/87a6x2p55a.fsf%40zeitkraut.de.



* Re: Plain text as input source?
       [not found] ` <CAB667SRpXR8oEb1A0xkNCoQybg3Q+fkK8YZZM0BcOBoyt10qtw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2020-10-04 12:02   ` Albert Krewinkel
@ 2020-10-04 12:06   ` Albert Krewinkel
       [not found]     ` <87a6x2p55a.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
  2020-10-04 18:48   ` Marc Chantreux
  2 siblings, 1 reply; 10+ messages in thread
From: Albert Krewinkel @ 2020-10-04 12:06 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Igor Maslennikov writes:

> Is it possible to use plain text as the input source for Pandoc
> conversion? It looks like it isn't.

Correction, this is actually a better resource:
https://github.com/jgm/pandoc/issues/6393


-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124



* Re: Plain text as input source?
       [not found] ` <CAB667SRpXR8oEb1A0xkNCoQybg3Q+fkK8YZZM0BcOBoyt10qtw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2020-10-04 12:02   ` Albert Krewinkel
  2020-10-04 12:06   ` Albert Krewinkel
  2020-10-04 18:48   ` Marc Chantreux
  2 siblings, 0 replies; 10+ messages in thread
From: Albert Krewinkel @ 2020-10-04 12:02 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Igor Maslennikov writes:

> Is it possible to use plain text as the input source for Pandoc
> conversion? It looks like it isn't.

See the discussion in this issue for why that's the case:
https://github.com/jgm/pandoc/issues/2705


-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124



* Plain text as input source?
@ 2020-10-04 10:26 Igor Maslennikov
       [not found] ` <CAB667SRpXR8oEb1A0xkNCoQybg3Q+fkK8YZZM0BcOBoyt10qtw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Igor Maslennikov @ 2020-10-04 10:26 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Hello,

Is it possible to use plain text as the input source for Pandoc
conversion? It looks like it isn't.



end of thread, other threads:[~2020-10-04 18:54 UTC | newest]

Thread overview: 10+ messages
2020-10-04 10:27 Plain text as input source? Igor M.
  -- strict thread matches above, loose matches on Subject: below --
2020-10-04 10:26 Igor Maslennikov
     [not found] ` <CAB667SRpXR8oEb1A0xkNCoQybg3Q+fkK8YZZM0BcOBoyt10qtw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2020-10-04 12:02   ` Albert Krewinkel
2020-10-04 12:06   ` Albert Krewinkel
     [not found]     ` <87a6x2p55a.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
2020-10-04 15:17       ` John MacFarlane
     [not found]         ` <m2wo06rpfr.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>
2020-10-04 17:54           ` EBkysko
     [not found]             ` <33e61065-ecf7-413c-a6fc-871cd4a1e431n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2020-10-04 18:15               ` EBkysko
2020-10-04 18:28               ` John MacFarlane
     [not found]                 ` <m2lfglsv4x.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>
2020-10-04 18:54                   ` EBkysko
2020-10-04 18:48   ` Marc Chantreux
