public inbox archive for pandoc-discuss@googlegroups.com
* Plain text writer?
@ 2014-06-23 15:18 David Cortesi
       [not found] ` <93c13503-9c76-4c74-9a14-75e3e2c0cc88-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: David Cortesi @ 2014-06-23 15:18 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 987 bytes --]

I have a need to take a marked-up text (say in Pandoc Extended Markdown) 
and produce a completely plain etext -- with heads set off by newlines, and 
paragraph text reflowed to some reasonable margin like 72 chars, and the 
other various markups rendered as reasonably as possible using only spaces 
and newlines.

I don't see such an option in the list of writers. Am I overlooking it, or 
is this not supported?

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/93c13503-9c76-4c74-9a14-75e3e2c0cc88%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

[-- Attachment #2: Type: text/html, Size: 1408 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
       [not found] ` <93c13503-9c76-4c74-9a14-75e3e2c0cc88-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2014-06-23 15:53   ` Shahbaz Youssefi
       [not found]     ` <CALeOzZ9d5y=vuandCy_LiTmYg6088sd6+zBSWrknjO54TX8XAA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2014-06-23 15:54   ` Daniel Staal
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 22+ messages in thread
From: Shahbaz Youssefi @ 2014-06-23 15:53 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 2151 bytes --]

Out of curiosity, why would you need something like that? Markdown is
designed to be as readable as plain text. Nonetheless, you have the option
to convert the pandoc markdown to strict markdown which should strip some
of the formatting. I don't believe there is a "plain-text" writer.


On Mon, Jun 23, 2014 at 5:18 PM, David Cortesi <davecortesi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
wrote:

> I have a need to take a marked-up text (say in Pandoc Extended Markdown)
> and produce a completely plain etext -- with heads set off by newlines, and
> paragraph text reflowed to some reasonable margin like 72 chars, and the
> other various markups rendered as reasonably as possible using only spaces
> and newlines.
>
> I don't see such an option in the list of writers. Am I overlooking it, or
> is this not supported?

[-- Attachment #2: Type: text/html, Size: 3341 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
       [not found] ` <93c13503-9c76-4c74-9a14-75e3e2c0cc88-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2014-06-23 15:53   ` Shahbaz Youssefi
@ 2014-06-23 15:54   ` Daniel Staal
  2014-06-23 16:09   ` John Gabriele
  2014-06-26 15:50   ` David Cortesi
  3 siblings, 0 replies; 22+ messages in thread
From: Daniel Staal @ 2014-06-23 15:54 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

--As of June 23, 2014 8:18:21 AM -0700, David Cortesi is alleged to have 
said:

> I have a need to take a marked-up text (say in Pandoc Extended Markdown)
> and produce a completely plain etext -- with heads set off by newlines,
> and paragraph text reflowed to some reasonable margin like 72 chars, and
> the other various markups rendered as reasonably as possible using only
> spaces and newlines.
>
> I don't see such an option in the list of writers. Am I overlooking it,
> or is this not supported?

--As for the rest, it is mine.

You are overlooking it.  ;)  The format you want to convert to is 'plain'.

You may still need to set a couple of other options for wrapping and such.
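
A minimal sketch of how that conversion could be scripted, assuming pandoc is installed and on the PATH. The helper names `plain_command` and `to_plain` and the file names are hypothetical; `--from`, `--to`, `--columns` and `--output` are regular pandoc options, though wrapping behaviour may differ between pandoc versions.

```python
import subprocess


def plain_command(source, target, columns=72):
    """Build a pandoc command line for the 'plain' writer.

    The function name and the default width are illustrative assumptions.
    """
    return [
        "pandoc",
        "--from=markdown",
        "--to=plain",
        f"--columns={columns}",  # line length used when wrapping paragraphs
        f"--output={target}",
        source,
    ]


def to_plain(source, target, columns=72):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(plain_command(source, target, columns), check=True)
```

Calling `to_plain("book.md", "book.txt")` would then write the plain rendering of a hypothetical `book.md` to `book.txt`.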

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author.  Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes.  This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
       [not found] ` <93c13503-9c76-4c74-9a14-75e3e2c0cc88-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2014-06-23 15:53   ` Shahbaz Youssefi
  2014-06-23 15:54   ` Daniel Staal
@ 2014-06-23 16:09   ` John Gabriele
  2014-06-26 15:50   ` David Cortesi
  3 siblings, 0 replies; 22+ messages in thread
From: John Gabriele @ 2014-06-23 16:09 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Mon, Jun 23, 2014, at 08:18 AM, David Cortesi wrote:
> I have a need to take a marked-up text (say in Pandoc Extended Markdown)
> and produce a completely plain etext -- with heads set off by newlines,
> and paragraph text reflowed to some reasonable margin like 72 chars, and
> the other various markups rendered as reasonably as possible using only
> spaces and newlines.
>
> I don't see such an option in the list of writers. Am I overlooking it,
> or is this not supported?

Try:

~~~
pandoc -t markdown my-input.txt
# and compare with
pandoc -t plain my-input.txt
~~~

Though, I don't know if there's a way to customize the formatting of the
resulting plain or markdown output...

-- John


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
       [not found]     ` <CALeOzZ9d5y=vuandCy_LiTmYg6088sd6+zBSWrknjO54TX8XAA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-06-24  5:05       ` David Cortesi
       [not found]         ` <3ee59df7-6a6c-48e2-ab21-18d6abfa991a-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: David Cortesi @ 2014-06-24  5:05 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 2638 bytes --]

Plain text, with paragraphs reflowed to 72 or so chars wide, formatted with 
only spaces and newlines, is the base standard for works published by 
Project Gutenberg. Most books are also presented in HTML a/o EPUB, but 
"plain" is a requirement.

http://www.gutenberg.org/wiki/Gutenberg:General_FAQ#G.17._Why_is_Project_Gutenberg_so_set_on_using_Plain_Vanilla_ASCII.3F

On Monday, June 23, 2014 8:53:50 AM UTC-7, Shahbaz Youssefi wrote:
>
> Out of curiosity, why would you need something like that? Markdown is 
> designed to be as readable as plain text. Nonetheless, you have the option 
> to convert the pandoc markdown to strict markdown which should strip some 
> of the formatting. I don't believe there is a "plain-text" writer.
>
>
> On Mon, Jun 23, 2014 at 5:18 PM, David Cortesi <davec...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> I have a need to take a marked-up text (say in Pandoc Extended Markdown) 
>> and produce a completely plain etext -- with heads set off by newlines, and 
>> paragraph text reflowed to some reasonable margin like 72 chars, and the 
>> other various markups rendered as reasonably as possible using only spaces 
>> and newlines.
>>
>> I don't see such an option in the list of writers. Am I overlooking it, 
>> or is this not supported?

[-- Attachment #2: Type: text/html, Size: 4572 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
       [not found]         ` <3ee59df7-6a6c-48e2-ab21-18d6abfa991a-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2014-06-24 17:43           ` Daniel Staal
  2014-06-25 15:01             ` David Cortesi
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Staal @ 2014-06-24 17:43 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

--As of June 23, 2014 10:05:23 PM -0700, David Cortesi is alleged to have 
said:

> Plain text, with paragraphs reflowed to 72 or so chars wide, formatted
> with only spaces and newlines, is the base standard for works published
> by Project Gutenberg. Most books are also presented in HTML a/o EPUB, but
> "plain" is a requirement.
>
> http://www.gutenberg.org/wiki/Gutenberg:General_FAQ#G.17._Why_is_Project_Gutenberg_so_set_on_using_Plain_Vanilla_ASCII.3F

--As for the rest, it is mine.

Agreed, and they have a good point - however, for their purposes and 
description Markdown *is* plain text.  It even uses the same emphasis 
marker: single underscores.  Going to 'plain' will lose valuable 
information, and doesn't gain any particular usability or portability.

Now, Markdown doesn't exactly match their guidelines, I'll admit - but 
it's really as close as 'plain' is (which will require you to go through 
and re-insert emphasis, handle the TOC, etc).

Daniel T. Staal


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
  2014-06-24 17:43           ` Daniel Staal
@ 2014-06-25 15:01             ` David Cortesi
       [not found]               ` <bc06c1a1-18b3-4e9c-a7d1-c4ba55c43a3d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: David Cortesi @ 2014-06-25 15:01 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 1798 bytes --]



On Tuesday, June 24, 2014 10:43:10 AM UTC-7, Daniel Staal wrote:
>
> ...for their [Project Gutenberg's] purposes and 
> description Markdown *is* plain text.
>

Very possibly true. However PG is one of the oldest volunteer content 
providers (in fact, THE oldest, estab. 1971!) on the internet and my 
impression (on relatively few interactions) is they do not change anything 
quickly.

New etexts are accepted to the library only after complete inspection by 
volunteer "whitewashers" who operate under detailed and strict guidelines. 
Markdown deviates in a number of ways from those guidelines. Examples: in a 
PG etext, all paragraphs are reflowed; lines of a paragraph in a markdown 
text are rarely reflowed to fill the margins. A PG block quote is shown by 
white-space indention (and reflowed); Markdown uses leading >'s and 
doesn't reflow. A Markdown list item begins in the left margin, signalled 
by a leading asterisk; a PG list item is indented from the margin. PG 
shows the typography of a poem or playscript using indents and newlines to 
approximate the original layout. Standard markdown does not support this at 
all; Pandoc Extended requires pipes in the left margin to preserve varying 
indention.
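
To make two of those differences concrete, here is a rough sketch in Python. It assumes a 72-column margin and a four-space quote indent; neither number is an official PG constant, and the function names are made up for illustration. It handles only a single-paragraph quote.

```python
import textwrap

WIDTH = 72  # margin mentioned in this thread; an assumption, not a PG rule


def reflow(paragraph, indent=""):
    """Refill one paragraph to WIDTH columns, optionally indented."""
    words = " ".join(paragraph.split())  # collapse existing line breaks
    return textwrap.fill(words, width=WIDTH,
                         initial_indent=indent, subsequent_indent=indent)


def block_quote(markdown_quote, indent="    "):
    """Turn '>'-prefixed Markdown quote lines into an indented, reflowed block."""
    stripped = [line.lstrip("> ").rstrip()
                for line in markdown_quote.splitlines()]
    return reflow(" ".join(stripped), indent)
```

The line width passed to `textwrap.fill` includes the indent, so quoted text still ends at the same right margin as ordinary paragraphs.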

[-- Attachment #2: Type: text/html, Size: 2367 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
       [not found]               ` <bc06c1a1-18b3-4e9c-a7d1-c4ba55c43a3d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2014-06-25 15:32                 ` Paulo Ney de Souza
       [not found]                   ` <CAFVhNZPLe_3nm=LBTPRrnr1TBuj8d1vEhWT3Mxq=e69OPdZK3A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Paulo Ney de Souza @ 2014-06-25 15:32 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 3138 bytes --]

Markdown is text, just not the brand of text of the PG standards! That said,
it would be very nice to have a reader/writer for "pgt" (PG text). First, it
would allow people to write in their own way and contribute texts to PG; but
above all, it would allow a better translation from PG texts to PDF and
ePub. Lots of translators have been written for that, including some by PG
themselves, but none of them are as good as the ones in Pandoc!

Paulo Ney


On Wed, Jun 25, 2014 at 12:01 PM, David Cortesi <davecortesi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
wrote:

>
>
> On Tuesday, June 24, 2014 10:43:10 AM UTC-7, Daniel Staal wrote:
>>
>> ...for their [Project Gutenberg's] purposes and
>> description Markdown *is* plain text.
>>
>
> Very possibly true. However PG is one of the oldest volunteer content
> providers (in fact, THE oldest, estab. 1971!) on the internet and my
> impression (on relatively few interactions) is they do not change anything
> quickly.
>
> New etexts are accepted to the library only after complete inspection by
> volunteer "whitewashers" who operate under detailed and strict guidelines.
> Markdown deviates in a number of ways from those guidelines. Examples: in a
> PG etext, all paragraphs are reflowed; lines of a paragraph in a markdown
> text are rarely reflowed to fill the margins. A PG block quote is shown by
> white-space indention (and reflowed); Markdown uses leading >'s and
> doesn't reflow. A Markdown list item begins in the left margin, signalled
> by a leading asterisk; a PG list item is indented from the margin. PG
> shows the typography of a poem or playscript using indents and newlines to
> approximate the original layout. Standard markdown does not support this at
> all; Pandoc Extended requires pipes in the left margin to preserve varying
> indention.

[-- Attachment #2: Type: text/html, Size: 4474 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
       [not found]                   ` <CAFVhNZPLe_3nm=LBTPRrnr1TBuj8d1vEhWT3Mxq=e69OPdZK3A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-06-25 19:46                     ` John MacFarlane
       [not found]                       ` <20140625194640.GA74174-bi+AKbBUZKbivNSvqvJHCtPlBySK3R6THiGdP5j34PU@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: John MacFarlane @ 2014-06-25 19:46 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

I think a pg writer is a nice idea.  It would be fairly easy, I think,
to do.  Currently the plain writer is implemented as an option in the
markdown writer.  The pg writer would involve a new option and
a few different behaviors.

Is there documentation somewhere for pg text standards?

+++ Paulo Ney de Souza [Jun 25 14 12:32 ]:
>   Markdown is text, just not the brand of text of the PG standards! That
>   said, it would be very nice to have a reader/writer for "pgt" (PG text).
>   First, it would allow people to write in their own way and contribute
>   texts to PG; but above all, it would allow a better translation from PG
>   texts to PDF and ePub. Lots of translators have been written for that,
>   including some by PG themselves, but none of them are as good as the ones
>   in Pandoc!
>   Paulo Ney
>
>   On Wed, Jun 25, 2014 at 12:01 PM, David Cortesi
>   <davecortesi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>   On Tuesday, June 24, 2014 10:43:10 AM UTC-7, Daniel Staal wrote:
>
>     ...for their [Project Gutenberg's] purposes and
>     description Markdown *is* plain text.
>
>   Very possibly true. However PG is one of the oldest volunteer content
>   providers (in fact, THE oldest, estab. 1971!) on the internet and my
>   impression (on relatively few interactions) is they do not change
>   anything quickly.
>
>   New etexts are accepted to the library only after complete inspection
>   by volunteer "whitewashers" who operate under detailed and strict
>   guidelines. Markdown deviates in a number of ways from those
>   guidelines. Examples: in a PG etext, all paragraphs are reflowed; lines
>   of a paragraph in a markdown text are rarely reflowed to fill the
>   margins. A PG block quote is shown by white-space indention (and
>   reflowed); Markdown uses leading >'s and doesn't reflow. A Markdown
>   list item begins in the left margin, signalled by a leading asterisk; a
>   PG list item is indented from the margin. PG shows the typography of a
>   poem or playscript using indents and newlines to approximate the
>   original layout. Standard markdown does not support this at all; Pandoc
>   Extended requires pipes in the left margin to preserve varying
>   indention.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
       [not found]                       ` <20140625194640.GA74174-bi+AKbBUZKbivNSvqvJHCtPlBySK3R6THiGdP5j34PU@public.gmane.org>
@ 2014-06-25 22:44                         ` Daniel Staal
  2014-06-26 12:43                           ` BP Jonsson
  2014-06-29 18:27                           ` John MacFarlane
  0 siblings, 2 replies; 22+ messages in thread
From: Daniel Staal @ 2014-06-25 22:44 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

--As of June 25, 2014 12:46:40 PM -0700, John MacFarlane is alleged to have 
said:

> I think a pg writer is a nice idea.  It would be fairly easy, I think,
> to do.  Currently the plain writer is implemented as an option in the
> markdown writer.  The pg writer would involve a new option and
> a few different behaviors.
>
> Is there documentation somewhere for pg text standards?

--As for the rest, it is mine.

Here's what I found:
<http://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_FAQ#About_the_formatting_of_a_text_file>

The main difference I see from Markdown is actually tables: They expect 
tables to be just formatted columns with no separators.  (They do mention 
borders if you need them, but prefer that you avoid them.)  Oh, and 
em-dashes are two hyphens, not three.  (And headings are paragraphs with 
extra space around them.)

They don't specifically mention lists in that section, so I assume it's 
probably convention-defined - probably as an extension of 'preformatted 
text', which they indent.

There may be more detailed descriptions someplace, but that seems like a 
start.  Of course it's not a 'formal' definition: It's guidelines they 
expect human editors to follow, not exact rules.
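
As a rough illustration of two of those points, here is a small Python sketch. It is only my reading of the FAQ as summarized above, not an official rule set; the function names and the two-space column gap are assumptions, and the table helper expects every row to hold the same number of string cells.

```python
def pg_dashes(text):
    """Render each Unicode em-dash as two hyphens."""
    return text.replace("\u2014", "--")


def plain_table(rows, gap=2):
    """Lay out rows as space-padded columns with no border characters."""
    # Width of each column is the length of its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    lines = []
    for row in rows:
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        # Strip the padding that trails the last column.
        lines.append((" " * gap).join(cells).rstrip())
    return "\n".join(lines)
```

For example, `plain_table([["Name", "Year"], ["Hamlet", "1603"]])` lines up the two columns using spaces only.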

Daniel T. Staal


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
  2014-06-25 22:44                         ` Daniel Staal
@ 2014-06-26 12:43                           ` BP Jonsson
       [not found]                             ` <53AC1584.5000807-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
       [not found]                             ` <54948B80A67EB226AD63A8D6@192.168.1.50>
  2014-06-29 18:27                           ` John MacFarlane
  1 sibling, 2 replies; 22+ messages in thread
From: BP Jonsson @ 2014-06-26 12:43 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

2014-06-26 00:44, Daniel Staal wrote:
> --As of June 25, 2014 12:46:40 PM -0700, John MacFarlane is
> alleged to have said:
>
>> I think a pg writer is a nice idea. It would be fairly easy, I
>> think, to do. Currently the plain writer is implemented as an
>> option in the markdown writer. The pg writer would involve a
>> new option and a few different behaviors.
>>
>> Is there documentation somewhere for pg text standards?
>
> --As for the rest, it is mine.
>
> Here's what I found:
> <http://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_FAQ#About_the_formatting_of_a_text_file>
>
>
> The main difference I see from Markdown is actually tables:
> They expect tables to be just formatted columns with no
> separators. (Although they do mention borders if you need
> them, but prefer not to.) Oh, and em-dashes are two hyphens,
> not three. (And headings are paragraphs with extra space
> around them.)
>
> They don't specifically mention lists in that section, so I
> assume it's probably convention-defined - probably as an
> extension of 'preformatted text', which they indent.
>
> There may be more detailed descriptions someplace, but that
> seems like a start. Of course it's not a 'formal' definition:
> It's guidelines they expect human editors to follow, not
> exact rules.

It's pleasing to hear that my guess was right, that the markdown
writer could be modified into a PG writer. Allow me to suggest that
the writer/mode be called 'gutenberg' or 'plain_gutenberg'; 'pg' is
totally uninformative for the uninitiated and many initiated alike.
Imagine Googling for 'project gutenberg' and not finding
pandoc's 'pg' mode.

It seems strange though if it is true that PG is dead set against
(plain) markdown formatted files. Clearly everyone would benefit
from PG accepting Markdown-formatted submissions. Not least
because there is a great tool for converting HTML to Markdown
called Pandoc! It would need to be educated about HTML tables
though. Is that in the works? I've been thinking of rewriting my
HTML table to Pandoc Markdown script but I'd rather see it become
superfluous, and not out of laziness!

/bpj


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
       [not found] ` <93c13503-9c76-4c74-9a14-75e3e2c0cc88-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
                     ` (2 preceding siblings ...)
  2014-06-23 16:09   ` John Gabriele
@ 2014-06-26 15:50   ` David Cortesi
  3 siblings, 0 replies; 22+ messages in thread
From: David Cortesi @ 2014-06-26 15:50 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 869 bytes --]

The main source of new PG etexts is Distributed Proofreaders, who 
crowd-source the cleanup and formatting of OCR'd texts. The following is a 
key part of their directions to the "Post Processor", the volunteer who 
creates the final version: 

http://www.pgdp.net/c/faq/post_proof.php#formatting

[-- Attachment #2: Type: text/html, Size: 1292 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
       [not found]                             ` <53AC1584.5000807-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2014-06-26 18:12                               ` Daniel Staal
  0 siblings, 0 replies; 22+ messages in thread
From: Daniel Staal @ 2014-06-26 18:12 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

--As of June 26, 2014 2:43:48 PM +0200, BP Jonsson is alleged to have said:

> It seems strange though if it is true that PG is dead set against
> (plain) markdown formatted files. Clearly everyone would benefit
> from PG accepting Markdown-formatted submissions. Not least
> because there is a great tool for converting HTML to Markdown
> called Pandoc! It would need to be educated about HTML tables
> though. Is that in the works? I've been thinking of rewriting my
> HTML table to Pandoc Markdown script but I'd rather see it become
> superfluous, and not out of laziness!

--As for the rest, it is mine.

My take wouldn't be that they are dead set against it so much as they have 
their own format, which they developed before Markdown existed, and expect 
people to use that.  Markdown might be better, but there's a lot to be said 
for 'this is the standard we use'.

Daniel T. Staal


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Plain text writer?
       [not found]                               ` <54948B80A67EB226AD63A8D6-Q0ErXNX1RuZz+/J76PBWHg@public.gmane.org>
@ 2014-06-26 18:31                                 ` Paulo Ney de Souza
       [not found]                                   ` <CAFVhNZMM-ASog+2Km3MSfyDKPMXqWfY=Vz71L6EvV=95azU8dA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
       [not found]                                   ` <E5E9640CFF97B8EA4C7F509C@192.168.1.50>
  0 siblings, 2 replies; 22+ messages in thread
From: Paulo Ney de Souza @ 2014-06-26 18:31 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 3543 bytes --]

Their format was developed a LONG, LONG time ago ... and obviously is not
well-defined. They have problems with conversions and even with a tool to
test their own files for consistency (gutencheck or something like that)
that works only some of the time.

They have changed the acceptance standards quite a bit - the first time a
French book was prepared I remember them insisting on some weird
TeX-look-alike symbology for the accented characters!

Along the way they started to accept HTML, other encodings and even UTF-8
these days... They started to accept HTML after a large conversion of the
old texts to HTML by automated tools.

One of the (unstated) reasons why they will not accept Markdown is that no
one has offered to convert the backfile to Markdown. I am pretty sure they
would love to have good ePub generation and everything else that Pandoc
would allow.

Paulo Ney


On Thu, Jun 26, 2014 at 11:12 AM, Daniel Staal <DStaal-Jdbf3xiKgS8@public.gmane.org> wrote:

> --As of June 26, 2014 2:43:48 PM +0200, BP Jonsson is alleged to have said:
>
>  It seems strange though if it is true that PG is dead set against
>> (plain) markdown formatted files. Clearly everyone would benefit
>> from PG accepting Markdown-formatted submissions. Not least
>> because there is a great tool for converting HTML to Markdown
>> called Pandoc! It would need to be educated about HTML tables
>> though. Is that in the works? I've been thinking of rewriting my
>> HTML table to Pandoc Markdown script but I'd rather see it become
>> superfluous, and not out of laziness!
>>
>
> --As for the rest, it is mine.
>
> My take wouldn't be that they are dead set against it so much as they have
> their own format, which they developed before Markdown existed, and expect
> people to use that.  Markdown might be better, but there's a lot to be said
> for 'this is the standard we use'.
>
> Daniel T. Staal
>
> ---------------------------------------------------------------
> This email copyright the author.  Unless otherwise noted, you
> are expressly allowed to retransmit, quote, or otherwise use
> the contents for non-commercial purposes.  This copyright will
> expire 5 years after the author's death, or in 30 years,
> whichever is longer, unless such a period is in excess of
> local copyright law.
> ---------------------------------------------------------------
>


* Re: Plain text writer?
       [not found]                                   ` <CAFVhNZMM-ASog+2Km3MSfyDKPMXqWfY=Vz71L6EvV=95azU8dA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-06-26 18:36                                     ` Daniel Staal
  2014-06-26 19:17                                       ` Jesse Rosenthal
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Staal @ 2014-06-26 18:36 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

--As of June 26, 2014 11:31:25 AM -0700, Paulo Ney de Souza is alleged to 
have said:

> One of the (unstated) reasons why they will not accept Markdown is that
> no one has offered to convert the backfile to Markdown. I am pretty sure
> they would love to have a good generation of ePub and all of that that
> Pandoc would allow.

--As for the rest, it is mine.

Hmm.  If pandoc's going to be doing a writer anyway, how hard is a reader? 
Then the question is how many people we can have feeding documents...

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author.  Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes.  This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------



* Re: Plain text writer?
       [not found]                                     ` <E5E9640CFF97B8EA4C7F509C-Q0ErXNX1RuZz+/J76PBWHg@public.gmane.org>
@ 2014-06-26 18:57                                       ` Paulo Ney de Souza
  0 siblings, 0 replies; 22+ messages in thread
From: Paulo Ney de Souza @ 2014-06-26 18:57 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 963 bytes --]

That is a lot harder! Maybe not for aces like John, but I have struggled
doing a reader for RTF...

Paulo Ney


On Thu, Jun 26, 2014 at 11:36 AM, Daniel Staal <DStaal-Jdbf3xiKgS8@public.gmane.org> wrote:

>
> Hmm.  If pandoc's going to be doing a writer anyway, how hard is a reader?
> Then the question is how many people we can have feeding documents...
>
> Daniel T. Staal
>


* Re: Plain text writer?
  2014-06-26 18:36                                     ` Daniel Staal
@ 2014-06-26 19:17                                       ` Jesse Rosenthal
       [not found]                                         ` <87k3837a6g.fsf-4GNroTWusrE@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Jesse Rosenthal @ 2014-06-26 19:17 UTC (permalink / raw)
  To: Daniel Staal, pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Daniel Staal <DStaal-Jdbf3xiKgS8@public.gmane.org> writes:
> Hmm.  If pandoc's going to be doing a writer anyway, how hard is a reader? 
> Then the question is how many people we can have feeding documents...

Well, it would be mainly hard because there are a ton of ambiguities in
the old PG conventions---especially as it relates to using CAPS for
emphasis. If you're converting to underscore emphasis, there's no good
way to know which words should be capitalized or not (though I've come
up with some fairly reliable heuristics for making epubs). And plenty of
books have places where people indignantly insist that "*I* would never
do such a thing"---such emphasis on the single capital letter is totally
lost.

Plus, chapter and header standards are pretty ad-hoc. I mean, I can
understand it, so a smart parser probably could too, but I think it
might be fairly hard to get it consistently right-ish. And only right-ish
is probably worse for them than what they have.
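
To make the CAPS ambiguity concrete, here is a minimal Python sketch of one
naive rule. It is not the heuristic mentioned above (that one is not shown in
this thread); the function name and the "skip all-caps lines" rule are
assumptions made purely for illustration.

```python
import re

# A run of one or more words written entirely in capitals, each word at
# least two letters long. Single capitals such as "I" or "A" never match.
CAPS_RUN = re.compile(r"\b[A-Z]{2,}(?:[ '-][A-Z]{2,})*\b")


def caps_to_emphasis(line: str) -> str:
    """Rewrite CAPS emphasis as _underscore_ emphasis (naive guess)."""
    letters = [c for c in line if c.isalpha()]
    # Assumption: a line set entirely in capitals is more likely a
    # chapter heading than emphasis, so it is left untouched.
    if letters and all(c.isupper() for c in letters):
        return line
    # Lowercase the run and wrap it in underscores.
    return CAPS_RUN.sub(lambda m: "_" + m.group(0).lower() + "_", line)


print(caps_to_emphasis("He said it was VERY IMPORTANT to me."))
print(caps_to_emphasis("NEVER would I agree."))
```

The second call shows both problems at once: the sentence-initial capital of
"NEVER" is lost when the run is lowercased, and an emphasized "I" can never be
detected because it looks identical to an ordinary "I". Acronyms would also be
lowercased by this rule, so the output would still need hand editing.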



* Re: Plain text writer?
       [not found]                                         ` <87k3837a6g.fsf-4GNroTWusrE@public.gmane.org>
@ 2014-06-26 20:18                                           ` Daniel Staal
  2014-06-26 22:26                                             ` BP Jonsson
  2014-06-26 21:06                                           ` Paulo Ney de Souza
  1 sibling, 1 reply; 22+ messages in thread
From: Daniel Staal @ 2014-06-26 20:18 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

--As of June 26, 2014 3:17:11 PM -0400, Jesse Rosenthal is alleged to have 
said:

> Daniel Staal <DStaal-Jdbf3xiKgS8@public.gmane.org> writes:
>> Hmm.  If pandoc's going to be doing a writer anyway, how hard is a
>> reader?  Then the question is how many people we can have feeding
>> documents...
>
> Well, it would be mainly hard because there are a ton of ambiguities in
> the old PG conventions---especially as it relates to using CAPS for
> emphasis. If you're converting to underscore emphasis, there's no good
> way to know which words should be capitalized or not (though I've come
> up with some fairly reliable heuristics for making epubs). And plenty of
> books have places where people indignantly insist that "*I* would never
> do such a thing"---such emphasis on the single capital letter is totally
> lost.
>
> Plus, chapter and header standards are pretty ad-hoc. I mean, I can
> understand it, so a smart parser probably could too, but I think it
> might be fairly hard to get it consistently right-ish. And only right-ish
> is probably worse for them than what they have.

--As for the rest, it is mine.

I know, I know...  But it'd be a good start, changing the problem from 
'convert the whole document' to 'edit the conversion', and from there 
people could start submitting 'markdown' versions of popular books.  Get 
enough of a start into it, and then someone can bring up changing the 
default.

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author.  Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes.  This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------



* Re: Plain text writer?
       [not found]                                         ` <87k3837a6g.fsf-4GNroTWusrE@public.gmane.org>
  2014-06-26 20:18                                           ` Daniel Staal
@ 2014-06-26 21:06                                           ` Paulo Ney de Souza
  1 sibling, 0 replies; 22+ messages in thread
From: Paulo Ney de Souza @ 2014-06-26 21:06 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw; +Cc: Daniel Staal

[-- Attachment #1: Type: text/plain, Size: 3187 bytes --]

A lot of the talk of PG on the use of plain text made sense 30 years ago,
but not anymore. Today there is very little difference in the longevity of
plain text and an ePub, there is some but minimal...and books are more than
just the text they contain.

Back in the mid-80's people pleaded with Michael Hart to keep an image file
and allow the use of LaTeX, but he would never budge.

The backfile should really be marked up, but that is very hard because of
the lack of an image file and the current size of PG.

That said, the best way to accomplish anything near that goal would be to
have something close to a PGT Pandoc reader, so some of it could be
processed automatically.

Lots of PGT readers have been written and even PG has a format-checker that
sort of works most of the time...but they all fall short, and when I want to
read any PG material I find myself marking it up in LaTeX first to process and
send it to a tablet.

Paulo Ney
On Jun 26, 2014 4:16 PM, "Jesse Rosenthal" <jrosenthal-4GNroTWusrE@public.gmane.org> wrote:

> Daniel Staal <DStaal-Jdbf3xiKgS8@public.gmane.org> writes:
> > Hmm.  If pandoc's going to be doing a writer anyway, how hard is a
> reader?
> > Then the question is how many people we can have feeding documents...
>
> Well, it would be mainly hard because there are a ton of ambiguities in
> the old PG conventions---especially as it relates to using CAPS for
> emphasis. If you're converting to underscore emphasis, there's no good
> way to know which words should be capitalized or not (though I've come
> up with some fairly reliable heuristics for making epubs). And plenty of
> books have places where people indignantly insist that "*I* would never
> do such a thing"---such emphasis on the single capital letter is totally
> lost.
>
> Plus, chapter and header standards are pretty ad-hoc. I mean, I can
> understand it, so a smart parser probably could too, but I think it
> might be fairly hard to get it consistently right-ish. And only right-ish
> is probably worse for them than what they have.
>


* Re: Plain text writer?
  2014-06-26 20:18                                           ` Daniel Staal
@ 2014-06-26 22:26                                             ` BP Jonsson
  0 siblings, 0 replies; 22+ messages in thread
From: BP Jonsson @ 2014-06-26 22:26 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

2014-06-26 22:18, Daniel Staal skrev:
> But it'd be a good start, changing the problem from 'convert the
> whole document' to 'edit the conversion',

Indeed.  That's what I do all the time.
I go doc(x) -> odt -> html -> perl filter -> pandoc -> markdown
-> hand editing -> pandoc -> LaTeX -> PDF
rather than 'dump the docx as plaintext' -> 'mark up LaTeX by hand'.

It actually takes less than half the time. Even the LibreOffice
extensions which output LaTeX are bad for me, because I need
Unicode-XeLaTeX-oriented LaTeX.
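
For illustration, the automated stages of that chain could be written down
roughly as follows. This is a sketch under assumptions: the function name is
made up, the file names are derived from the input name, and the Perl filter
is left as a comment because its contents are not shown in this thread. The
function only builds the command lines; it does not run them.

```python
from pathlib import Path


def conversion_commands(docx: str) -> list[list[str]]:
    """Return the command lines for the automated conversion stages."""
    stem = Path(docx).stem
    return [
        # doc(x) -> odt -> html via LibreOffice in headless mode
        ["soffice", "--headless", "--convert-to", "odt", docx],
        ["soffice", "--headless", "--convert-to", "html", f"{stem}.odt"],
        # (the Perl filter would clean up the HTML at this point)
        ["pandoc", "-f", "html", "-t", "markdown",
         "-o", f"{stem}.md", f"{stem}.html"],
        # (hand editing of the Markdown happens at this point)
        ["pandoc", "-f", "markdown", "-t", "latex", "-s",
         "-o", f"{stem}.tex", f"{stem}.md"],
        # LaTeX -> PDF with XeLaTeX for Unicode-oriented input
        ["xelatex", f"{stem}.tex"],
    ]


for command in conversion_commands("book.docx"):
    print(" ".join(command))
```

Each list could be passed to subprocess.run once the manual steps in between
are done.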

/bpj



* Re: Plain text writer?
  2014-06-25 22:44                         ` Daniel Staal
  2014-06-26 12:43                           ` BP Jonsson
@ 2014-06-29 18:27                           ` John MacFarlane
       [not found]                             ` <20140629182730.GA14549-bi+AKbBUZKbivNSvqvJHCtPlBySK3R6THiGdP5j34PU@public.gmane.org>
  1 sibling, 1 reply; 22+ messages in thread
From: John MacFarlane @ 2014-06-29 18:27 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

I've started a gutenberg branch on github.  It should be fairly
easy to add a writer that uses PG conventions.
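
As a rough illustration of the PG conventions quoted in this message
(em-dashes as two hyphens, headings as paragraphs with extra space around
them, tables as plain aligned columns), here is a small Python sketch. It is
not the code on the gutenberg branch; the function names, the 72-column
default, and the exact number of blank lines around headings are assumptions.

```python
import textwrap


def pg_paragraph(text: str, width: int = 72) -> str:
    """Reflow a paragraph and render em-dashes as two hyphens."""
    text = text.replace("\u2014", "--")
    return textwrap.fill(" ".join(text.split()), width=width)


def pg_table(rows: list[list[str]]) -> str:
    """Lay out a table as padded columns with no separator characters."""
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


def pg_document(blocks: list[tuple[str, str]]) -> str:
    """Join ("heading" | "para", text) blocks into one plain-text document."""
    parts = []
    for kind, text in blocks:
        body = pg_paragraph(text)
        # Assumption: a heading is an ordinary paragraph with one extra
        # blank line before and after it.
        parts.append("\n" + body + "\n" if kind == "heading" else body)
    return "\n\n".join(parts) + "\n"


print(pg_document([("heading", "Chapter One"),
                   ("para", "It was late\u2014too late.")]))
print(pg_table([["Name", "Year"], ["Emma", "1815"]]))
```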

+++ Daniel Staal [Jun 25 14 18:44 ]:
>--As of June 25, 2014 12:46:40 PM -0700, John MacFarlane is alleged to 
>have said:
>
>>I think a pg writer is a nice idea.  It would be fairly easy, I think,
>>to do.  Currently the plain writer is implemented as an option in the
>>markdown writer.  The pg writer would involve a new option and
>>a few different behaviors.
>>
>>Is there documentation somewhere for pg text standards?
>
>--As for the rest, it is mine.
>
>Here's what I found:
><http://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_FAQ#About_the_formatting_of_a_text_file>
>
>The main difference I see from Markdown is actually tables: They 
>expect tables to be just formatted columns with no separators.  
>(Although they do mention borders if you need them, but prefer not 
>to.)  Oh, and em-dashes are two hyphens, not three.  (And headings are 
>paragraphs with extra space around them.)
>
>They don't specifically mention lists in that section, so I assume 
>it's probably convention-defined - probably as an extension of 
>'preformatted text', which they indent.
>
>There may be more detailed descriptions someplace, but that seems like 
>a start.  Of course it's not a 'formal' definition: It's guidelines 
>they expect human editors to follow, not exact rules.
>
>Daniel T. Staal
>
>---------------------------------------------------------------
>This email copyright the author.  Unless otherwise noted, you
>are expressly allowed to retransmit, quote, or otherwise use
>the contents for non-commercial purposes.  This copyright will
>expire 5 years after the author's death, or in 30 years,
>whichever is longer, unless such a period is in excess of
>local copyright law.
>---------------------------------------------------------------

* Re: Plain text writer?
       [not found]                             ` <20140629182730.GA14549-bi+AKbBUZKbivNSvqvJHCtPlBySK3R6THiGdP5j34PU@public.gmane.org>
@ 2014-06-29 18:29                               ` Fred Zimmerman
  0 siblings, 0 replies; 22+ messages in thread
From: Fred Zimmerman @ 2014-06-29 18:29 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 3970 bytes --]

I'm very interested in the gutenberg branch -- great idea.

Fred Zimmerman
Ann Arbor, Michigan, USA
"a fox, not a hedgehog" -- Isaiah Berlin


On Sun, Jun 29, 2014 at 2:27 PM, John MacFarlane <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org> wrote:

> I've started a gutenberg branch on github.  It should be fairly
> easy to add a writer that uses PG conventions.
>
> +++ Daniel Staal [Jun 25 14 18:44 ]:
>
>  --As of June 25, 2014 12:46:40 PM -0700, John MacFarlane is alleged to
>> have said:
>>
>>  I think a pg writer is a nice idea.  It would be fairly easy, I think,
>>> to do.  Currently the plain writer is implemented as an option in the
>>> markdown writer.  The pg writer would involve a new option and
>>> a few different behaviors.
>>>
>>> Is there documentation somewhere for pg text standards?
>>>
>>
>> --As for the rest, it is mine.
>>
>> Here's what I found:
>> <http://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_
>> FAQ#About_the_formatting_of_a_text_file>
>>
>> The main difference I see from Markdown is actually tables: They expect
>> tables to be just formatted columns with no separators.  (Although they do
>> mention borders if you need them, but prefer not to.)  Oh, and em-dashes
>> are two hyphens, not three.  (And headings are paragraphs with extra space
>> around them.)
>>
>> They don't specifically mention lists in that section, so I assume it's
>> probably convention-defined - probably as an extension of 'preformatted
>> text', which they indent.
>>
>> There may be more detailed descriptions someplace, but that seems like a
>> start.  Of course it's not a 'formal' definition: It's guidelines they
>> expect human editors to follow, not exact rules.
>>
>> Daniel T. Staal
>>
>> ---------------------------------------------------------------
>> This email copyright the author.  Unless otherwise noted, you
>> are expressly allowed to retransmit, quote, or otherwise use
>> the contents for non-commercial purposes.  This copyright will
>> expire 5 years after the author's death, or in 30 years,
>> whichever is longer, unless such a period is in excess of
>> local copyright law.
>> ---------------------------------------------------------------


end of thread, other threads:[~2014-06-29 18:29 UTC | newest]

Thread overview: 22+ messages
2014-06-23 15:18 Plain text writer? David Cortesi
     [not found] ` <93c13503-9c76-4c74-9a14-75e3e2c0cc88-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2014-06-23 15:53   ` Shahbaz Youssefi
     [not found]     ` <CALeOzZ9d5y=vuandCy_LiTmYg6088sd6+zBSWrknjO54TX8XAA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-06-24  5:05       ` David Cortesi
     [not found]         ` <3ee59df7-6a6c-48e2-ab21-18d6abfa991a-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2014-06-24 17:43           ` Daniel Staal
2014-06-25 15:01             ` David Cortesi
     [not found]               ` <bc06c1a1-18b3-4e9c-a7d1-c4ba55c43a3d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2014-06-25 15:32                 ` Paulo Ney de Souza
     [not found]                   ` <CAFVhNZPLe_3nm=LBTPRrnr1TBuj8d1vEhWT3Mxq=e69OPdZK3A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-06-25 19:46                     ` John MacFarlane
     [not found]                       ` <20140625194640.GA74174-bi+AKbBUZKbivNSvqvJHCtPlBySK3R6THiGdP5j34PU@public.gmane.org>
2014-06-25 22:44                         ` Daniel Staal
2014-06-26 12:43                           ` BP Jonsson
     [not found]                             ` <53AC1584.5000807-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-06-26 18:12                               ` Daniel Staal
     [not found]                             ` <54948B80A67EB226AD63A8D6@192.168.1.50>
     [not found]                               ` <54948B80A67EB226AD63A8D6-Q0ErXNX1RuZz+/J76PBWHg@public.gmane.org>
2014-06-26 18:31                                 ` Paulo Ney de Souza
     [not found]                                   ` <CAFVhNZMM-ASog+2Km3MSfyDKPMXqWfY=Vz71L6EvV=95azU8dA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-06-26 18:36                                     ` Daniel Staal
2014-06-26 19:17                                       ` Jesse Rosenthal
     [not found]                                         ` <87k3837a6g.fsf-4GNroTWusrE@public.gmane.org>
2014-06-26 20:18                                           ` Daniel Staal
2014-06-26 22:26                                             ` BP Jonsson
2014-06-26 21:06                                           ` Paulo Ney de Souza
     [not found]                                   ` <E5E9640CFF97B8EA4C7F509C@192.168.1.50>
     [not found]                                     ` <E5E9640CFF97B8EA4C7F509C-Q0ErXNX1RuZz+/J76PBWHg@public.gmane.org>
2014-06-26 18:57                                       ` Paulo Ney de Souza
2014-06-29 18:27                           ` John MacFarlane
     [not found]                             ` <20140629182730.GA14549-bi+AKbBUZKbivNSvqvJHCtPlBySK3R6THiGdP5j34PU@public.gmane.org>
2014-06-29 18:29                               ` Fred Zimmerman
2014-06-23 15:54   ` Daniel Staal
2014-06-23 16:09   ` John Gabriele
2014-06-26 15:50   ` David Cortesi
