* markdown writer line wrapping
From: Nathan Gass @ 2010-11-24 1:51 UTC
To: pandoc-discuss

Some questions about the line wrapping implementation for the
markdown writer.

Is it correct that line wrapping only happens at the top level, so
something like *very long emphasized text ... end of it* currently
does not get wrapped?

And how would I go about implementing arbitrarily deep line wrapping,
so that something like *very long emphasized text [with a very long
citation inside for @key p. 10]* gets wrapped correctly?

How does line-wrapping work in pandoc?

Nathan

* Re: markdown writer line wrapping
From: John MacFarlane @ 2010-11-24 3:25 UTC
To: pandoc-discuss

+++ Nathan Gass [Nov 24 10 02:51 ]:
> Some questions about the line wrapping implementation for the
> markdown writer.
>
> Is it correct that line wrapping only happens at the top level, so
> something like *very long emphasized text ... end of it* currently
> does not get wrapped?

Right.

> And how would I go about implementing arbitrarily deep line wrapping,
> so that something like *very long emphasized text [with a very long
> citation inside for @key p. 10]* gets wrapped correctly?
>
> How does line-wrapping work in pandoc?

See wrapped in Text.Pandoc.Shared. (Also wrappedMarkdown in the
Markdown writer, which handles complications due to line breaks.)

You're right, it splits an [Inline] by Space at the top level, then
applies a function [Inline] -> m Doc to the sublists that result, then
applies fsep (from the PrettyPrint library) to combine the resulting
[Doc] into a single Doc with line wrapping.

Unfortunately, I can't see an easy fix (one that doesn't require major
architectural changes).

John

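The top-level-only behaviour described above can be sketched roughly as follows. This is a minimal illustration, not pandoc's actual code: the Inline type is a hypothetical three-constructor stand-in, the names splitBySpace, inlineToDoc, wrappedSketch and renderAt are invented for the example, and the monadic [Inline] -> m Doc function is simplified to a pure one. Only fsep, hcat, text, char, style and renderStyle come from the real 'pretty' package.

```haskell
import Text.PrettyPrint
  ( Doc, Style (..), char, fsep, hcat, renderStyle, style, text )

-- Hypothetical, heavily simplified stand-in for pandoc's Inline type.
data Inline
  = Str String
  | Space
  | Emph [Inline]
  deriving (Eq, Show)

-- Split on Space elements at the top level only. A Space nested inside
-- Emph is never inspected here, which is the limitation under discussion.
splitBySpace :: [Inline] -> [[Inline]]
splitBySpace xs =
  case break (== Space) xs of
    (chunk, [])       -> keep chunk
    (chunk, _ : rest) -> keep chunk ++ splitBySpace rest
  where
    keep c = [c | not (null c)]

-- Render one inline with no wrapping of its own; nested spaces become
-- literal characters inside a single unbreakable Doc.
inlineToDoc :: Inline -> Doc
inlineToDoc (Str s)   = text s
inlineToDoc Space     = char ' '
inlineToDoc (Emph ys) = hcat [char '*', hcat (map inlineToDoc ys), char '*']

-- Each top-level chunk becomes one Doc; fsep then fills lines with as
-- many chunks as fit.
wrappedSketch :: [Inline] -> Doc
wrappedSketch = fsep . map (hcat . map inlineToDoc) . splitBySpace

-- Render at a given line length (ribbonsPerLine = 1 so that the whole
-- line width is usable for text).
renderAt :: Int -> Doc -> String
renderAt cols = renderStyle style { lineLength = cols, ribbonsPerLine = 1 }

-- Example input: one plain word followed by a long emphasized span.
example :: [Inline]
example = [Str "a", Space, Emph [Str "very", Space, Str "long", Space, Str "emphasis"]]
```

In GHCi, putStrLn (renderAt 10 (wrappedSketch example)) shows the emphasized span staying on one over-long line, because its inner spaces were never offered to fsep as break points.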
* Re: markdown writer line wrapping
From: John MacFarlane @ 2010-12-13 4:38 UTC
To: pandoc-discuss

+++ John MacFarlane [Nov 23 10 19:25 ]:
> +++ Nathan Gass [Nov 24 10 02:51 ]:
> > Some questions about the line wrapping implementation for the
> > markdown writer.
> >
> > Is it correct that line wrapping only happens at the top level, so
> > something like *very long emphasized text ... end of it* currently
> > does not get wrapped?
>
> Right.
>
> > And how would I go about implementing arbitrarily deep line wrapping,
> > so that something like *very long emphasized text [with a very long
> > citation inside for @key p. 10]* gets wrapped correctly?
> >
> > How does line-wrapping work in pandoc?
>
> See wrapped in Text.Pandoc.Shared. (Also wrappedMarkdown in the
> Markdown writer, which handles complications due to line breaks.)
>
> You're right, it splits an [Inline] by Space at the top level, then applies a
> function [Inline] -> m Doc to the sublists that result, then applies
> fsep (from the PrettyPrint library) to combine the resulting [Doc]
> into a single Doc with line wrapping.
>
> Unfortunately, I can't see an easy fix (one that doesn't require major
> architectural changes).

I've been working on a small prettyprinting library to use instead of
the one from 'pretty'. It's designed to be a better fit for pandoc,
and it solves this line wrapping issue. Unfortunately, it's currently
slower than the standard prettyprinting library.

If you want to look at it, it's in the 'pretty' branch of jgm/pandoc
on github. I haven't worked much yet on optimizing it, and so far I've
just worked it into the markdown writer -- and only incompletely.
Benchmarks show that it's significantly slower than the old version:
33 ms vs 18 ms.

Nonetheless I'm thinking about using it to replace
Text.PrettyPrint.HughesPJ throughout, if it can be optimized a bit...
Right now I'm using DLists; I might try using Blaze.Builder.

John

* Re: markdown writer line wrapping
From: John MacFarlane @ 2010-12-18 21:46 UTC
To: pandoc-discuss

+++ John MacFarlane [Nov 23 10 19:25 ]:
> +++ Nathan Gass [Nov 24 10 02:51 ]:
> > Some questions about the line wrapping implementation for the
> > markdown writer.
> >
> > Is it correct that line wrapping only happens at the top level, so
> > something like *very long emphasized text ... end of it* currently
> > does not get wrapped?
>
> Right.
>
> > And how would I go about implementing arbitrarily deep line wrapping,
> > so that something like *very long emphasized text [with a very long
> > citation inside for @key p. 10]* gets wrapped correctly?
> >
> > How does line-wrapping work in pandoc?
>
> See wrapped in Text.Pandoc.Shared. (Also wrappedMarkdown in the
> Markdown writer, which handles complications due to line breaks.)
>
> You're right, it splits an [Inline] by Space at the top level, then applies a
> function [Inline] -> m Doc to the sublists that result, then applies
> fsep (from the PrettyPrint library) to combine the resulting [Doc]
> into a single Doc with line wrapping.
>
> Unfortunately, I can't see an easy fix (one that doesn't require major
> architectural changes).

OK, I've made the major architectural changes that were required.
HEAD now contains a new prettyprinting library that is much better
suited to pandoc than Text.PrettyPrint.HughesPJ (which is designed
for source code, not text).

Wrapping in markdown, plain, and rst now works much better.
Also, duplicate blank lines are eliminated.

It still remains to adapt the other writers to use the new library,
but this can happen gradually.

I've also added a --columns option to pandoc.

John

* Re: markdown writer line wrapping
From: John MacFarlane @ 2010-12-18 22:58 UTC
To: pandoc-discuss

+++ John MacFarlane [Dec 18 10 13:46 ]:
> +++ John MacFarlane [Nov 23 10 19:25 ]:
> > +++ Nathan Gass [Nov 24 10 02:51 ]:
> > > Some questions about the line wrapping implementation for the
> > > markdown writer.
> > >
> > > Is it correct that line wrapping only happens at the top level, so
> > > something like *very long emphasized text ... end of it* currently
> > > does not get wrapped?
> >
> > Right.
> >
> > > And how would I go about implementing arbitrarily deep line wrapping,
> > > so that something like *very long emphasized text [with a very long
> > > citation inside for @key p. 10]* gets wrapped correctly?
> > >
> > > How does line-wrapping work in pandoc?
> >
> > See wrapped in Text.Pandoc.Shared. (Also wrappedMarkdown in the
> > Markdown writer, which handles complications due to line breaks.)
> >
> > You're right, it splits an [Inline] by Space at the top level, then applies a
> > function [Inline] -> m Doc to the sublists that result, then applies
> > fsep (from the PrettyPrint library) to combine the resulting [Doc]
> > into a single Doc with line wrapping.
> >
> > Unfortunately, I can't see an easy fix (one that doesn't require major
> > architectural changes).
>
> OK, I've made the major architectural changes that were required.
> HEAD now contains a new prettyprinting library that is much better
> suited to pandoc than Text.PrettyPrint.HughesPJ (which is designed
> for source code, not text).
>
> Wrapping in markdown, plain, and rst now works much better.
> Also, duplicate blank lines are eliminated.
>
> It still remains to adapt the other writers to use the new library,
> but this can happen gradually.
>
> I've also added a --columns option to pandoc.

To clarify: this specifies the column width for text wrapping.

I should also note that the new prettyprinting library is significantly
faster - so we have good speed improvements in the writers that use
it.

John

* Re: markdown writer line wrapping
From: Simon Michael @ 2010-12-18 23:46 UTC
To: pandoc-discuss

Very nice.

* Re: markdown writer line wrapping
From: BP Jonsson @ 2010-12-20 18:50 UTC
To: pandoc-discuss

2010-12-18 23:58, John MacFarlane wrote:
> +++ John MacFarlane [Dec 18 10 13:46 ]:
>> +++ John MacFarlane [Nov 23 10 19:25 ]:
>>> +++ Nathan Gass [Nov 24 10 02:51 ]:
>>>> Some questions about the line wrapping implementation for the
>>>> markdown writer.
>>>>
>>>> Is it correct that line wrapping only happens at the top level, so
>>>> something like *very long emphasized text ... end of it* currently
>>>> does not get wrapped?
>>>
>>> Right.
>>>
>>>> And how would I go about implementing arbitrarily deep line wrapping,
>>>> so that something like *very long emphasized text [with a very long
>>>> citation inside for @key p. 10]* gets wrapped correctly?
>>>>
>>>> How does line-wrapping work in pandoc?
>>>
>>> See wrapped in Text.Pandoc.Shared. (Also wrappedMarkdown in the
>>> Markdown writer, which handles complications due to line breaks.)
>>>
>>> You're right, it splits an [Inline] by Space at the top level, then applies a
>>> function [Inline] -> m Doc to the sublists that result, then applies
>>> fsep (from the PrettyPrint library) to combine the resulting [Doc]
>>> into a single Doc with line wrapping.
>>>
>>> Unfortunately, I can't see an easy fix (one that doesn't require major
>>> architectural changes).
>>
>> OK, I've made the major architectural changes that were required.
>> HEAD now contains a new prettyprinting library that is much better
>> suited to pandoc than Text.PrettyPrint.HughesPJ (which is designed
>> for source code, not text).
>>
>> Wrapping in markdown, plain, and rst now works much better.
>> Also, duplicate blank lines are eliminated.
>>
>> It still remains to adapt the other writers to use the new library,
>> but this can happen gradually.
>>
>> I've also added a --columns option to pandoc.
>
> To clarify: this specifies the column width for text wrapping.
>
> I should also note that the new prettyprinting library is significantly
> faster - so we have good speed improvements in the writers that use
> it.
>
> John
>

When will these things be in the release?

And what about the other things I whined ;-) about recently?

> * I prefer asterisk or plus as list bullets, the markdown writer
>   uses hyphen.
> * I like to use underscores for emphasis and asterisks for strong
>   emphasis. The markdown writer uses asterisks for both.
> * I prefer a smaller wrap width than 75 columns.
> * I prefer \ + newline for hard breaks, the markdown writer uses
>   two spaces + newline.
> * The markdown writer squeezes tables laterally. I prefer the
>   left margin of columns to fall at tabstops.
> * Delimited code blocks are converted to indented code blocks,
>   and any highlighting classes are lost -- which is really serious.
>
> The first three are minor annoyances which can be fixed with a perl
> oneliner but the others are each more serious than the preceding.
> The desirability of configurability and a config file rears its
> ugly head again.

Is the google code bug tracker still the place to go? And should I
enter them as one or several issues? I guess one enhancement request
for configurability of the first four and one bug each for the last
two.

/bpj

* Re: markdown writer line wrapping
From: John MacFarlane @ 2010-12-20 19:16 UTC
To: pandoc-discuss

+++ BP Jonsson [Dec 20 10 19:50 ]:
> 2010-12-18 23:58, John MacFarlane wrote:
> > > I've also added a --columns option to pandoc.
> >
> > To clarify: this specifies the column width for text wrapping.
> >
> > I should also note that the new prettyprinting library is significantly
> > faster - so we have good speed improvements in the writers that use
> > it.
> >
> > John
>
> When will these things be in the release?

They will be in the 1.7 release. I expect this will happen within
a few weeks.

> And what about the other things I whined ;-) about recently?
>
> > * I prefer asterisk or plus as list bullets, the markdown writer
> >   uses hyphen.

Everyone has different preferences here. I can't satisfy
everyone!

> > * I like to use underscores for emphasis and asterisks for strong
> >   emphasis. The markdown writer uses asterisks for both.

See above.

> > * I prefer a smaller wrap width than 75 columns.

You can get that now using --columns.

> > * I prefer \ + newline for hard breaks, the markdown writer uses
> >   two spaces + newline.

The reason for this is that the 2 spaces + newline is compatible with
standard markdown. I've tried to keep the writer's output compatible,
where possible. This seems valuable.

> > * The markdown writer squeezes tables laterally. I prefer the
> >   left margin of columns to fall at tabstops.

Pandoc tries to size the table columns to the same proportions as they
were in the original document. The new writer should be a bit better
at preserving absolute widths in the case where your input and output
have the same column size. But there are always potential rounding
issues.

> > * Delimited code blocks are converted to indented code blocks,
> >   and any highlighting classes are lost -- which is really serious.

Again, this was motivated by the desire to keep the output compatible
with standard markdown, which doesn't have delimited code blocks.
Think about the use case of someone converting HTML to markdown.

But maybe what I should do is make the writer sensitive to the --strict
flag?

> Is the google code bug tracker still the place to go?

Yes.

John

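The rounding issue with proportional table columns can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names are invented, and the message does not say whether pandoc floors, rounds or otherwise adjusts the widths. The sketch only shows why converting a fraction of the line back into whole characters can lose a character.

```haskell
-- Fraction of the source line occupied by each column, given the source
-- line length and the column widths in characters.
relativeWidths :: Int -> [Int] -> [Double]
relativeWidths sourceCols =
  map (\w -> fromIntegral w / fromIntegral sourceCols)

-- Character widths at the target line length. Flooring is an assumed
-- choice for this sketch; any conversion back to whole characters has
-- to discard the fractional part somehow.
columnWidths :: Int -> [Double] -> [Int]
columnWidths targetCols =
  map (\f -> floor (f * fromIntegral targetCols))
```

With an 80-column source and columns of 10, 15 and 20 characters, columnWidths 80 (relativeWidths 80 [10, 15, 20]) reproduces the original widths, while a 72-column target gives the middle column 13.5 characters, which has to become a whole number.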
* Re: markdown writer line wrapping
From: John MacFarlane @ 2010-12-21 3:50 UTC
To: pandoc-discuss

+++ John MacFarlane [Dec 20 10 11:16 ]:
> +++ BP Jonsson [Dec 20 10 19:50 ]:
> > 2010-12-18 23:58, John MacFarlane wrote:
> > > > I've also added a --columns option to pandoc.
> > >
> > > To clarify: this specifies the column width for text wrapping.
> > >
> > > I should also note that the new prettyprinting library is significantly
> > > faster - so we have good speed improvements in the writers that use
> > > it.
> > >
> > > John
> >
> > When will these things be in the release?
>
> They will be in the 1.7 release. I expect this will happen within
> a few weeks.
>
> > And what about the other things I whined ;-) about recently?
> >
> > > * I prefer asterisk or plus as list bullets, the markdown writer
> > >   uses hyphen.
>
> Everyone has different preferences here. I can't satisfy
> everyone!
>
> > > * I like to use underscores for emphasis and asterisks for strong
> > >   emphasis. The markdown writer uses asterisks for both.
>
> See above.
>
> > > * I prefer a smaller wrap width than 75 columns.
>
> You can get that now using --columns.
>
> > > * I prefer \ + newline for hard breaks, the markdown writer uses
> > >   two spaces + newline.
>
> The reason for this is that the 2 spaces + newline is compatible with
> standard markdown. I've tried to keep the writer's output compatible,
> where possible. This seems valuable.
>
> > > * The markdown writer squeezes tables laterally. I prefer the
> > >   left margin of columns to fall at tabstops.
>
> Pandoc tries to size the table columns to the same proportions as they were
> in the original document. The new writer should be a bit better at preserving
> absolute widths in the case where your input and output have the same column
> size. But there are always potential rounding issues.
>
> > > * Delimited code blocks are converted to indented code blocks,
> > >   and any highlighting classes are lost -- which is really serious.
>
> Again, this was motivated by the desire to keep the output compatible
> with standard markdown, which doesn't have delimited code blocks.
> Think about the use case of someone converting HTML to markdown.
>
> But maybe what I should do is make the writer sensitive to the --strict
> flag?

OK, I've changed the markdown writer so that, provided you haven't
used the --strict option,

- it will use \ for line breaks
- it will use a delimited code block if you've specified any attributes

John

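The two rules in this message can be sketched as a pair of small functions. This is only an illustration of the decision logic: the Dialect type, the reduction of attributes to a list of class names, and the four-tilde fence are all assumptions made for the example and are not taken from pandoc's writer.

```haskell
-- Hypothetical switch standing in for "was --strict given?".
data Dialect = Strict | Extended
  deriving (Eq, Show)

-- Attributes reduced to class names, e.g. ["haskell"] (an assumption;
-- real code block attributes carry more than classes).
type Classes = [String]

-- Hard line break: two trailing spaces in strict mode, a backslash
-- before the newline otherwise.
lineBreak :: Dialect -> String
lineBreak Strict   = "  \n"
lineBreak Extended = "\\\n"

-- Code block: delimited only when not strict and at least one attribute
-- is present; otherwise indented by four spaces, which drops the classes.
codeBlock :: Dialect -> Classes -> String -> String
codeBlock dialect classes code
  | dialect == Extended && not (null classes) = delimited
  | otherwise                                 = indented
  where
    -- Assumed fixed fence; a real writer must pick a fence longer than
    -- any run of tildes inside the code itself.
    fence     = "~~~~"
    attrs     = " {" ++ unwords (map ('.' :) classes) ++ "}"
    delimited = unlines ([fence ++ attrs] ++ lines code ++ [fence])
    indented  = unlines (map ("    " ++) (lines code))
```

For instance, codeBlock Extended ["haskell"] "x = 1" keeps the class in a fenced block, whereas codeBlock Strict ["haskell"] "x = 1" falls back to the indented form that standard markdown understands.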
* Re: markdown writer line wrapping
From: BP Jonsson @ 2010-12-22 18:57 UTC
To: pandoc-discuss

2010-12-21 04:50, John MacFarlane wrote:
> OK, I've changed the markdown writer so that, provided you haven't
> used the --strict option,
>
> - it will use \ for line breaks
> - it will use a delimited code block if you've specified any attributes
>

Thanks!

> Everyone has different preferences here. I can't satisfy
> everyone!

Yeah, that's why I wanted it to be configurable.

With these changes and the --columns option there won't be anything
I can't fix with a rather simple Perl script. (Well, the
table-realigning script is perhaps not that simple... ;-)

/bpj