* Some thoughts for markdown syntax extensions
From: BP Jonsson @ 2010-11-06 13:24 UTC
To: pandoc-discuss

# Some thoughts for markdown syntax extensions

## Verbatim text

There is a need for some syntax to render text 'verbatim',
i.e. passing it through exactly as-is, without any conversion
of special characters to entities/escapes. This would be
useful when writing for a templating system or similar which
has its own markup (like `<? ?> <% %> <& &>` and whatnot).
The suggestion is that

*   any inline text beginning with at least two doublequotes
    and extending to a matching number of doublequotes

        "" ... ""  or  """ ... """  or  """" ... """"  or...

*   or a block beginning and ending with at least three
    doublequotes on their own line (but optionally followed
    by whitespace -- similar to the `~~~` for delimited code
    blocks),

be just passed through, except that these multi-doublequote
delimiters be removed, and that if the closing delimiter is
followed by a newline that newline is preserved.

This way *any* kind of extended target markup can be used
without hardcoding specific styles.

The reason for a delimiter consisting of a variable but
matching number of characters is of course that code often
contains an empty pair of doublequotes, and some languages
have a block comment style with `""" ... """`. Compare the
`C<>` and similar markup in POD, which can have any number of
delimiting angles greater than the greatest number of
consecutive `>`s which occur in the enclosed code.

On the other hand double doublequotes don't normally occur
in *text*, so that one wouldn't need to use `\"\"` very
often, if ever.

## Dashes

TeX-style conversion of `--` to U+2013 EN DASH and `---` to
U+2014 EM DASH should be supported in `--smart` mode.
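As a rough illustration of the TeX-style mapping proposed in the Dashes section, here is a minimal Python sketch. It is not pandoc's implementation: the function name `tex_dashes` is made up for this example, runs of four or more hyphens are left alone (the proposal does not say what should happen to them), and no attempt is made to skip code spans.

```python
import re

EN_DASH = "\u2013"  # U+2013 EN DASH
EM_DASH = "\u2014"  # U+2014 EM DASH


def tex_dashes(text: str) -> str:
    """Map `--` to EN DASH and `---` to EM DASH, TeX style.

    Assumption: longer hyphen runs are returned unchanged, and a
    single hyphen is never touched.
    """
    def replace_run(match: re.Match) -> str:
        run = match.group(0)
        if len(run) == 2:
            return EN_DASH
        if len(run) == 3:
            return EM_DASH
        return run  # four or more hyphens: leave as typed

    # Match each maximal run of two or more hyphens exactly once.
    return re.sub(r"-{2,}", replace_run, text)


if __name__ == "__main__":
    print(tex_dashes("pp. 10--12 --- see the well-known appendix"))
```

Matching whole hyphen runs, rather than replacing `---` first and `--` second, avoids having to think about replacement order.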
## Abbreviations, acronyms and definitions

There should be syntaxes for abbreviations, acronyms and definitions.

### Abbreviations and acronyms

The abbreviation and acronym syntaxes should be reference-style
and cause any instance of their argument in the text to be
surrounded by appropriate markup:

    *[etc.]: Et cetera - and so on
    **[HTML]: Hypertext Markup Language

These two should both use the HTML `<abbr>` tag, since `<acronym>`
is deprecated, but the latter should have a class="acronym" so
that one can apply different CSS if desired:

    <abbr title="Et cetera - and so on">etc.</abbr>

    <abbr class="acronym" title="Hypertext Markup Language"
    >HTML</abbr>

    abbr { text-decoration: none; border-bottom: 1pt dotted #000; }

    abbr.acronym {
        text-decoration: none;
        border-bottom: 1pt dotted #000;
        text-transform: lowercase;
        font-variant: small-caps;
    }

Furthermore the definition should optionally contain a URL to
a definition of the abbreviation/acronym:

    **[HTML]: Hypertext Markup Language
        "http://www.acronyms.net/h.xhtml#html-acronym"

To be rendered in HTML as

    <abbr class="acronym" title="Hypertext Markup Language"
    ><a href="http://www.acronyms.net/h.xhtml#html-acronym"
    title="Hypertext Markup Language">HTML</a></abbr>

N.B. the title attribute *needs* to be duplicated, and the
anchor needs to be present, because some browsers -- notably
screen readers :-( and not just the usual culprit -- are
dumb.

### Definitions

Definitions should be either inline or reference:

    Most ?[Romance languages](The modern languages descended from
    Latin) have palatalized consonants.

    <dfn title="The modern languages descended from Latin"
    >Romance languages</dfn>

    Most ?[Romance languages] have palatalized consonants.

    ?[Romance languages]: The modern languages descended from Latin
        "http://en.wikipedia.org/wiki/Romance_languages"

    <dfn title="The modern languages descended from Latin"
    ><a href="http://en.wikipedia.org/wiki/Romance_languages"
    title="The modern languages descended from Latin"
    >Romance languages</a></dfn>

Of course only the reference style should allow for a URL
(it has got to be better at something, right?)

## Small-caps

The thorniest problem with finding a syntax for small-caps
is that

*   the markdown should preferably *use* capitals so that it
    doesn't just look like ordinary lowercase/mixed-case text
    in funny delimiters -- that HTML/CSS/LaTeX small-caps err
    on that account is no excuse for markdown --,
*   and at the same time it should be possible to include
    big-caps: "Caesar" in small-caps should be (faked with
    Unicode phonetic letters which I hope everyone can see!)
    "Cᴀᴇꜱᴀʀ", not "ᴄᴀᴇꜱᴀʀ".

A possible solution is that the markup for small-caps be
capital letters delimited by double pipe characters, and that
any letter inside which is preceded by a single unescaped pipe
character be rendered as a (big) capital:

    If naturally descended |||IULIU |CAESARE|| would
    perhaps have become _Juil Cierre_ rather than
    _Jules César_ in French.

    <p>If naturally descended <span class="smallcaps">Iuliu
    Caesare</span> would perhaps have become <em>Juil
    Cierre</em> rather than <em>Jules César</em> in French.</p>

Truth to tell I did prefer `^^^IULIU ^CAESARE^^`, because
carets are small things pointing upwards, but one may want
to include superscripts in a small-caps span, and seen side
by side `|||ROMANICE||` looks less messy than `^^^ROMANICE^^`,
or conversely three carets in a row look worse than three
pipes in a row, probably because of the white gap below them.

/bpj

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.
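The small-caps rule from the message above (capitals between double pipes, with a single pipe marking a letter that stays a big capital) could be sketched as follows. This is only an illustration of the proposed syntax, not anything pandoc implements; the names `render_smallcaps` and `_convert_inner` are invented here, and the output assumes a stylesheet that gives `span.smallcaps` the rule `font-variant: small-caps`.

```python
import html
import re

# A span is opened and closed by a double pipe; the body is matched lazily.
SMALLCAPS_SPAN = re.compile(r"\|\|(.+?)\|\|", re.S)


def _convert_inner(inner: str) -> str:
    """Lowercase every letter except those preceded by a single pipe."""
    out = []
    i = 0
    while i < len(inner):
        ch = inner[i]
        if ch == "\\" and i + 1 < len(inner) and inner[i + 1] == "|":
            out.append("|")  # escaped pipe: keep a literal pipe
            i += 2
        elif ch == "|" and i + 1 < len(inner):
            out.append(inner[i + 1].upper())  # marked letter stays a big capital
            i += 2
        else:
            out.append(ch.lower())  # everything else becomes a small cap
            i += 1
    return "".join(out)


def render_smallcaps(text: str) -> str:
    """Replace each ||...|| span with an HTML span of class "smallcaps"."""
    def replace_span(match: re.Match) -> str:
        body = html.escape(_convert_inner(match.group(1)))
        return '<span class="smallcaps">' + body + "</span>"

    return SMALLCAPS_SPAN.sub(replace_span, text)


if __name__ == "__main__":
    print(render_smallcaps("If naturally descended |||IULIU |CAESARE|| would ..."))
```

Note that the opening `|||` is read as the `||` delimiter followed by a `|` that marks the first letter as a big capital, which is how the example in the message is built.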
* Re: Some thoughts for markdown syntax extensions
From: BP Jonsson @ 2010-11-06 13:44 UTC
To: pandoc-discuss

2010-11-06 14:24, BP Jonsson wrote:

> * any inline text beginning with at least two doublequotes
>   and extending to a matching number of doublequotes
>
>       "" ... ""  or  """ ... """  or  """" ... """"  or...
>
> * or a block beginning and ending with at least three
>   doublequotes on their own line (but optionally followed
>   by whitespace -- similar to the `~~~` for delimited code
>   blocks),
>
> be just passed through, except that these multi-doublequote
> delimiters be removed, and that if the closing delimiter is
> followed by a newline that newline is preserved.
>
> This way *any* kind of extended target markup can be used
> without hardcoding specific styles.

I forgot to mention that this could also be used to get
inline markdown in a LaTeX command argument converted:

    ""\MyCmd{""Argument with *emphasized* word""}""

though currently I have a rather well-working trick for that:

    \Foo{MyCmd}Argument with *emphasized* word\ooF

Then a script replaces `\Foo{MyCmd}` with `\MyCmd{` and
`\ooF` with `}`. Voilà!

/bpj
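The replacement script mentioned at the end of this message is not shown in the thread, so the following is only a guess at its shape: a minimal Python sketch, assuming it is run over pandoc's LaTeX output and that `\Foo{MyCmd}` and `\ooF` reach that output unchanged. The function name `unfoo` and the restriction of command names to ASCII letters and `@` are assumptions made for this example.

```python
import re

# \Foo{Name} marks the start of an argument for the command \Name.
FOO_OPEN = re.compile(r"\\Foo\{([A-Za-z@]+)\}")


def unfoo(latex: str) -> str:
    """Turn \\Foo{MyCmd}...\\ooF into \\MyCmd{...}."""
    # A function is used as the replacement so that backslashes in the
    # result are taken literally rather than as regex escapes.
    opened = FOO_OPEN.sub(lambda m: "\\" + m.group(1) + "{", latex)
    return opened.replace("\\ooF", "}")


if __name__ == "__main__":
    converted = r"\Foo{MyCmd}Argument with \emph{emphasized} word\ooF"
    print(unfoo(converted))
```

Because the markers sit outside the argument text, pandoc converts the argument as ordinary markdown, and only afterwards does the script put the braces back.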
* Re: Some thoughts for markdown syntax extensions
From: John MacFarlane @ 2010-11-06 18:47 UTC
To: pandoc-discuss

+++ BP Jonsson [Nov 06 10 14:24 ]:
> # Some thoughts for markdown syntax extensions
>
> ## Verbatim text
>
> There is a need for some syntax to render text 'verbatim',
> i.e. passing it through exactly as-is, without any conversion
> of special characters to entities/escapes. This would be
> useful when writing for a templating system or similar which
> has its own markup (like `<? ?> <% %> <& &>` and whatnot).
> The suggestion is that
>
> * any inline text beginning with at least two doublequotes
>   and extending to a matching number of doublequotes
>
>       "" ... ""  or  """ ... """  or  """" ... """"  or...
>
> * or a block beginning and ending with at least three
>   doublequotes on their own line (but optionally followed
>   by whitespace -- similar to the `~~~` for delimited code
>   blocks),
>
> be just passed through, except that these multi-doublequote
> delimiters be removed, and that if the closing delimiter is
> followed by a newline that newline is preserved.
>
> This way *any* kind of extended target markup can be used
> without hardcoding specific styles.

I'm reluctant to add a feature like this. I think that pandoc should aim
to produce valid X for any output format X. This feature would break that
guarantee. The verbatim text would only make sense for one particular
output format.

A better solution, I think, would be to change the parser
so that `<? ... ?>`, `<% ... %>`, `<& ... &>` and other standard
template tags are recognized as raw HTML.
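To make the inline rule under discussion concrete, here is a small Python sketch of a scanner for it: a run of two or more doublequotes opens a verbatim span, and only a run of exactly the same length closes it. This illustrates the proposal only; it is not part of pandoc, the name `split_verbatim` is invented, the block form is not handled, and an opener with no matching closer is simply left as ordinary text (the proposal does not specify that case).

```python
import re
from typing import List, Tuple

QUOTE_RUN = re.compile(r'"{2,}')  # a candidate opening delimiter


def split_verbatim(text: str) -> List[Tuple[str, str]]:
    """Split text into ("markdown", s) and ("verbatim", s) pieces."""
    pieces: List[Tuple[str, str]] = []
    pending = 0  # start of ordinary text not yet emitted
    scan = 0     # where to look for the next opener
    while True:
        opener = QUOTE_RUN.search(text, scan)
        if opener is None:
            break
        # The closer is a run of exactly the same length, i.e. it is
        # neither preceded nor followed by another doublequote.
        closer_re = re.compile(r'(?<!")' + re.escape(opener.group(0)) + r'(?!")')
        closer = closer_re.search(text, opener.end())
        if closer is None:
            scan = opener.end()  # unmatched run: keep it as ordinary text
            continue
        if opener.start() > pending:
            pieces.append(("markdown", text[pending:opener.start()]))
        pieces.append(("verbatim", text[opener.end():closer.start()]))
        pending = scan = closer.end()
    if pending < len(text):
        pieces.append(("markdown", text[pending:]))
    return pieces


if __name__ == "__main__":
    for kind, chunk in split_verbatim('a """x = "";""" b'):
        print(kind, repr(chunk))
```

The second example in the tests below is the `\MyCmd` line from the follow-up message: the two verbatim pieces are the command opening and the closing brace, and the argument between them remains markdown.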

> ## Dashes
>
> TeX-style conversion of `--` to U+2013 EN DASH and `---` to
> U+2014 EM DASH should be supported in --smart mode.

This has been discussed before on the list. See the earlier,
inconclusive discussion.

Currently pandoc will convert both `--` and `---` to
an EM DASH. That is because non-TeXers are likely to use the
symbols this way. Pandoc will automatically convert a `-`
between digits to an EN DASH. This sacrifices some
flexibility but promotes the goals of markdown -- you should
be able to write with normal email conventions.

> ## Abbreviations, acronyms and definitions
>
> There should be syntaxes for abbreviations, acronyms and definitions.

It's not clear what these would mean in formats other than HTML, so
I'm reluctant to complicate pandoc for this. (For HTML output, you can just
use raw HTML.) But maybe I could be persuaded.

> ### Abbreviations and acronyms
>
> The abbreviation and acronym syntaxes should be reference-style
> and cause any instance of their argument in the text to be
> surrounded by appropriate markup:
>
>     *[etc.]: Et cetera - and so on
>     **[HTML]: Hypertext Markup Language
>
> ### Definitions

See above on abbreviations/acronyms.

> ## Small-caps
>
> A possible solution is that the markup for small-caps be
> capital letters delimited by double pipe characters, and that
> any letter inside which is preceded by a single unescaped pipe
> character be rendered as a (big) capital:
>
>     If naturally descended |||IULIU |CAESARE|| would
>     perhaps have become _Juil Cierre_ rather than
>     _Jules César_ in French.

This is an interesting proposal; I think the || idea looks
fairly natural. But we might want to use | in an alternative
table syntax.
* Re: Some thoughts for markdown syntax extensions
From: Tillmann Rendel @ 2010-11-07 10:16 UTC
To: pandoc-discuss

John MacFarlane wrote:
>> The suggestion is that
>>
>> * any inline text beginning with at least two doublequotes
>>   and extending to a matching number of doublequotes
>>
>>       "" ... ""  or  """ ... """  or  """" ... """"  or...
>>
>> be just passed through.
>
> I'm reluctant to add a feature like this. I think that pandoc should aim
> to produce valid X for any output format X. This feature would break that
> guarantee. The verbatim text would only make sense for one particular
> output format.
>
> A better solution, I think, would be to change the parser
> so that `<? ... ?>`, `<% ... %>`, `<& ... &>` and other standard
> template tags are recognized as raw HTML.

I agree that changing the parser is the better solution here, but what
about non-standard template tags? What about non-HTML target formats?

A syntax for "raw text" not to be processed by pandoc would offer some
extra flexibility: a (power) user could write a little preprocessor to
wrap non-standard tags in pandoc's "raw text" delimiters. For example,
such a preprocessor could convert `<? ... ?>` into `"""<? ... ?>"""`.
Pandoc can then be run as usual, and the overall pipeline will behave
like pandoc with a changed parser. This allows a user to extend pandoc's
parser without extending its Haskell source code.

By the way, with LaTeX output, pandoc can already be tricked into doing
this as follows:

    \newcommand{\ignoreThis}[1]{#1}
    ...
    \ignoreThis{text to be passed through}
    ...

Pandoc will wrap "text to be passed through" in an \ignoreThis call,
but at macro expansion time, TeX will expand it away. So in many
situations, this allows one to use TeX commands with non-standard syntax.

For example, the pgf/tikz package for drawing pictures supports the
following syntax:

    \tikz <textual description of a picture> .

So the textual description is terminated with a full stop. Pandoc's
parser would usually not realize that the textual description should be
passed as-is to the LaTeX output, but with the above hack, one can write

    \ignoreThis{\tikz <textual description of a picture> .}

instead, and it works. Or one can write a preprocessor which searches
for "\tikz ... ." and wraps it in an \ignoreThis call.

Adding support for "raw text" in pandoc would allow similar tricks in a
wider range of situations, including target formats other than LaTeX.

> I think that pandoc should aim to produce valid X for any output format X.

I guess there are different "usage scenarios" for pandoc. One scenario
is that one wants to generate different formats from a single source
file, and for that scenario, the described property is obviously
important.

But another scenario is that one needs, for some reason, to produce a
specific format, but doesn't want to actually write one's document in
that format. For example, I need to produce LaTeX documents for
scientific articles, but would rather write markdown, because it
fits better into my workflow. (I want to have a smooth transition
between programming, writing emails about the programs, and condensing
the emails into an article.) In that scenario, it is important that
pandoc supports all the bells and whistles of the target format one
happens to need.

Clearly, because of the first scenario, pandoc should not support all
these bells and whistles natively. Instead, I propose, it should be
easily extensible to support them. The pandoc API is already an
important tool in that regard, but syntax for "raw text" would be
another step in a good direction.

Tillmann
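The preprocessor idea from this message (wrapping template tags in the proposed raw-text delimiters before pandoc sees them) might look roughly like the Python sketch below. The raw-text syntax is only a proposal in this thread, so nothing here corresponds to an existing pandoc feature. The name `wrap_raw` and the list of tag styles are assumptions for illustration; the delimiter is made longer than any doublequote run inside the tag, following the rationale given in the original proposal.

```python
import re

# Tag styles mentioned in the thread; extend as needed (illustrative list).
TEMPLATE_TAG = re.compile(r"<\?.*?\?>|<%.*?%>|<&.*?&>", re.S)
QUOTE_RUN = re.compile(r'"+')


def wrap_raw(text: str) -> str:
    """Wrap each template tag in doublequote delimiters of sufficient length."""
    def wrap(match: re.Match) -> str:
        tag = match.group(0)
        # Longest run of doublequotes occurring inside the tag itself.
        longest = max((len(run) for run in QUOTE_RUN.findall(tag)), default=0)
        # At least three quotes (as in the message's example), and always
        # more than any run in the content so the closer is unambiguous.
        delimiter = '"' * max(3, longest + 1)
        return delimiter + tag + delimiter

    return TEMPLATE_TAG.sub(wrap, text)


if __name__ == "__main__":
    print(wrap_raw("Hello <? echo $name ?>, welcome."))
```

With a step like this in front of pandoc, supporting a new tag style means adding one alternative to the regular expression rather than changing the parser.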
* Re: Some thoughts for markdown syntax extensions
From: Nathan Gass @ 2010-11-07 10:59 UTC
To: pandoc-discuss

On 06.11.10 19:47, John MacFarlane wrote:
> +++ BP Jonsson [Nov 06 10 14:24 ]:
>> ## Verbatim text
>>
>> This way *any* kind of extended target markup can be used
>> without hardcoding specific styles.
>
> I'm reluctant to add a feature like this. I think that pandoc should aim
> to produce valid X for any output format X. This feature would break that
> guarantee. The verbatim text would only make sense for one particular
> output format.
>
> A better solution, I think, would be to change the parser
> so that `<? ... ?>`, `<% ... %>`, `<& ... &>` and other standard
> template tags are recognized as raw HTML.

I'm not personally interested in this topic, but wanted to toss in an
idea anyway: the new syntax just needs a way to select for which
writer(s) the verbatim block should be written. So something like:

    "":latex nonstandard verbatim latex""
    "":html,s5 nonstandard verbatim html""

This is longer than the proposed form, but it should be used as rarely
as possible anyway.

Another "solution" could be to give the intended output format in
metadata, if we add a syntax for metadata. This way it is clear that the
document only works in one output format, and you have the added benefit
of a shorter command line, as you don't need to declare your output
format.

By the way, I wonder if this syntax is a bit wasted on this feature, as
it is very easy to type. Of course, I think this because I don't
personally need this feature. Anyway, it's probably useful to think
about using a less practical character and keeping this syntax free for
some other feature.
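The writer-selector variant suggested here could be sketched as below. This is a hypothetical reading of the two example lines, not an existing pandoc feature: the selector is taken to be a colon followed by a comma-separated list of writer names right after the opening doublequotes, the body starts after one space or newline, and spans addressed to other writers are dropped entirely. The name `select_verbatim` is invented for this example.

```python
import re

# Opening run of 2+ doublequotes, ":" plus writer list, one separator,
# lazy body, then a closing run of exactly the same length.
SELECTOR_SPAN = re.compile(
    r'(?<!")("{2,}):([a-z0-9]+(?:,[a-z0-9]+)*)[ \n](.*?)(?<!")\1(?!")',
    re.S,
)


def select_verbatim(text: str, writer: str) -> str:
    """Keep verbatim spans addressed to `writer`; drop all other spans."""
    def choose(match: re.Match) -> str:
        writers = match.group(2).split(",")
        return match.group(3) if writer in writers else ""

    return SELECTOR_SPAN.sub(choose, text)


if __name__ == "__main__":
    source = '"":latex \\newpage"" and "":html,s5 <hr />""'
    for name in ("latex", "html", "rtf"):
        print(name, repr(select_verbatim(source, name)))
```

Dropping unselected spans is one way to address the concern about producing valid output for every format, since raw text then only reaches the writers it was written for.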

>> ## Small-caps
>>
>> A possible solution is that the markup for small-caps be
>> capital letters delimited by double pipe characters, and that
>> any letter inside which is preceded by a single unescaped pipe
>> character be rendered as a (big) capital:
>>
>>     If naturally descended |||IULIU |CAESARE|| would
>>     perhaps have become _Juil Cierre_ rather than
>>     _Jules César_ in French.
>
> This is an interesting proposal; I think the || idea looks
> fairly natural. But we might want to use | in an alternative
> table syntax.

IMHO, |Iuliu Caesare| is more readable to my eyes than
|||IULIU |CAESARE|| and not too far apart from the desired output. So
I'd rather simply have normal text between pipes render in small-caps
and leave big-caps between pipes as big-caps. I don't think the ugly
syntax is worth it just for being able to write everything in big-caps,
especially as they are visually not that close to small-caps.

Just my 2 cents.

Nathan
* Re: Some thoughts for markdown syntax extensions
From: Ivan Lazar Miljenovic @ 2010-11-06 21:45 UTC
To: pandoc-discuss

On 7 November 2010 00:24, BP Jonsson <bpjonsson-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> # Some thoughts for markdown syntax extensions

If we're discussing markdown extensions, I'd like to bring back up the
topic of syntax for comments in markdown.

Such an inclusion might make me actually go write that split pragma I've
been procrastinating about for a while (so that you can specify how to
split a single markdown document into multiple documents; that way you
can have an all-in-one textual help document and have it split into
multiple HTML pages).

-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
IvanMiljenovic.wordpress.com