From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from scc-mailout-kit-02.scc.kit.edu (scc-mailout-kit-02.scc.kit.edu [129.13.231.82])
	by fantadrom.bsd.lv (OpenSMTPD) with ESMTP id 76e69264
	for ; Thu, 22 Jun 2017 21:57:15 -0500 (EST)
Received: from asta-nat.asta.uni-karlsruhe.de ([172.22.63.82] helo=hekate.usta.de)
	by scc-mailout-kit-02.scc.kit.edu with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(envelope-from ) id 1dOEmX-0003iv-Ks; Fri, 23 Jun 2017 04:57:14 +0200
Received: from donnerwolke.usta.de ([172.24.96.3])
	by hekate.usta.de with esmtp (Exim 4.77)
	(envelope-from ) id 1dOEmW-0005i9-9u; Fri, 23 Jun 2017 04:57:12 +0200
Received: from athene.usta.de ([172.24.96.10])
	by donnerwolke.usta.de with esmtp (Exim 4.84_2)
	(envelope-from ) id 1dOEmW-0000B1-4J; Fri, 23 Jun 2017 04:57:12 +0200
Received: from localhost (athene.usta.de [local])
	by athene.usta.de (OpenSMTPD) with ESMTPA id cfe25b9d;
	Fri, 23 Jun 2017 04:57:12 +0200 (CEST)
Date: Fri, 23 Jun 2017 04:57:12 +0200
From: Ingo Schwarze
To: "Anthony J. Bentley"
Cc: tech@mdocml.bsd.lv
Subject: Re: MathML and <mi>, <mn>, <mo>
Message-ID: <20170623025712.GE77030@athene.usta.de>
References: <45090.1497945869@cathet.us>
X-Mailinglist: mdocml-tech
Reply-To: tech@mdocml.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <45090.1497945869@cathet.us>
User-Agent: Mutt/1.6.2 (2016-07-01)

Hi Anthony,

Anthony J. Bentley wrote on Tue, Jun 20, 2017 at 02:04:29AM -0600:

> Consider the quadratic formula:
>
> x={-b +- sqrt{b sup 2 - 4ac}} over 2a
>
> Wikipedia suggests it should be rendered in MathML like so (leaving
> out invisible operators):
>
> [MathML markup stripped from the archived message; surviving text: x = b ± b 2 4 a c 2 a]

After committing the patch appended below, mandoc now renders
as follows:

  [MathML markup stripped from the archived message; surviving text: x = - b ± b 2 4 ac 2 a]

The <mi>ac</mi> does not seem wrong.  If you write "ac", mandoc
cannot be sure whether this is a two-letter identifier (which is
correctly marked up above) or the product of two identifiers.
In this case, you should probably write "a c" (with a blank)
to make it clear that these are two identifiers, and then it
will render as <mi>a</mi><mi>c</mi>.

> - mandoc only uses <mi>, not <mn> or <mo>.

Fixed.

> - mandoc will transform a '-' into U+2212, but only when it's not
>   directly adjacent to a digit.

Open.

> - In Firefox, <mi> only seems to italicize single letters.

That is required by the MathML standard, see the description of <mi>.

> It looks like adjacent variables, numbers, and operators should be split:
> - 'x=' should become <mi>x</mi><mo>=</mo>

Done.

> - '-b' should become <mo>−</mo><mi>b</mi>

Done except U+2212.

> - '-4ac' should become <mo>−</mo><mn>4</mn><mi>a</mi><mi>c</mi>

I disagree: '4ac' is fine as it is, and '4a c' does become what
you ask for.

> The MathML standard says (MathML 3.0 2e # 3.2.33) that "sin" is
> appropriately marked up with <mi>. So <mi>sin</mi> should be enough to
> correctly render eqn's mathematical words. It seems that for
> non-mathematical words to be rendered with italics by default, they
> should be rendered with a <mi> per letter?

That would be possible, but it is not required, and it gives
strange results for multi-letter identifiers.

Yours,
  Ingo


Log Message:
-----------
Write text boxes as <mi>, <mn>, or <mo> as appropriate,
and write fontstyle or fontweight attributes where required.
Missing features reported by bentley@.

Modified Files:
--------------
    mdocml:
        eqn_html.c
        html.c
        html.h

Revision Data
-------------
Index: html.h
===================================================================
RCS file: /home/cvs/mdocml/mdocml/html.h,v
retrieving revision 1.85
retrieving revision 1.86
diff -Lhtml.h -Lhtml.h -u -p -r1.85 -r1.86
--- html.h
+++ html.h
@@ -51,6 +51,7 @@ enum htmltag {
 	TAG_MATH,
 	TAG_MROW,
 	TAG_MI,
+	TAG_MN,
 	TAG_MO,
 	TAG_MSUP,
 	TAG_MSUB,
Index: html.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/html.c,v
retrieving revision 1.214
retrieving revision 1.215
diff -Lhtml.c -Lhtml.c -u -p -r1.214 -r1.215
--- html.c
+++ html.c
@@ -87,6 +87,7 @@ static const struct htmldata htmltags[TA
 	{"math",	HTML_NLALL | HTML_INDENT},
 	{"mrow",	0},
 	{"mi",		0},
+	{"mn",		0},
 	{"mo",		0},
 	{"msup",	0},
 	{"msub",	0},
Index: eqn_html.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/eqn_html.c,v
retrieving revision 1.12
retrieving revision 1.13
diff -Leqn_html.c -Leqn_html.c -u -p -r1.12 -r1.13
--- eqn_html.c
+++ eqn_html.c
@@ -20,6 +20,7 @@
 #include <sys/types.h>
 
 #include <assert.h>
+#include <ctype.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
@@ -33,7 +34,10 @@ eqn_box(struct html *p, const struct eqn
 {
 	struct tag	*post, *row, *cell, *t;
 	const struct eqn_box *child, *parent;
+	const unsigned char *cp;
 	size_t		 i, j, rows;
+	enum htmltag	 tag;
+	enum eqn_fontt	 font;
 
 	if (NULL == bp)
 		return;
@@ -136,9 +140,51 @@ eqn_box(struct html *p, const struct eqn
 		print_otag(p, TAG_MTD, "");
 	}
 
-	if (NULL != bp->text) {
-		assert(NULL == post);
-		post = print_otag(p, TAG_MI, "");
+	if (bp->text != NULL) {
+		assert(post == NULL);
+		tag = TAG_MI;
+		cp = (unsigned char *)bp->text;
+		if (isdigit(cp[0]) || (cp[0] == '.' && isdigit(cp[1]))) {
+			tag = TAG_MN;
+			while (*++cp != '\0') {
+				if (*cp != '.' && !isdigit(*cp)) {
+					tag = TAG_MI;
+					break;
+				}
+			}
+		} else if (*cp != '\0' && isalpha(*cp) == 0) {
+			tag = TAG_MO;
+			while (*++cp != '\0') {
+				if (isalnum(*cp)) {
+					tag = TAG_MI;
+					break;
+				}
+			}
+		}
+		font = bp->font;
+		if (bp->text[0] != '\0' &&
+		    (((tag == TAG_MN || tag == TAG_MO) &&
+		      font == EQNFONT_ROMAN) ||
+		     (tag == TAG_MI && font == (bp->text[1] == '\0' ?
+		      EQNFONT_ITALIC : EQNFONT_ROMAN))))
+			font = EQNFONT_NONE;
+		switch (font) {
+		case EQNFONT_NONE:
+			post = print_otag(p, tag, "");
+			break;
+		case EQNFONT_ROMAN:
+			post = print_otag(p, tag, "?", "fontstyle", "normal");
+			break;
+		case EQNFONT_BOLD:
+		case EQNFONT_FAT:
+			post = print_otag(p, tag, "?", "fontweight", "bold");
+			break;
+		case EQNFONT_ITALIC:
+			post = print_otag(p, tag, "?", "fontstyle", "italic");
+			break;
+		default:
+			abort();
+		}
 		print_text(p, bp->text);
 	} else if (NULL == post) {
 		if (NULL != bp->left || NULL != bp->right)