From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from scc-mailout-kit-02.scc.kit.edu (scc-mailout-kit-02.scc.kit.edu [129.13.231.82])
	by fantadrom.bsd.lv (OpenSMTPD) with ESMTP id a2b68c67
	for ; Wed, 21 Jun 2017 15:59:55 -0500 (EST)
Received: from asta-nat.asta.uni-karlsruhe.de ([172.22.63.82] helo=hekate.usta.de)
	by scc-mailout-kit-02.scc.kit.edu with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(envelope-from ) id 1dNmjC-0003AM-4j; Wed, 21 Jun 2017 22:59:55 +0200
Received: from donnerwolke.usta.de ([172.24.96.3])
	by hekate.usta.de with esmtp (Exim 4.77)
	(envelope-from ) id 1dNmjC-00081V-0E; Wed, 21 Jun 2017 22:59:54 +0200
Received: from athene.usta.de ([172.24.96.10])
	by donnerwolke.usta.de with esmtp (Exim 4.84_2)
	(envelope-from ) id 1dNmjB-0005KH-Ri; Wed, 21 Jun 2017 22:59:53 +0200
Received: from localhost (athene.usta.de [local])
	by athene.usta.de (OpenSMTPD) with ESMTPA id 778d18c7;
	Wed, 21 Jun 2017 22:59:53 +0200 (CEST)
Date: Wed, 21 Jun 2017 22:59:53 +0200
From: Ingo Schwarze
To: "Anthony J. Bentley"
Cc: tech@mdocml.bsd.lv
Subject: Re: MathML and , ,
Message-ID: <20170621205953.GC51095@athene.usta.de>
References: <45090.1497945869@cathet.us>
X-Mailinglist: mdocml-tech
Reply-To: tech@mdocml.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <45090.1497945869@cathet.us>
User-Agent: Mutt/1.6.2 (2016-07-01)

Hi Anthony,

Anthony J. Bentley wrote on Tue, Jun 20, 2017 at 02:04:29AM -0600:

> - mandoc only uses , not or .
> - mandoc will transform a '-' into U+2212, but only when it's not
>   directly adjacent to a digit.

These are still open.

> - In Firefox, only seems to italicize single letters.
>
> It looks like adjacent variables, numbers, and operators should be split:
> - 'x=' should become x=
> - '-b' should become b
> - '-4ac' should become 4ac
>
> The MathML standard says (MathML 3.0 2e # 3.2.33) that "sin" is
> appropriately marked up with .
> So sin should be enough to
> correctly render eqn's mathematical words. It seems that for
> non-mathematical words to be rendered with italics by default, they
> should be rendered with a per letter?

The following commit implements the parser side parts needed to fix
that.  Some formatter parts are still open.

Thanks for the analysis,
  Ingo


Log Message:
-----------
Outside explicit font context, give every letter its own box.
The formatters need this to correctly select fonts.
Missing feature reported by bentley@.

Modified Files:
--------------
    mdocml:
        eqn.c

Revision Data
-------------
Index: eqn.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/eqn.c,v
retrieving revision 1.65
retrieving revision 1.66
diff -Leqn.c -Leqn.c -u -p -r1.65 -r1.66
--- eqn.c
+++ eqn.c
@@ -20,6 +20,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -718,8 +719,8 @@ static enum rofferr
 eqn_parse(struct eqn_node *ep, struct eqn_box *parent)
 {
 	char		 sym[64];
-	struct eqn_box	*cur;
-	const char	*start;
+	struct eqn_box	*cur, *fontp, *nbox;
+	const char	*cp, *cpn, *start;
 	char		*p;
 	size_t		 sz;
 	enum eqn_tok	 tok, subtok;
@@ -1092,21 +1093,51 @@ this_tok:
 		 */
 		while (parent->args == parent->expectargs)
 			parent = parent->parent;
-		if (tok == EQN_TOK_FUNC) {
-			for (cur = parent; cur != NULL; cur = cur->parent)
-				if (cur->font != EQNFONT_NONE)
-					break;
-			if (cur == NULL || cur->font != EQNFONT_ROMAN) {
-				parent = eqn_box_alloc(ep, parent);
-				parent->type = EQN_LISTONE;
-				parent->font = EQNFONT_ROMAN;
-				parent->expectargs = 1;
-			}
+		/*
+		 * Wrap well-known function names in a roman box,
+		 * unless they already are in roman context.
+		 */
+		for (fontp = parent; fontp != NULL; fontp = fontp->parent)
+			if (fontp->font != EQNFONT_NONE)
+				break;
+		if (tok == EQN_TOK_FUNC &&
+		    (fontp == NULL || fontp->font != EQNFONT_ROMAN)) {
+			parent = fontp = eqn_box_alloc(ep, parent);
+			parent->type = EQN_LISTONE;
+			parent->font = EQNFONT_ROMAN;
+			parent->expectargs = 1;
 		}
 		cur = eqn_box_alloc(ep, parent);
 		cur->type = EQN_TEXT;
 		cur->text = p;
-
+		/*
+		 * If not inside any explicit font context,
+		 * give every letter its own box.
+		 */
+		if (fontp == NULL && *p != '\0') {
+			cp = p;
+			for (;;) {
+				cpn = cp + 1;
+				if (*cp == '\\')
+					mandoc_escape(&cpn, NULL, NULL);
+				if (*cpn == '\0')
+					break;
+				if (isalpha((unsigned char)*cp) == 0 &&
+				    isalpha((unsigned char)*cpn) == 0) {
+					cp = cpn;
+					continue;
+				}
+				nbox = eqn_box_alloc(ep, parent);
+				nbox->type = EQN_TEXT;
+				nbox->text = mandoc_strdup(cpn);
+				p = mandoc_strndup(cur->text,
+				    cpn - cur->text);
+				free(cur->text);
+				cur->text = p;
+				cur = nbox;
+				cp = nbox->text;
+			}
+		}
 		/*
 		 * Post-process list status.
 		 */
--
To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv