From: Ingo Schwarze <schwarze@usta.de>
To: "Anthony J. Bentley" <anthony@anjbe.name>
Cc: tech@mdocml.bsd.lv
Subject: Re: MathML and <mo>, <mi>, <mn>
Date: Wed, 21 Jun 2017 22:59:53 +0200
Message-ID: <20170621205953.GC51095@athene.usta.de>
In-Reply-To: <45090.1497945869@cathet.us>

Hi Anthony,

Anthony J. Bentley wrote on Tue, Jun 20, 2017 at 02:04:29AM -0600:

> - mandoc only uses <mi>, not <mo> or <mn>.
> - mandoc will transform a '-' into U+2212, but only when it's not
>   directly adjacent to a digit.

These are still open.

> - In Firefox, <mi> only seems to italicize single letters.
> 
> It looks like adjacent variables, numbers, and operators should be split:
>     - 'x=' should become <mi>x</mi><mo>=</mo>
>     - '-b' should become <mo>&#8722;</mo><mi>b</mi>
>     - '-4ac' should become <mo>&#8722;</mo><mn>4</mn><mi>a</mi><mi>c</mi>
> 
> The MathML standard says (MathML 3.0 2e # 3.2.33) that "sin" is
> appropriately marked up with <mi>. So <mi>sin</mi> should be enough to
> correctly render eqn's mathematical words. It seems that for
> non-mathematical words to be rendered with italics by default, they
> should be rendered with a <mi> per letter?

The following commit implements the parser-side changes needed to fix
that.  Some formatter-side parts are still open.
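
To make the target markup from the quoted examples concrete, here is a
minimal sketch of the token classification they imply.  It is only an
illustration, not mandoc's HTML formatter: the function name
mathml_tokens() is hypothetical, the input is assumed to be plain ASCII
without whitespace or roff escapes, and characters like '<' and '&'
are not escaped.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical illustration, not part of mandoc: write MathML token
 * markup for a plain ASCII string into buf.  Each letter becomes its
 * own <mi>, each run of digits one <mn>, and anything else one <mo>;
 * an ASCII '-' is emitted as U+2212 MINUS SIGN.
 * Returns 0 on success, -1 if buf is too small.
 */
int
mathml_tokens(const char *s, char *buf, size_t bufsz)
{
	size_t	 used = 0;
	size_t	 len;
	int	 n;

	if (bufsz == 0)
		return -1;
	buf[0] = '\0';
	while (*s != '\0') {
		if (isalpha((unsigned char)*s)) {
			/* One identifier element per letter. */
			n = snprintf(buf + used, bufsz - used,
			    "<mi>%c</mi>", *s);
			s++;
		} else if (isdigit((unsigned char)*s)) {
			/* Keep adjacent digits together as one number. */
			len = strspn(s, "0123456789");
			n = snprintf(buf + used, bufsz - used,
			    "<mn>%.*s</mn>", (int)len, s);
			s += len;
		} else if (*s == '-') {
			n = snprintf(buf + used, bufsz - used,
			    "<mo>&#8722;</mo>");
			s++;
		} else {
			n = snprintf(buf + used, bufsz - used,
			    "<mo>%c</mo>", *s);
			s++;
		}
		/* Stop on an encoding error or on truncated output. */
		if (n < 0 || (size_t)n >= bufsz - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

Under those assumptions, calling mathml_tokens() on the three inputs
from the quoted list yields the markup suggested there.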

Thanks for the analysis,
  Ingo


Log Message:
-----------
Outside explicit font context, give every letter its own box.
The formatters need this to correctly select fonts.
Missing feature reported by bentley@.

Modified Files:
--------------
    mdocml:
        eqn.c

Revision Data
-------------
Index: eqn.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/eqn.c,v
retrieving revision 1.65
retrieving revision 1.66
diff -Leqn.c -Leqn.c -u -p -r1.65 -r1.66
--- eqn.c
+++ eqn.c
@@ -20,6 +20,7 @@
 #include <sys/types.h>
 
 #include <assert.h>
+#include <ctype.h>
 #include <limits.h>
 #include <stdio.h>
 #include <stdlib.h>
@@ -718,8 +719,8 @@ static enum rofferr
 eqn_parse(struct eqn_node *ep, struct eqn_box *parent)
 {
 	char		 sym[64];
-	struct eqn_box	*cur;
-	const char	*start;
+	struct eqn_box	*cur, *fontp, *nbox;
+	const char	*cp, *cpn, *start;
 	char		*p;
 	size_t		 sz;
 	enum eqn_tok	 tok, subtok;
@@ -1092,21 +1093,51 @@ this_tok:
 		 */
 		while (parent->args == parent->expectargs)
 			parent = parent->parent;
-		if (tok == EQN_TOK_FUNC) {
-			for (cur = parent; cur != NULL; cur = cur->parent)
-				if (cur->font != EQNFONT_NONE)
-					break;
-			if (cur == NULL || cur->font != EQNFONT_ROMAN) {
-				parent = eqn_box_alloc(ep, parent);
-				parent->type = EQN_LISTONE;
-				parent->font = EQNFONT_ROMAN;
-				parent->expectargs = 1;
-			}
+		/*
+		 * Wrap well-known function names in a roman box,
+		 * unless they already are in roman context.
+		 */
+		for (fontp = parent; fontp != NULL; fontp = fontp->parent)
+			if (fontp->font != EQNFONT_NONE)
+				break;
+		if (tok == EQN_TOK_FUNC &&
+		    (fontp == NULL || fontp->font != EQNFONT_ROMAN)) {
+			parent = fontp = eqn_box_alloc(ep, parent);
+			parent->type = EQN_LISTONE;
+			parent->font = EQNFONT_ROMAN;
+			parent->expectargs = 1;
 		}
 		cur = eqn_box_alloc(ep, parent);
 		cur->type = EQN_TEXT;
 		cur->text = p;
-
+		/*
+		 * If not inside any explicit font context,
+		 * give every letter its own box.
+		 */
+		if (fontp == NULL && *p != '\0') {
+			cp = p;
+			for (;;) {
+				cpn = cp + 1;
+				if (*cp == '\\')
+					mandoc_escape(&cpn, NULL, NULL);
+				if (*cpn == '\0')
+					break;
+				if (isalpha((unsigned char)*cp) == 0 &&
+				    isalpha((unsigned char)*cpn) == 0) {
+					cp = cpn;
+					continue;
+				}
+				nbox = eqn_box_alloc(ep, parent);
+				nbox->type = EQN_TEXT;
+				nbox->text = mandoc_strdup(cpn);
+				p = mandoc_strndup(cur->text,
+				    cpn - cur->text);
+				free(cur->text);
+				cur->text = p;
+				cur = nbox;
+				cp = nbox->text;
+			}
+		}
 		/*
 		 * Post-process list status.
 		 */
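
The splitting rule in the new loop above can be tried in isolation.
The following is a simplified sketch, not the committed code: the
function split_letters() is hypothetical, it copies pieces into a
caller-supplied array instead of allocating eqn boxes, and it ignores
roff escape sequences, which the real loop skips with mandoc_escape().
As in the diff, a boundary is placed between two adjacent characters
whenever at least one of them is a letter.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of mandoc: split text so that every
 * ASCII letter stands alone while runs of non-letters stay together.
 * Roff escape sequences are NOT handled here.
 * Stores at most max malloc'd pieces in out[] and returns how many
 * were stored; the caller frees them.  Input beyond max pieces is
 * dropped.
 */
size_t
split_letters(const char *text, char **out, size_t max)
{
	const char	*start, *cp, *next;
	size_t		 len, n = 0;

	start = cp = text;
	while (*cp != '\0' && n < max) {
		next = cp + 1;
		/*
		 * Cut after cp at the end of the input, or when
		 * either neighbour is a letter.
		 */
		if (*next == '\0' ||
		    isalpha((unsigned char)*cp) ||
		    isalpha((unsigned char)*next)) {
			len = (size_t)(next - start);
			out[n] = malloc(len + 1);
			if (out[n] == NULL)
				break;
			memcpy(out[n], start, len);
			out[n][len] = '\0';
			n++;
			start = next;
		}
		cp = next;
	}
	return n;
}
```

With this sketch, "-4ac" splits into "-4", "a" and "c", and "x=" into
"x" and "=".  A word like "sin" would also split into three letters
here; in the parser it does not, because EQN_TOK_FUNC tokens are first
wrapped in a roman box and therefore are inside an explicit font
context.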