From: Ingo Schwarze <schwarze@usta.de>
To: "Anthony J. Bentley" <anthony@anjbe.name>
Cc: tech@mdocml.bsd.lv
Subject: Re: MathML and <mo>, <mi>, <mn>
Date: Fri, 23 Jun 2017 04:57:12 +0200
Message-ID: <20170623025712.GE77030@athene.usta.de>
In-Reply-To: <45090.1497945869@cathet.us>
Hi Anthony,
Anthony J. Bentley wrote on Tue, Jun 20, 2017 at 02:04:29AM -0600:
> Consider the quadratic formula:
>
> x={-b +- sqrt{b sup 2 - 4ac}} over 2a
>
> Wikipedia suggests it should be rendered in MathML like so (leaving
> out invisible operators):
>
> <mrow>
>   <mi>x</mi>
>   <mo>=</mo>
>   <mfrac>
>     <mrow>
>       <mo>−</mo>
>       <mi>b</mi>
>       <mo>±</mo>
>       <msqrt>
>         <msup>
>           <mi>b</mi>
>           <mn>2</mn>
>         </msup>
>         <mo>−</mo>
>         <mn>4</mn>
>         <mi>a</mi>
>         <mi>c</mi>
>       </msqrt>
>     </mrow>
>     <mrow>
>       <mn>2</mn>
>       <mi>a</mi>
>     </mrow>
>   </mfrac>
> </mrow>
With the patch appended below committed, mandoc now renders
this as follows:
<mrow>
  <mi>x</mi>  <!-- new identifier/operator splitting -->
  <mo>=</mo>  <!-- new operator element -->
  <mfrac>
    <mrow>
      <mo>-</mo>  <!-- XXX still no U+2212 -->
      <mi>b</mi>
      <mo>±</mo>
      <msqrt>
        <mrow>  <!-- XXX no detection of needless rows yet -->
          <msup>
            <mi>b</mi>
            <mn>2</mn>  <!-- new number element -->
          </msup>
          <mi>−</mi>  <!-- XXX no non-ASCII operator detection -->
          <mn>4</mn>
          <mi fontstyle="italic">ac</mi>  <!-- SEE BELOW -->
        </mrow>
      </msqrt>
    </mrow>
    <mn>2</mn>  <!-- XXX oops, do we need a row here? -->
    <mi>a</mi>
  </mfrac>
</mrow>
The <mi fontstyle="italic">ac</mi> does not seem wrong.
If you write "ac", mandoc cannot be sure whether this is
a two-letter identifier (which is correctly marked up above)
or the product of two identifiers.
In this case, you should probably write "a c" (with a blank)
to make it clear that these are two identifiers, and then it
will render as <mi>a</mi><mi>c</mi>.
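To make the distinction concrete, here is a minimal stand-alone sketch
of the token classification rule used by the patch appended below.
The names mtag, M_MI, M_MN, M_MO, and classify() are made up for this
illustration; the real code works on struct eqn_box and enum htmltag
inside eqn_box(), so treat this as an approximation, not as the
committed implementation.

```c
#include <ctype.h>

/* Hypothetical stand-ins for TAG_MI, TAG_MN, and TAG_MO. */
enum mtag { M_MI, M_MN, M_MO };

/*
 * Classify one text token:
 * digits and dots only -> number, no letters or digits at all ->
 * operator, anything else (including the empty string) -> identifier.
 */
static enum mtag
classify(const char *text)
{
	const unsigned char *cp = (const unsigned char *)text;
	enum mtag tag = M_MI;

	if (isdigit(cp[0]) || (cp[0] == '.' && isdigit(cp[1]))) {
		tag = M_MN;
		while (*++cp != '\0')
			if (*cp != '.' && !isdigit(*cp))
				return M_MI;	/* not purely numeric */
	} else if (*cp != '\0' && !isalpha(*cp)) {
		tag = M_MO;
		while (*++cp != '\0')
			if (isalnum(*cp))
				return M_MI;	/* letters or digits follow */
	}
	return tag;
}
```

With this rule, "ac" is one identifier because nothing in the token
itself says where a product would split, while "a" and "c" passed
as separate tokens are classified as one identifier each.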
> - mandoc only uses <mi>, not <mo> or <mn>.
Fixed.
> - mandoc will transform a '-' into U+2212, but only when it's not
> directly adjacent to a digit.
Open.
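One conceivable way to close this, sketched here purely as an
illustration: once a token is known to be an operator, a lone ASCII
hyphen-minus could be replaced by U+2212 regardless of what follows.
The helper operator_text() is hypothetical and not part of mandoc,
and it emits raw UTF-8, which is an assumption about the output
encoding.

```c
#include <string.h>

/*
 * Hypothetical helper, not mandoc code: return the text to print
 * for an operator token, replacing a lone ASCII hyphen-minus by
 * U+2212 MINUS SIGN encoded as UTF-8.
 */
static const char *
operator_text(const char *op)
{
	if (strcmp(op, "-") == 0)
		return "\xe2\x88\x92";	/* U+2212 in UTF-8 */
	return op;			/* all other operators unchanged */
}
```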
> - In Firefox, <mi> only seems to italicize single letters.
That is required by the MathML standard; see the description of <mi>.
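Because of that default, the patch appended below only writes a font
attribute when the requested font differs from what the element would
get anyway. A minimal sketch of that decision follows; the names
mtag, mfont, and font_attr() are invented for this example, and the
EQNFONT_FAT case of the real code (which also maps to bold) is left
out for brevity.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the tag and eqn font enums. */
enum mtag { M_MI, M_MN, M_MO };
enum mfont { F_NONE, F_ROMAN, F_BOLD, F_ITALIC };

/*
 * Return the attribute to write on the element, or NULL if the
 * requested font is already the default: upright for <mn> and <mo>,
 * italic for a one-byte <mi>, upright for a longer <mi>.
 */
static const char *
font_attr(enum mtag tag, enum mfont font, const char *text)
{
	if (text[0] != '\0' &&
	    (((tag == M_MN || tag == M_MO) && font == F_ROMAN) ||
	     (tag == M_MI &&
	      font == (text[1] == '\0' ? F_ITALIC : F_ROMAN))))
		font = F_NONE;

	switch (font) {
	case F_ROMAN:
		return "fontstyle=\"normal\"";
	case F_BOLD:
		return "fontweight=\"bold\"";
	case F_ITALIC:
		return "fontstyle=\"italic\"";
	default:
		return NULL;
	}
}
```

An italic "a" needs no attribute, while an italic "ac" gets
fontstyle="italic", which matches the output shown above.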
> It looks like adjacent variables, numbers, and operators should be split:
> - 'x=' should become <mi>x</mi><mo>=</mo>
Done.
> - '-b' should become <mo>−</mo><mi>b</mi>
Done except U+2212.
> - '-4ac' should become <mo>−</mo><mn>4</mn><mi>a</mi><mi>c</mi>
I disagree: '4ac' is fine as it is, and '4a c' does become
what you ask for.
> The MathML standard says (MathML 3.0 2e # 3.2.33) that "sin" is
> appropriately marked up with <mi>. So <mi>sin</mi> should be enough to
> correctly render eqn's mathematical words. It seems that for
> non-mathematical words to be rendered with italics by default, they
> should be rendered with a <mi> per letter?
That would be possible, but it is not required, and it gives
strange results for multi-letter identifiers.
Yours,
Ingo
Log Message:
-----------
Write text boxes as <mi>, <mn>, or <mo> as appropriate,
and write fontstyle or fontweight attributes where required.
Missing features reported by bentley@.
Modified Files:
--------------
mdocml:
eqn_html.c
html.c
html.h
Revision Data
-------------
Index: html.h
===================================================================
RCS file: /home/cvs/mdocml/mdocml/html.h,v
retrieving revision 1.85
retrieving revision 1.86
diff -Lhtml.h -Lhtml.h -u -p -r1.85 -r1.86
--- html.h
+++ html.h
@@ -51,6 +51,7 @@ enum htmltag {
 	TAG_MATH,
 	TAG_MROW,
 	TAG_MI,
+	TAG_MN,
 	TAG_MO,
 	TAG_MSUP,
 	TAG_MSUB,
Index: html.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/html.c,v
retrieving revision 1.214
retrieving revision 1.215
diff -Lhtml.c -Lhtml.c -u -p -r1.214 -r1.215
--- html.c
+++ html.c
@@ -87,6 +87,7 @@ static const struct htmldata htmltags[TA
 	{"math", HTML_NLALL | HTML_INDENT},
 	{"mrow", 0},
 	{"mi", 0},
+	{"mn", 0},
 	{"mo", 0},
 	{"msup", 0},
 	{"msub", 0},
Index: eqn_html.c
===================================================================
RCS file: /home/cvs/mdocml/mdocml/eqn_html.c,v
retrieving revision 1.12
retrieving revision 1.13
diff -Leqn_html.c -Leqn_html.c -u -p -r1.12 -r1.13
--- eqn_html.c
+++ eqn_html.c
@@ -20,6 +20,7 @@
 #include <sys/types.h>
 
 #include <assert.h>
+#include <ctype.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
@@ -33,7 +34,10 @@ eqn_box(struct html *p, const struct eqn
 {
 	struct tag *post, *row, *cell, *t;
 	const struct eqn_box *child, *parent;
+	const unsigned char *cp;
 	size_t i, j, rows;
+	enum htmltag tag;
+	enum eqn_fontt font;
 
 	if (NULL == bp)
 		return;
@@ -136,9 +140,51 @@ eqn_box(struct html *p, const struct eqn
 		print_otag(p, TAG_MTD, "");
 	}
 
-	if (NULL != bp->text) {
-		assert(NULL == post);
-		post = print_otag(p, TAG_MI, "");
+	if (bp->text != NULL) {
+		assert(post == NULL);
+		tag = TAG_MI;
+		cp = (unsigned char *)bp->text;
+		if (isdigit(cp[0]) || (cp[0] == '.' && isdigit(cp[1]))) {
+			tag = TAG_MN;
+			while (*++cp != '\0') {
+				if (*cp != '.' && !isdigit(*cp)) {
+					tag = TAG_MI;
+					break;
+				}
+			}
+		} else if (*cp != '\0' && isalpha(*cp) == 0) {
+			tag = TAG_MO;
+			while (*++cp != '\0') {
+				if (isalnum(*cp)) {
+					tag = TAG_MI;
+					break;
+				}
+			}
+		}
+		font = bp->font;
+		if (bp->text[0] != '\0' &&
+		    (((tag == TAG_MN || tag == TAG_MO) &&
+		      font == EQNFONT_ROMAN) ||
+		     (tag == TAG_MI && font == (bp->text[1] == '\0' ?
+		      EQNFONT_ITALIC : EQNFONT_ROMAN))))
+			font = EQNFONT_NONE;
+		switch (font) {
+		case EQNFONT_NONE:
+			post = print_otag(p, tag, "");
+			break;
+		case EQNFONT_ROMAN:
+			post = print_otag(p, tag, "?", "fontstyle", "normal");
+			break;
+		case EQNFONT_BOLD:
+		case EQNFONT_FAT:
+			post = print_otag(p, tag, "?", "fontweight", "bold");
+			break;
+		case EQNFONT_ITALIC:
+			post = print_otag(p, tag, "?", "fontstyle", "italic");
+			break;
+		default:
+			abort();
+		}
 		print_text(p, bp->text);
 	} else if (NULL == post) {
 		if (NULL != bp->left || NULL != bp->right)