tech@mandoc.bsd.lv
* Rendering of math symbols in ASCII mode
@ 2017-06-19  0:04 Anthony J. Bentley
  2017-06-19 17:06 ` Ingo Schwarze
  0 siblings, 1 reply; 4+ messages in thread
From: Anthony J. Bentley @ 2017-06-19  0:04 UTC (permalink / raw)
  To: tech

Hi,

This beauty shows up in erf(3):

.Bd -filled -offset indent
.if n \{\
erf(x) = 2/sqrt(pi)*\|integral from 0 to x of exp(\-t*t) dt. \}
.if t \{\
erf\|(x) :=
(2/\(sr pi)\|\(is\d\s8\z0\s10\u\u\s8x\s10\d\|exp(\-t\u\s82\s10\d)\|dt. \}
.Ed

My first instinct was to replace it with eqn(7), a pleasant way to write
non-trivial equations:

.EQ
roman erf ( x ) = 2 over {sqrt pi} int sub 0 sup x e sup {- t sup 2} dt
.EN

This looks great in groff -Tpdf and mandoc -Thtml, and is serviceable
enough in mandoc -Tutf8. Unfortunately, in ASCII it looks like this:

    erf ( x ) = 2/(sqrt(n)) I_0^x e^(- t^2) dt

As per mandoc_char(7), the integral symbol is replaced with I and pi
is replaced with n. Awful!

While I'm inclined to think that attempting to render equations in pure
ASCII is somewhat futile, the human-readable nature of eqn(7) suggests
the possibility of simply displaying the symbol name for certain symbols
like Greek letters and integrals.

    erf ( x ) = 2/(sqrt(pi)) int_0^x e^(- t^2) dt

I think it compares favorably with the current rendering:

    erf(x) = 2/sqrt(pi)*integral from 0 to x of exp(-t*t) dt

This would make it practical to use eqn(7) in certain libm manuals,
which I think could be a nice improvement. Viewing the math-heavy libGL
manuals on man.openbsd.org is a real pleasure.

Thoughts?

-- 
Anthony J. Bentley
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv


* Re: Rendering of math symbols in ASCII mode
  2017-06-19  0:04 Rendering of math symbols in ASCII mode Anthony J. Bentley
@ 2017-06-19 17:06 ` Ingo Schwarze
  2017-06-20  4:27   ` Anthony J. Bentley
  0 siblings, 1 reply; 4+ messages in thread
From: Ingo Schwarze @ 2017-06-19 17:06 UTC (permalink / raw)
  To: Anthony J. Bentley; +Cc: tech

Hi Anthony,

Anthony J. Bentley wrote on Sun, Jun 18, 2017 at 06:04:23PM -0600:

> This beauty shows up in erf(3):
> 
> .Bd -filled -offset indent
> .if n \{\
> erf(x) = 2/sqrt(pi)*\|integral from 0 to x of exp(\-t*t) dt. \}
> .if t \{\
> erf\|(x) :=
> (2/\(sr pi)\|\(is\d\s8\z0\s10\u\u\s8x\s10\d\|exp(\-t\u\s82\s10\d)\|dt. \}
> .Ed
> 
> My first instinct was to replace it with eqn(7),

I agree with that direction.  While eqn(7) is better avoided where
it can reasonably be avoided, it is the best solution for those
unusual cases where non-trivial mathematical formulae are essential
content of a manual page.

> a pleasant way to write non-trivial equations:
> 
> .EQ
> roman erf ( x ) = 2 over {sqrt pi} int sub 0 sup x e sup {- t sup 2} dt
> .EN

Why roman?  Aren't function names usually set italic in mathematical
papers?

> This looks great in groff -Tpdf and mandoc -Thtml, and is serviceable
> enough in mandoc -Tutf8. Unfortunately, in ASCII it looks like this:
> 
>     erf ( x ) = 2/(sqrt(n)) I_0^x e^(- t^2) dt
> 
> As per mandoc_char(7), the integral symbol is replaced with I and pi
> is replaced with n. Awful!

Awful indeed, but not our choice.

Look at this in /usr/local/share/groff/1.22.3/tmac/tty-char.tmac:

.\" These definitions are chosen so that, as far as possible, they:
.\" - work with all of -Tascii, -Tlatin1, -Tutf8, and -Tcp1047.
.\" - work on devices that display only the last overstruck character
.\"   as well as on devices that support overstriking
.\" - represent the character's graphical shape (not its meaning)

> While I'm inclined to think that attempting to render equations in pure
> ASCII is somewhat futile, the human-readable nature of eqn(7) suggests
> the possibility of simply displaying the symbol name for certain symbols
> like Greek letters and integrals.
> 
>     erf ( x ) = 2/(sqrt(pi)) int_0^x e^(- t^2) dt

I agree that is a clear improvement.

> This would make it practical to use eqn(7) in certain libm manuals,
> which I think could be a nice improvement. Viewing the math-heavy libGL
> manuals on man.openbsd.org is a real pleasure.
> 
> Thoughts?

Currently, words like "integral" are translated to character escape
sequences by the parser, independent of the output format.  Moving
that translation to the formatters in order to avoid it for -Tascii
would cause code duplication because several formatters need the
same translation: -Tutf8, -Tpdf, -Thtml...

Besides, eqn(7) advertises that "the character escape sequences
documented in mandoc_char(7) can be used, too", in addition to the
translated words, so moving the translation to the formatters would
only solve part of the problem and leave people out in the rain who,
legitimately, use \(is or \[integral] directly.

So i think the rendering of special characters in -Tascii needs to
be improved, but that requires building consensus with groff first,
hoping that they might be willing to move away from the "graphical
shape, not meaning" paradigm.
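
To illustrate the layering argument, here is a minimal Python sketch.
It is not mandoc code: the names EQN_WORDS, RENDER, parse and render,
and the contents of the tables, are assumptions made up for this
example.  It only shows that eqn(7) words and directly written escapes
end up as the same escape names, so changing one per-format table
entry would cover both.

```python
# Hypothetical sketch, not mandoc code: all names and tables are made up.

# Parser stage: eqn(7) words become character escape names,
# independent of the output format.
EQN_WORDS = {"int": "is", "pi": "*p"}

# Formatter stage: one rendering table per output format.
# The ASCII entries mirror the shape-based renderings quoted above.
RENDER = {
    "ascii": {"is": "I", "*p": "n"},
    "utf8": {"is": "\u222b", "*p": "\u03c0"},
}


def parse(tokens):
    """Turn eqn words into ("esc", name) nodes; keep other tokens as text."""
    return [("esc", EQN_WORDS[t]) if t in EQN_WORDS else ("text", t)
            for t in tokens]


def render(nodes, table):
    """Look up escape nodes in the given table; copy text nodes verbatim."""
    return "".join(table[v] if kind == "esc" else v for kind, v in nodes)


via_word = parse(["int", "(", "pi", ")"])   # author wrote eqn words
via_escape = [("esc", "is"), ("text", "x")]  # author wrote \(is directly

print(render(via_word, RENDER["ascii"]))     # I(n)

# Changing only the ASCII table affects both ways of writing the symbol.
improved = dict(RENDER["ascii"], **{"is": "<integral>", "*p": "<pi>"})
print(render(via_word, improved))            # <integral>(<pi>)
print(render(via_escape, improved))          # <integral>x
```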

Here are some of the unintelligible renderings:

	\(Ah	N	ALEF SYMBOL
	\(fa	V	FOR ALL
	\(pd	a	PARTIAL DIFFERENTIAL
	\(te	3	THERE EXISTS
	\(mo	E	ELEMENT OF
	\(st	-)	CONTAINS AS MEMBER
	\(sr	\/	SQUARE ROOT
	\(ca	(^)	INTERSECTION
	\(cu	U	UNION
	\(is	I	INTEGRAL
	\(sb	(=	SUBSET OF
	\(su	=)	SUPERSET OF
	\(ib	(_	SUBSET OF OR EQUAL TO
	\(ip	_)	SUPERSET OF OR EQUAL TO

I don't think any of these allow intelligible one- or two-character
ASCII renderings.

Multi-character descriptive renderings cause a problem with
spacing.  For example, rendering

  \[sr]2\[mo]R

as

  sqrt2element ofR

is not very readable.
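
The spacing problem can be reproduced with a small Python sketch.
The NAMES table and the render_naive function are hypothetical and
exist only for this illustration; they are not taken from mandoc or
groff.

```python
import re

# Hypothetical descriptive names; not the actual mandoc or groff tables.
NAMES = {"sr": "sqrt", "mo": "element of"}


def render_naive(text):
    """Replace each \\[xx] escape by its bare descriptive name."""
    return re.sub(r"\\\[(\w+)\]", lambda m: NAMES[m.group(1)], text)


# The names run into the neighbouring characters.
print(render_naive(r"\[sr]2\[mo]R"))  # sqrt2element ofR
```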

For some character escape sequences that render to nothing in
groff -Tascii, i decided to use descriptive renderings
enclosed in angle brackets in the past:

	\(sc	<sec>	SECTION SIGN
	\(de	<deg>	DEGREE SIGN
	\(ps	<par>	PILCROW SIGN
	\(CS	<lub>	BLACK CLUB SUIT

Do you consider that desirable for cases like the above, too?

	\(Ah	<Aleph>		ALEF SYMBOL
	\(fa	<for all>	FOR ALL
	\(te	<there exists>	THERE EXISTS
	\(mo	<element of>	ELEMENT OF
	\(sr	<sqrt>		SQUARE ROOT
	\(ca	<intersection>	INTERSECTION
	\(cu	<union>		UNION
	\(is	<integral>	INTEGRAL
	\(sb	<subset of>	SUBSET OF
	\(su	<superset of>	SUPERSET OF
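
As a rough sketch of how such a table would behave, the following
Python fragment applies a subset of the renderings listed above.
The function render_ascii and the regular expression are assumptions
for illustration only, not the way mandoc implements character
lookup.

```python
import re

# Subset of the table proposed above; the code around it is hypothetical.
DESCRIPTIVE = {
    "fa": "<for all>",
    "te": "<there exists>",
    "mo": "<element of>",
    "sr": "<sqrt>",
    "is": "<integral>",
}

# Matches the two-character form \(xx and the bracketed form \[name].
ESCAPE = re.compile(r"\\(?:\((..)|\[([^\]]+)\])")


def render_ascii(text):
    """Replace known escapes by their angle-bracket renderings."""
    def repl(m):
        name = m.group(1) or m.group(2)
        # Unknown escapes are left untouched in this sketch.
        return DESCRIPTIVE.get(name, m.group(0))
    return ESCAPE.sub(repl, text)


# The brackets delimit each name from the adjacent characters.
print(render_ascii(r"\[sr]2\[mo]R"))  # <sqrt>2<element of>R
```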

There is a similar problem with small Greek letters.

  e^iomegat

is not very readable.  Or is it good enough?  Is

  e^i<omega>t 

better?  If so, what about pi?

  cos(2pi) = 1

is obviously fine, but

  exp(2pii) = 1

would arguably be more readable as

  exp(2<pi>i) = 1

It might be a good idea to form consensus here before approaching
the groff community with patches.

Yours,
  Ingo

* Re: Rendering of math symbols in ASCII mode
  2017-06-19 17:06 ` Ingo Schwarze
@ 2017-06-20  4:27   ` Anthony J. Bentley
  2017-06-20 16:44     ` Ingo Schwarze
  0 siblings, 1 reply; 4+ messages in thread
From: Anthony J. Bentley @ 2017-06-20  4:27 UTC (permalink / raw)
  To: Ingo Schwarze; +Cc: tech

Hi Ingo,

Thanks for the thoughts.

Ingo Schwarze writes:
> > .EQ
> > roman erf ( x ) = 2 over {sqrt pi} int sub 0 sup x e sup {- t sup 2} dt
> > .EN
> 
> Why roman?  Aren't function names usually set italic in mathematical
> papers?

My impression is that single-letter functions are italic but words are
roman. I've formatted that way since reading this in the TeXbook several
years ago:

"The names of algebraic variables are usually italic or Greek letters,
but common mathematical functions like 'log' are always set in roman
type."

Reading it again, I'm not sure what qualifies as a "common" function.
Briefly browsing the dusty TAOCP from my shelf just now, I encountered
several italicized single-character function names, several romanized
multi-character function names, and no contrary examples.

> Currently, words like "integral" are translated to character escape
> sequences by the parser, independent of the output format.  Moving
> that translation to the formatters in order to avoid it for -Tascii
> would cause code duplication because several formatters need the
> same translation: -Tutf8, -Tpdf, -Thtml...

Indeed, that would be ugly.

> Multi-character descriptive renderings cause a problem with
> spacing.  For example, rendering
> 
>   \[sr]2\[mo]R
> 
> as
> 
>   sqrt2element ofR
> 
> is not very readable.

Nitpick: \(sr is not the right way to do square roots in eqn(7).
The sqrt keyword will wrap in parentheses. (But I get your point.)

> For some character escape sequences that render to nothing in
> groff -Tascii, i decided to use descriptive renderings
> enclosed in angle brackets in the past:
> 
> 	\(sc	<sec>	SECTION SIGN
> 	\(de	<deg>	DEGREE SIGN
> 	\(ps	<par>	PILCROW SIGN
> 	\(CS	<lub>	BLACK CLUB SUIT
> 
> Do you consider that desirable for cases like the above, too?

Indeed, I think this is a sensible and readable way to go for -Tascii,
particularly for characters likely to be used in eqn(7).

-- 
Anthony J. Bentley

* Re: Rendering of math symbols in ASCII mode
  2017-06-20  4:27   ` Anthony J. Bentley
@ 2017-06-20 16:44     ` Ingo Schwarze
  0 siblings, 0 replies; 4+ messages in thread
From: Ingo Schwarze @ 2017-06-20 16:44 UTC (permalink / raw)
  To: Anthony J. Bentley; +Cc: tech

Hi Anthony,

Anthony J. Bentley wrote on Mon, Jun 19, 2017 at 10:27:00PM -0600:
> Ingo Schwarze writes:
>> Anthony Bentley wrote:

>>> .EQ
>>> roman erf ( x ) = 2 over {sqrt pi} int sub 0 sup x e sup {- t sup 2} dt
>>> .EN

>> Why roman?  Aren't function names usually set italic in mathematical
>> papers?

> My impression is that single-letter functions are italic but words
> are roman.

I stand corrected.  Actually, blowing the dust off my own diploma
thesis, i found several instances of arctan, exp, ln, log - all
roman.  Apparently, it has been too long since i looked at such
typesetting...

> Nitpick: \(sr is not the right way to do square roots in eqn(7).
> The sqrt keyword will wrap in parentheses.

I fully agree with that.

> (But I get your point.)

Once i came to the intermediate conclusion that we had better work on
the level of character escape sequences rather than on the level of
eqn(7) syntax, i considered all escapes, no matter whether they are
needed in the context of eqn(7).

>> For some character escape sequences that render to nothing in
>> groff -Tascii, i decided to use descriptive renderings
>> enclosed in angle brackets in the past:
>> 
>> 	\(sc	<sec>	SECTION SIGN
>> 	\(de	<deg>	DEGREE SIGN
>> 	\(ps	<par>	PILCROW SIGN
>> 	\(CS	<lub>	BLACK CLUB SUIT
>> 
>> Do you consider that desirable for cases like the above, too?

> Indeed, I think this is a sensible and readable way to go for -Tascii,
> particularly for characters likely to be used in eqn(7).

Following that, my inquiry on <groff@gnu.org> already elicited
positive feedback from Werner Lemberg.  So i'm quite hopeful
that we can get this done and make eqn(7) useful for your
intended purpose.

Yours,
  Ingo
