tech@mandoc.bsd.lv
* mdoc(7): improve description of .Em and .Sy
@ 2014-08-13 21:22 Ingo Schwarze
  2014-08-13 22:54 ` Guy Harris
       [not found] ` <20140814065720.GB7407@harkle.home.gateway>
  0 siblings, 2 replies; 5+ messages in thread
From: Ingo Schwarze @ 2014-08-13 21:22 UTC (permalink / raw)
  To: jmc; +Cc: tech

Hi,

when working on mandoc -Thtml, we noticed that the descriptions
of .Em and .Sy in the mdoc(7) manual are of dubious quality.

There are four problems with what we have:

 1) For both .Em and .Sy, mdoc(7) says they should not be used
    to mark up technical terms.  This matches neither historical
    nor current practice, nor is it possible to follow that advice.
    Situations regularly arise where markup for technical terms
    is needed but none of the semantic macros fit.

 2) In these situations, mdoc(7) provides no advice whatsoever
    about which macro to use.

 3) The examples given for .Em blatantly contradict the explanation
    given right above.  While the explanation talks about emphasis,
    the examples are typical for importance; see for example
    https://developer.mozilla.org/en-US/docs/Web/HTML/Element/em
    https://developer.mozilla.org/en-US/docs/Web/HTML/Element/strong
    for the distinction.  Also, the examples do not agree with
    current practice, neither in typography in general nor in our
    manuals, most of which correctly use .Sy for words like "Note:"
    and "Warning!"

 4) .Sy lacks any examples whatsoever.

To fix this, i dug up all existing versions of historical mdoc(7)
manuals to understand what these macros were originally intended
to be.

4.3BSD-Reno (1990) has in mdoc.samples(7):

  Symbolic
  The symbolic request is really a boldface request.  The need for
  this macro has not been established, it is included 'just in case'.
  Usage: .Sy symbol ...
  Example output: something bold

  Emphasis Request
  A portion of text may be stressed or emphasized with the .Em request.
  The font used is commonly italic.
  Usage: .Em argument ...

4.3BSD Net/2 (1991) mdoc.samples(7) up to current groff_mdoc(7) have:

  Symbolic
  The symbolic emphasis macro is generally a boldface macro in either
  the symbolic sense or the traditional English usage.
  Usage: .Sy symbol ...
  Example usage: .Sy Important Notice
  Example output: Important Notice

  Emphasis Macro
  Text may be stressed or emphasized with the `.Em' macro.  The usual
  font for emphasis is italic.
  Usage: .Em argument ...
  Example usage: .Em does not
  Example usage: .Em exceed 1024 .
  Example usage: .Em vide infra ) ) ,

4.4BSD (1993) mdoc(7) has in the macro overview:

  Sy: Symbolic (traditional English).
  Em: Emphasis (traditional English).

Our own MACRO OVERVIEW in mdoc(7) has:

  Physical markup:
  Em: italic font or underline (emphasis)
  Sy: boldface font (symbolic)

The central question is whether these macros should be considered
as semantic or as physical markup.  While the historic documents
may slightly, if inconsistently, favour the semantic standpoint,
in our MACRO OVERVIEW i called them "physical", and i'd like to
stick with that, for the following reason:  Even if we call them
semantic, we have to define such a broad range of semantic
meanings that translation into any other modern semantic markup
language, in particular HTML, becomes impossible.  In particular,
sometimes .Em would have to become <em>, sometimes <i>, .Sy
sometimes <strong> and sometimes <b>, but there is no way to
automatically decide which is the right one when finding one of
these macros in a manual page, so we would have to fall back to
physical markup anyway.  Calling them physical also reflects actual
usage better and isn't completely inconsistent with historical
documentation.
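
To illustrate the argument, here is a minimal Python sketch of such a
translator.  It is not mandoc's code (mandoc is written in C); the
names PHYSICAL_TAGS and to_html are hypothetical.  Since the macro
name is all a formatter sees, a fixed physical mapping is the only
choice it can make consistently:

```python
import html

# Hypothetical mapping for illustration: each macro always becomes the
# same physical HTML element, because the macro name alone does not say
# whether the author meant stress emphasis, a technical term,
# importance, or a verbatim syntax element.
PHYSICAL_TAGS = {"Em": "i", "Sy": "b"}


def to_html(macro: str, text: str) -> str:
    """Wrap text in the physical tag for macro, escaping HTML specials."""
    tag = PHYSICAL_TAGS[macro]
    return f"<{tag}>{html.escape(text)}</{tag}>"


# Two different meanings of .Em, one indistinguishable rendering.
print(to_html("Em", "not"))         # stress emphasis
print(to_html("Em", "hold space"))  # technical term
```

Choosing <em> for the first call and <i> for the second would
require information that is simply not present in the input.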

So, i'm proposing the patch appended below.  Note that i'm also
providing some (new) advice which one to use for what.  I'm well
aware that our manuals are not consistent about that, but i think
this is the right direction to move:

  bold:
  .Nm  for utility and page names
  .Fl  for utility options
  .Cm  for fixed strings passed as arguments
  .In  for include files
  .Fo  and .Fn for function names
  .Ms  for mathematical symbols
  .Sy  for other syntax elements to be given verbatim
  .Sy  for highlighting importance

  italic:
  .Pa  for file names
  .Vt  for variable types
  .Va  for variable names
  .Ft  for function return types
  .Fa  for function arguments
  .Ar  for other syntax element placeholders,
       in particular arguments to .Nm, .Fl, .Cm, and .Ic
  .Em  for other technical terms and placeholders, except syntax elements
  .Em  for stress emphasis

That list seems exhaustive to me, and rather close to existing practice.
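
The same list can be written down as a lookup table, for example to
check a page against it.  This is only a sketch in Python; the names
FONT_OF and fallback_macro are made up for illustration and do not
exist in mandoc:

```python
# Font assignments copied from the list above; all names in this
# sketch are hypothetical.
BOLD = ("Nm", "Fl", "Cm", "In", "Fo", "Fn", "Ms", "Sy")
ITALIC = ("Pa", "Vt", "Va", "Ft", "Fa", "Ar", "Em")

FONT_OF = {macro: "bold" for macro in BOLD}
FONT_OF.update({macro: "italic" for macro in ITALIC})


def fallback_macro(font: str) -> str:
    """Return the physical macro to use when no semantic macro fits."""
    return {"bold": "Sy", "italic": "Em"}[font]


print(FONT_OF["Ar"], fallback_macro("italic"))
```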

OK?
  Ingo


Index: mdoc.7
===================================================================
RCS file: /cvs/src/share/man/man7/mdoc.7,v
retrieving revision 1.116
diff -u -p -r1.116 mdoc.7
--- mdoc.7	8 Aug 2014 16:32:17 -0000	1.116
+++ mdoc.7	13 Aug 2014 20:09:51 -0000
@@ -1482,16 +1482,29 @@ See also
 and
 .Sx \&It .
 .Ss \&Em
-Denotes text that should be
-.Em emphasised .
-Note that this is a presentation term and should not be used for
-stylistically decorating technical terms.
-Depending on the output device, this is usually represented
-using an italic font or underlined characters.
+Request an italic font.
+If the output device does not provide that, underline.
 .Pp
-Examples:
-.Dl \&.Em Warnings!
-.Dl \&.Em Remarks :
+This is most often used for stress emphasis (not to be confused with
+importance, see
+.Sx \&Sy ) .
+In the rare cases where none of the semantic markup macros fit,
+it can also be used for technical terms and placeholders, except
+that for syntax elements,
+.Sx \&Sy
+and
+.Sx \&Ar
+are preferred, respectively.
+.Pp
+Examples:
+.Bd -literal -compact -offset indent
+Selected lines are those
+\&.Em not
+matching any of the specified patterns.
+Some of the functions use a
+\&.Em hold space
+to save the pattern space for subsequent retrieval.
+.Ed
 .Pp
 See also
 .Sx \&Bf ,
@@ -2652,10 +2665,24 @@ See also
 and
 .Sx \&Ss .
 .Ss \&Sy
-Format enclosed arguments in symbolic
-.Pq Dq boldface .
-Note that this is a presentation term and should not be used for
-stylistically decorating technical terms.
+Request a boldface font.
+.Pp
+This is most often used to indicate importance or seriousness (not to be
+confused with stress emphasis, see
+.Sx \&Em ) .
+When none of the semantic macros fit, it is also adequate for syntax
+elements that have to be given or that appear verbatim.
+.Pp
+Examples:
+.Bd -literal -compact -offset indent
+\&.Sy Warning :
+If
+\&.Sy s
+appears in the owner permissions, set-user-ID mode is set.
+This utility replaces the former
+\&.Sy dumpdir
+program.
+.Ed
 .Pp
 See also
 .Sx \&Bf ,
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv


* Re: mdoc(7): improve description of .Em and .Sy
  2014-08-13 21:22 mdoc(7): improve description of .Em and .Sy Ingo Schwarze
@ 2014-08-13 22:54 ` Guy Harris
  2014-08-14 16:07   ` Ingo Schwarze
       [not found] ` <20140814065720.GB7407@harkle.home.gateway>
  1 sibling, 1 reply; 5+ messages in thread
From: Guy Harris @ 2014-08-13 22:54 UTC (permalink / raw)
  To: tech; +Cc: jmc


On Aug 13, 2014, at 2:22 PM, Ingo Schwarze <schwarze@usta.de> wrote:

> The central question is whether these macros should be considered
> as semantic or as physical markup.  While the historic documents
> may slightly, if inconsistently, favour the semantic standpoint,
> in our MACRO OVERVIEW i called them "physical", and i'd like to
> stick with that, for the following reason:  Even if we call them
> semantic, we have to define such a broad range of semantic
> meanings that translation into any other modern semantic markup
> language, in particular HTML, becomes impossible.  In particular,
> sometimes .Em would have to become <em>, sometimes <i>, .Sy
> sometimes <strong> and sometimes <b>, but there is no way to
> automatically decide which is the right one when finding one of
> these macros in a manual page, so we would have to fall back to
> physical markup anyway.  Calling them physical also reflects actual
> usage better and isn't completely inconsistent with historical
> documentation.

Well, there's physical, as in "display this in italic characters", and there's physical, as in "present this in some form that indicates emphasis"; the latter is "physical" in that it explicitly affects presentation but not "physical" in the sense that it explicitly *specifies* presentation (it could be read in an emphatic tone by a browser for vision-impaired users).

Unfortunately, mdoc, unlike HTML, never had "display this in italic characters", and somebody who had a reason to want the text displayed in italic characters for reasons *other* than emphasis had to fall back on .Em, so maybe retroactively redefining .Em to mean "display this in italic characters" is the least bad choice (adding .I to mdoc wouldn't help people who want to write man pages that will work with older versions of mandoc and with the mdoc macros).



* Re: mdoc(7): improve description of .Em and .Sy
       [not found] ` <20140814065720.GB7407@harkle.home.gateway>
@ 2014-08-14 15:13   ` Ingo Schwarze
       [not found]     ` <20140814201627.GD7407@harkle.home.gateway>
  0 siblings, 1 reply; 5+ messages in thread
From: Ingo Schwarze @ 2014-08-14 15:13 UTC (permalink / raw)
  To: Jason McIntyre; +Cc: tech

Hi Jason,

> i'm not sure what difference there is between "stress emphasis"
> and "importance". to my mind, they are the same.

Citing from 
http://www.w3.org/html/wg/drafts/html/CR/text-level-semantics.html

  Stress emphasis
  ---------------
  The placement of stress emphasis changes the meaning of the
  sentence.  The element thus forms an integral part of the content.
  The precise way in which stress is used in this way depends on
  the language.

  These examples show how changing the stress emphasis changes the
  meaning.  First, a general statement of fact, with no stress:

  <p>Cats are cute animals.</p>

  By emphasizing the first word, the statement implies that the kind
  of animal under discussion is in question (maybe someone is asserting
  that dogs are cute):

  <p><em>Cats</em> are cute animals.</p>

  Moving the stress to the verb, one highlights that the truth of the
  entire sentence is in question (maybe someone is saying cats are not
  cute):

  <p>Cats <em>are</em> cute animals.</p>
  [...]


  Importance
  ----------
  The "strong" element represents strong importance, seriousness,
  or urgency for its contents.

  Importance: The strong element can be used in a heading, caption,
  or paragraph to distinguish the part that really matters from
  other parts of it that might be more detailed, more jovial, or
  merely boilerplate.
  [...]

  Seriousness: The strong element can be used to mark up a warning
  or caution notice.

  Urgency: The strong element can be used to denote contents that
  the user needs to see sooner than other parts of the document.
  [...]

  Changing the importance of a piece of text with the strong element
  does not change the meaning of the sentence.

I think this distinction has usually been made in typography.  In
a professionally typeset technical manual, you might find "grep -v
selects the lines that are <em>not</em> matched by the pattern" or
"<strong>Warning:</strong> sed -i can destroy your file", but hardly
the other way round.  Even if people couldn't explain the rules, a
conforming text subconsciously helps understanding, just like good
orthography helps understanding even for people who make many
spelling errors when writing themselves.

That's not the main point of my patch, though.  Sure, I'm also
trying to make the description of .Sy (right now: "Format enclosed
arguments in symbolic" - ?!?) a bit clearer, but the main point is
to clearly mark .Em and .Sy as physical markup.

Yours,
  Ingo

* Re: mdoc(7): improve description of .Em and .Sy
  2014-08-13 22:54 ` Guy Harris
@ 2014-08-14 16:07   ` Ingo Schwarze
  0 siblings, 0 replies; 5+ messages in thread
From: Ingo Schwarze @ 2014-08-14 16:07 UTC (permalink / raw)
  To: Guy Harris; +Cc: tech, jmc

Hi Guy,

Guy Harris wrote on Wed, Aug 13, 2014 at 03:54:55PM -0700:

> On Aug 13, 2014, at 2:22 PM, Ingo Schwarze <schwarze@usta.de> wrote:

>> The central question is whether these macros should be considered
>> as semantic or as physical markup.  While the historic documents
>> may slightly, if inconsistently, favour the semantic standpoint,
>> in our MACRO OVERVIEW i called them "physical", and i'd like to
>> stick with that, for the following reason:  Even if we call them
>> semantic, we have to define such a broad range of semantic
>> meanings that translation into any other modern semantic markup
>> language, in particular HTML, becomes impossible.  In particular,
>> sometimes .Em would have to become <em>, sometimes <i>, .Sy
>> sometimes <strong> and sometimes <b>, but there is no way to
>> automatically decide which is the right one when finding one of
>> these macros in a manual page, so we would have to fall back to
>> physical markup anyway.  Calling them physical also reflects actual
>> usage better and isn't completely inconsistent with historical
>> documentation.

> Well, there's physical, as in "display this in italic characters",

That's exactly what people usually mean when talking about
physical (or presentational or visual) markup, see for example

http://www.math.grin.edu/~rebelsky/Tutorials/Design/EdMedia97/logical-vs-physical.html
http://www.augustana.ab.ca/~mohrj/courses/2000.fall/csc110/lecture_notes/html.html
http://webtips.dan.info/logical.html
http://www.cs.tut.fi/~jKorpela/HTML3.2/4.5.html
...

> and there's physical, as in "present this in some form that
> indicates emphasis"; the latter is "physical" in that it explicitly
> affects presentation [...]

"Affecting but not specifying presentation" is almost the definition
of semantic (or logical or structural) markup, which is the opposite
of physical markup.  So i fear you are confusing the terms here.

> Unfortunately, mdoc, unlike HTML, never had "display this in italic
> characters", and somebody who had a reason to want the text displayed
> in italic characters for reasons *other* than emphasis had to fall back
> on .Em,

Not quite.  When the development of mdoc(7) started in about 1989,
it predated HTML and the web, and people weren't aware at that time
that the distinction of physical and semantic markup is all that
important.  For example Tim Berners-Lee's original WWW proposal
does not mention "formatting" at all:

  http://www.w3.org/History/1989/proposal.html

There is a document about future "HTML directions" as late as 1992 (!)
talking about the possible introduction of <bold> and <italic> tags -
that's what is called <b> and <i> now - mentioning the terms "physical"
and "logical" markup, but not even mentioning that logical markup should
usually be preferred:

  http://www.w3.org/History/19921103-hypertext/hypertext/WWW/MarkUp/Future.html

People were blurry about it at the time and didn't see that it
matters.  You can clearly see that in the 4.3BSD-Reno text, which
predates the HTML <bold> discussion by two years:

  Symbolic
  The symbolic request is really a boldface request.  The need for this
  macro has not been established, it is included 'just in case'.
  Usage: .Sy symbol ...
  Example output: something bold

  Emphasis Request
  A portion of text may be stressed or emphasized with the .Em request.
  The font used is commonly italic.
  Usage: .Em argument ...

One is described as physical ("boldface") with a fuzzy reference
to semantics ("symbolic"), the other is described as semantic
("emphasized").  People just didn't care back then.  That
neglect persists in the groff documentation to this day.

> so maybe retroactively redefining .Em to mean "display this in italic
> characters"

We are not retroactively redefining it.  The code always did just
that and the docs always hinted at both the formatting and the
semantics.  It's merely that at the time the documentation was
written, people didn't care about the distinction "<font>, usually
used for <purpose>" and "<purpose>, usually formatted in <font>".
So we are merely clarifying it.

> is the least bad choice (adding .I to mdoc wouldn't help people who
> want to write man pages that will work with older versions of mandoc
> and with the mdoc macros).

I don't see an urgent need to add any macros in this area either.

Yours,
  Ingo

* Re: mdoc(7): improve description of .Em and .Sy
       [not found]     ` <20140814201627.GD7407@harkle.home.gateway>
@ 2014-08-14 22:05       ` Ingo Schwarze
  0 siblings, 0 replies; 5+ messages in thread
From: Ingo Schwarze @ 2014-08-14 22:05 UTC (permalink / raw)
  To: Jason McIntyre; +Cc: tech

Hi Jason,

Jason McIntyre wrote on Thu, Aug 14, 2014 at 09:16:02PM +0100:
> On Thu, Aug 14, 2014 at 05:13:10PM +0200, Ingo Schwarze wrote:

>> Citing from 
>> http://www.w3.org/html/wg/drafts/html/CR/text-level-semantics.html
[...]

> these are nice tidy rules, though i note no example is given for
> importance.

Oh, they do have an example for importance, i just cut it for brevity.

> anyway, i'm just trying to say it's blurry.

That is certainly true.  All the more reason for trying to at least
make the idea as clear as possible, even though cases where the
application is ambiguous will no doubt occur in practice.

> emphasis is not
> just about "stress" (more a spoken thing anyway). yes, in that case you
> can change the meaning of a sentence. what about:
> 
> 	The shell is a
> 	.Em command line interpreter .
> 
> 	The shell is a
> 	.Sy command line interpreter .

Actually, it depends on the context.  I would consider the following
good usage:

  When espie@ complained what a
  .Em horribly
  designed programming language the shell is,
  halex@ tried to argue that it's not actually
  .Em that
  bad, and jmc@ remarked that it's just a
  .Em command line interpreter .

  Some commonly installed programs:
  .Bl -bullet
  .It
  scp is a
  .Em remote copy program .
  It is part of the OpenSSH suite...
  .It
  The shell is a
  .Sy command line interpreter .
  Many flavours are available: ksh, csh, zsh, bash, ...
  .It
  sed is a non-interactive
  .Em stream editor
  ...

> this is not stress emphasis, but i'd pick Em.
> maybe i'd be wrong (along with 1000 other man pages).

Hm, i'm not quite sure which context you are thinking of.

> yes, i can see the mess you're trying to clean up.
> i'm fine with your diffs, and haven't a better solution to hand.

Good, so it's in.

Yours,
  Ingo
