Hi,

when working on mandoc -Thtml, we noticed that the descriptions
of .Em and .Sy in the mdoc(7) manual are of dubious quality.

There are four problems with what we have:

 1) For both .Em and .Sy, mdoc(7) says they should not be used
    to mark up technical terms.  This matches neither historical
    nor current practice, nor is it possible to follow that advice.
    Situations regularly arise where markup for technical terms
    is needed but none of the semantic macros fit.

 2) In these situations, mdoc(7) provides no advice whatsoever
    which macro to use.

 3) The examples given for .Em blatantly contradict the explanation
    given right above.  While the explanation talks about emphasis,
    the examples are typical for importance; see for example
      https://developer.mozilla.org/en-US/docs/Web/HTML/Element/em
      https://developer.mozilla.org/en-US/docs/Web/HTML/Element/strong
    for the distinction.  Also, the examples do not agree with
    current practice, neither in typography in general nor in
    our manuals, most of which correctly use .Sy for words
    like "Note:" and "Warning!"

 4) .Sy lacks any examples whatsoever.

To fix this, i dug up all existing versions of historical mdoc(7)
manuals to understand what these macros were originally intended
to be.

4.3BSD-Reno (1990) has in mdoc.samples(7):

  Symbolic
     The symbolic request is really a boldface request.
     The need for this macro has not been established,
     it is included 'just in case'.
     Usage: .Sy symbol ...
     Example output: something bold

  Emphasis Request
     A portion of text may be stressed or emphasized with the
     .Em request.  The font used is commonly italic.
     Usage: .Em argument ...

4.3BSD Net/2 (1991) mdoc.samples(7) up to current groff_mdoc(7) have:

  Symbolic
     The symbolic emphasis macro is generally a boldface macro in
     either the symbolic sense or the traditional English usage.
     Usage: .Sy symbol ...
     Example usage: .Sy Important Notice
     Example output: Important Notice

  Emphasis Macro
     Text may be stressed or emphasized with the `.Em' macro.
     The usual font for emphasis is italic.
     Usage: .Em argument ...
     Example usage: .Em does not
     Example usage: .Em exceed 1024 .
     Example usage: .Em vide infra ) ) ,

4.4BSD (1993) mdoc(7) has in the macro overview:

  Sy: Symbolic (traditional English).
  Em: Emphasis (traditional English).

Our own MACRO OVERVIEW in mdoc(7) has:

  Physical markup:
  Em: italic font or underline (emphasis)
  Sy: boldface font (symbolic)

The central question is whether these macros should be considered
as semantic or as physical markup.  While the historic documents
may slightly, if inconsistently, favour the semantic standpoint,
in our MACRO OVERVIEW i called them "physical", and i'd like to
stick with that, for the following reason:  Even if we call them
semantic, we have to define such a broad range of semantic
meanings that translation into any other modern semantic markup
language, in particular HTML, becomes impossible.  In particular,
sometimes .Em would have to become <em>, sometimes <i>, .Sy
sometimes <strong> and sometimes <b>, but there is no way to
automatically decide which is the right one when finding one of
these macros in a manual page, so we would have to fall back to
physical markup anyway.  Calling them physical also reflects actual
usage better and isn't completely inconsistent with historical
documentation.

So, i'm proposing the patch appended below.

Note that i'm also providing some (new) advice which one to use
for what.  I'm well aware that our manuals are not consistent
about that, but i think this is the right direction to move:

bold:
  .Nm for utility and page names
  .Fl for utility options
  .Cm for fixed strings passed as arguments
  .In for include files
  .Fo and .Fn for function names
  .Ms for mathematical symbols
  .Sy for other syntax elements to be given verbatim
  .Sy for highlighting importance

italic:
  .Pa for file names
  .Vt for variable types
  .Va for variable names
  .Ft for function return types
  .Fa for function arguments
  .Ar for other syntax element placeholders,
      in particular arguments to .Nm, .Fl, .Cm, and .Ic
  .Em for other technical terms and placeholders,
      except syntax elements
  .Em for stress emphasis

That list seems exhaustive to me, and rather close to existing
practice.

OK?
  Ingo


Index: mdoc.7
===================================================================
RCS file: /cvs/src/share/man/man7/mdoc.7,v
retrieving revision 1.116
diff -u -p -r1.116 mdoc.7
--- mdoc.7	8 Aug 2014 16:32:17 -0000	1.116
+++ mdoc.7	13 Aug 2014 20:09:51 -0000
@@ -1482,16 +1482,29 @@ See also
 and
 .Sx \&It .
 .Ss \&Em
-Denotes text that should be
-.Em emphasised .
-Note that this is a presentation term and should not be used for
-stylistically decorating technical terms.
-Depending on the output device, this is usually represented
-using an italic font or underlined characters.
+Request an italic font.
+If the output device does not provide that, underline.
 .Pp
-Examples:
-.Dl \&.Em Warnings!
-.Dl \&.Em Remarks :
+This is most often used for stress emphasis (not to be confused with
+importance, see
+.Sx \&Sy ) .
+In the rare cases where none of the semantic markup macros fit,
+it can also be used for technical terms and placeholders, except
+that for syntax elements,
+.Sx \&Sy
+and
+.Sx \&Ar
+are preferred, respectively.
+.Pp
+Examples:
+.Bd -literal -compact -offset indent
+Selected lines are those
+\&.Em not
+matching any of the specified patterns.
+Some of the functions use a
+\&.Em hold space
+to save the pattern space for subsequent retrieval.
+.Ed
 .Pp
 See also
 .Sx \&Bf ,
@@ -2652,10 +2665,24 @@ See also
 and
 .Sx \&Ss .
 .Ss \&Sy
-Format enclosed arguments in symbolic
-.Pq Dq boldface .
-Note that this is a presentation term and should not be used for
-stylistically decorating technical terms.
+Request a boldface font.
+.Pp
+This is most often used to indicate importance or seriousness (not to be
+confused with stress emphasis, see
+.Sx \&Em ) .
+When none of the semantic macros fit, it is also adequate for syntax
+elements that have to be given or that appear verbatim.
+.Pp
+Examples:
+.Bd -literal -compact -offset indent
+\&.Sy Warning :
+If
+\&.Sy s
+appears in the owner permissions, set-user-ID mode is set.
+This utility replaces the former
+\&.Sy dumpdir
+program.
+.Ed
 .Pp
 See also
 .Sx \&Bf ,
--
To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

On Aug 13, 2014, at 2:22 PM, Ingo Schwarze <schwarze@usta.de> wrote:
> The central question is whether these macros should be considered
> as semantic or as physical markup. While the historic documents
> may slightly, if inconsistently, favour the semantic standpoint,
> in our MACRO OVERVIEW i called them "physical", and i'd like to
> stick with that, for the following reason: Even if we call them
> semantic, we have to define such a broad range of semantic
> meanings that translation into any other modern semantic markup
> language, in particular HTML, becomes impossible. In particular,
> sometimes .Em would have to become <em>, sometimes <i>, .Sy
> sometimes <strong> and sometimes <b>, but there is no way to
> automatically decide which is the right one when finding one of
> these macros in a manual page, so we would have to fall back to
> physical markup anyway. Calling them physical also reflects actual
> usage better and isn't completely inconsistent with historical
> documentation.
Well, there's physical, as in "display this in italic characters", and there's physical, as in "present this in some form that indicates emphasis"; the latter is "physical" in that it explicitly affects presentation but not "physical" in the sense that it explicitly *specifies* presentation (it could be read in an emphatic tone by a browser for vision-impaired users).
Unfortunately, mdoc, unlike HTML, never had "display this in italic characters", and somebody who had a reason to want the text displayed in italic characters for reasons *other* than emphasis had to fall back on .Em, so maybe retroactively redefining .Em to mean "display this in italic characters" is the least bad choice (adding .I to mdoc wouldn't help people who want to write man pages that will work with older versions of mandoc and with the mdoc macros).
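
The translation problem quoted above can be sketched in a few lines. This is only an illustration under stated assumptions, not mandoc's actual code: the function names, the class attribute, and the choice of <b> and <i> as output tags are hypothetical, and the font table is simply the bold/italic list proposed earlier in this thread. The point is that a formatter sees only the macro name, so the only mapping it can apply consistently is the physical one.

```python
from html import escape

# Font assignments copied from the bold/italic list proposed in this
# thread; everything else here is a hypothetical illustration and not
# taken from the mandoc sources.
BOLD_MACROS = {"Nm", "Fl", "Cm", "In", "Fo", "Fn", "Ms", "Sy"}
ITALIC_MACROS = {"Pa", "Vt", "Va", "Ft", "Fa", "Ar", "Em"}


def physical_tag(macro: str) -> str:
    """Return the physical HTML tag for an mdoc macro name.

    Only the macro name is available, so there is nothing to base a
    choice between <em> and <i>, or between <strong> and <b>, on.
    """
    if macro in BOLD_MACROS:
        return "b"
    if macro in ITALIC_MACROS:
        return "i"
    raise KeyError(f"no font assigned to .{macro} in this sketch")


def render(macro: str, text: str) -> str:
    """Wrap text in the physical tag, keeping the macro name as a class."""
    tag = physical_tag(macro)
    return f'<{tag} class="{macro}">{escape(text)}</{tag}>'


if __name__ == "__main__":
    print(render("Em", "not"))
    print(render("Sy", "Warning:"))
```

Keeping the macro name in a class attribute is one way to preserve whatever information the source does carry, so that a stylesheet could still treat .Em and .Ar differently even though both come out italic.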

Hi Jason,

> i'm not sure what difference there is between "stress emphasis"
> and "importance". to my mind, they are the same.

Citing from
http://www.w3.org/html/wg/drafts/html/CR/text-level-semantics.html

  Stress emphasis
  ---------------
  The placement of stress emphasis changes the meaning of the
  sentence.  The element thus forms an integral part of the content.
  The precise way in which stress is used in this way depends on
  the language.

  These examples show how changing the stress emphasis changes the
  meaning.  First, a general statement of fact, with no stress:

    <p>Cats are cute animals.</p>

  By emphasizing the first word, the statement implies that the
  kind of animal under discussion is in question (maybe someone is
  asserting that dogs are cute):

    <p><em>Cats</em> are cute animals.</p>

  Moving the stress to the verb, one highlights that the truth of
  the entire sentence is in question (maybe someone is saying cats
  are not cute):

    <p>Cats <em>are</em> cute animals.</p>

  [...]

  Importance
  ----------
  The "strong" element represents strong importance, seriousness,
  or urgency for its contents.

  Importance: The strong element can be used in a heading, caption,
  or paragraph to distinguish the part that really matters from
  other parts of it that might be more detailed, more jovial, or
  merely boilerplate.  [...]

  Seriousness: The strong element can be used to mark up a warning
  or caution notice.

  Urgency: The strong element can be used to denote contents that
  the user needs to see sooner than other parts of the document.

  [...]

  Changing the importance of a piece of text with the strong element
  does not change the meaning of the sentence.

I think this distinction has usually been made in typography.
In a professionally typeset technical manual, you might find
"grep -v selects the lines that are <em>not</em> matched by
the pattern" or "<strong>Warning:</strong> sed -i can destroy
your file", but hardly the other way round.  Even if people
couldn't explain the rules, a conforming text subconsciously
helps understanding, just like good orthography helps
understanding even for people who make many spelling errors
when writing themselves.

That's not the main point of my patch, though.  Sure, I'm also
trying to make the description of .Sy (right now: "Format enclosed
arguments in symbolic" - ?!?) a bit clearer, but the main point
is to clearly mark .Em and .Sy as physical markup.

Yours,
  Ingo

Hi Guy,

Guy Harris wrote on Wed, Aug 13, 2014 at 03:54:55PM -0700:
> On Aug 13, 2014, at 2:22 PM, Ingo Schwarze <schwarze@usta.de> wrote:

>> The central question is whether these macros should be considered
>> as semantic or as physical markup.  While the historic documents
>> may slightly, if inconsistently, favour the semantic standpoint,
>> in our MACRO OVERVIEW i called them "physical", and i'd like to
>> stick with that, for the following reason:  Even if we call them
>> semantic, we have to define such a broad range of semantic
>> meanings that translation into any other modern semantic markup
>> language, in particular HTML, becomes impossible.  In particular,
>> sometimes .Em would have to become <em>, sometimes <i>, .Sy
>> sometimes <strong> and sometimes <b>, but there is no way to
>> automatically decide which is the right one when finding one of
>> these macros in a manual page, so we would have to fall back to
>> physical markup anyway.  Calling them physical also reflects actual
>> usage better and isn't completely inconsistent with historical
>> documentation.

> Well, there's physical, as in "display this in italic characters",

That's exactly what people usually mean when talking about physical
(or presentational or visual) markup, see for example

  http://www.math.grin.edu/~rebelsky/Tutorials/Design/EdMedia97/logical-vs-physical.html
  http://www.augustana.ab.ca/~mohrj/courses/2000.fall/csc110/lecture_notes/html.html
  http://webtips.dan.info/logical.html
  http://www.cs.tut.fi/~jKorpela/HTML3.2/4.5.html
  ...

> and there's physical, as in "present this in some form that
> indicates emphasis"; the latter is "physical" in that it explicitly
> affects presentation [...]

"Affecting but not specifying presentation" is almost the definition
of semantic (or logical or structural) markup, which is the opposite
of physical markup.  So i fear you are confusing the terms here.

> Unfortunately, mdoc, unlike HTML, never had "display this in italic
> characters", and somebody who had a reason to want the text displayed
> in italic characters for reasons *other* than emphasis had to fall back
> on .Em,

Not quite.

When the development of mdoc(7) started in about 1989, it predated
HTML and the web, and people weren't aware at that time that the
distinction of physical and semantic markup is all that important.
For example Tim Berners-Lee's original WWW proposal does not mention
"formatting" at all:

  http://www.w3.org/History/1989/proposal.html

There is a document about future "HTML directions" as late as
1992 (!) talking about the possible introduction of <bold> and
<italic> tags - that's what is called <b> and <i> now - mentioning
the terms "physical" and "logical" markup, but not even mentioning
that logical markup should usually be preferred:

  http://www.w3.org/History/19921103-hypertext/hypertext/WWW/MarkUp/Future.html

People were blurry about it at the time; they didn't see that it
matters.  You can clearly see that here in the 4.3BSD-Reno text
predating the HTML <bold> discussion by two years:

  Symbolic
     The symbolic request is really a boldface request.
     The need for this macro has not been established,
     it is included 'just in case'.
     Usage: .Sy symbol ...
     Example output: something bold

  Emphasis Request
     A portion of text may be stressed or emphasized with the
     .Em request.  The font used is commonly italic.
     Usage: .Em argument ...

One is described as physical ("boldface") with a fuzzy reference
to semantics ("symbolic"), the other is described as semantic
("emphasized").  People just didn't care back then.  That neglect
persists in the groff documentation to this day.

> so maybe retroactively redefining .Em to mean "display this in italic
> characters"

We are not retroactively redefining it.  The code always did just
that and the docs always hinted at both the formatting and the
semantics.  It's merely that at the time the documentation was
written, people didn't care about the distinction between
"<font>, usually used for <purpose>" and
"<purpose>, usually formatted in <font>".
So we are merely clarifying it.

> is the least bad choice (adding .I to mdoc wouldn't help people who
> want to write man pages that will work with older versions of mandoc
> and with the mdoc macros).

I don't see an urgent need to add any macros in this area either.

Yours,
  Ingo

Hi Jason,

Jason McIntyre wrote on Thu, Aug 14, 2014 at 09:16:02PM +0100:
> On Thu, Aug 14, 2014 at 05:13:10PM +0200, Ingo Schwarze wrote:

>> Citing from
>> http://www.w3.org/html/wg/drafts/html/CR/text-level-semantics.html
[...]
> these are nice tidy rules, though i note no example is given for
> importance.

Oh, they do have an example for importance, i just cut it for
brevity.

> anyway, i'm just trying to say it's blurry.

That is certainly true.  All the more reason for trying to at least
make the idea as clear as possible, even though cases where the
application is ambiguous will no doubt occur in practice.

> emphasis is not
> just about "stress" (more a spoken thing anyway). yes, in that case you
> can change the meaning of a sentence. what about:
>
> The shell is a
> .Em command line interpreter .
>
> The shell is a
> .Sy command line interpreter .

Actually, it depends on the context.
I would consider the following good usage:

  When espie@ complained what a
  .Em horribly designed
  programming language the shell is, halex@ tried to argue
  that it's not actually
  .Em that
  bad, and jmc@ remarked that it's just a
  .Em command line interpreter .

  Some commonly installed programs:
  .Bl -bullet
  .It
  scp is a
  .Em remote copy
  program.  It is part of the OpenSSH suite...
  .It
  The shell is a
  .Sy command line interpreter .
  Many flavours are available: ksh, csh, zsh, bash, ...
  .It
  sed is a non-interactive
  .Em stream editor
  ...

> this is not stress emphasis, but i'd pick Em.
> maybe i'd be wrong (along with 1000 other man pages).

Hm, i'm not quite sure which context you are thinking of.

> yes, i can see the mess you're trying to clean up.
> i'm fine with your diffs, and haven't a better solution to hand.

Good, so it's in.

Yours,
  Ingo