From: "Mario Blättermann" <mario.blaettermann@gmail.com>
To: discuss@mandoc.bsd.lv
Subject: Re: HTML output: section headers with diacritics not in table of contents
Date: Sat, 26 Mar 2022 14:35:04 +0100
Message-ID: <CAHi0vA_HN6mhZTKfePy_YN7Cjwbfk4tKaMagbQqw8W4FgNJgJw@mail.gmail.com>
In-Reply-To: <Yj8IRC2PWiP8ZDYO@asta-kit.de>

Hello Ingo,

On Sat, 26 Mar 2022 at 13:34, Ingo Schwarze <schwarze@usta.de> wrote:
> Hi Mario,
> Mario Blättermann wrote on Fri, Mar 25, 2022 at 05:07:13PM +0100:
> > On Fri, 25 Mar 2022 at 13:27, Ingo Schwarze <schwarze@usta.de> wrote:
> >> Mario Blättermann wrote on Thu, Mar 24, 2022 at 06:13:23PM +0100:
> >>> recently I switched from GNU man-db to mandoc. It's really a big
> >>> step ahead, especially regarding the creation of HTML pages, but it
> >>> has its own peculiarities …
> >>>
> >>> For creating an HTML man page I use the following command:
> >>>
> >>> mandoc -T html -O toc ./manpage.1 > manpage.1.html
> >> You should really use the -O style=... and -O man=... options in
> >> addition to the options you are already using.  Without "style",
> >> CSS support is next to absent; no real style sheet is linked to,
> >> and only a minimal style sheet is embedded with <style>, so minimal
> >> that many features cannot work.
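
For illustration, here is a minimal sketch of how the invocation recommended above could be assembled. The stylesheet name "mandoc.css" and the link template "%N.%S.html" are assumptions chosen for the example, and the helper mandoc_html_command is hypothetical; adjust both values to your own layout.

```python
import shlex


def mandoc_html_command(page, style="mandoc.css", man_fmt="%N.%S.html"):
    """Build an argv list for mandoc HTML output (hypothetical helper).

    -O takes comma-separated output options; here: toc, style=, man=.
    The default style and man_fmt values are assumptions for this sketch.
    """
    options = ",".join(["toc", f"style={style}", f"man={man_fmt}"])
    return ["mandoc", "-T", "html", "-O", options, page]


cmd = mandoc_html_command("./manpage.1")
# Show the command line as it could be typed in a shell; redirect the
# output to manpage.1.html as in the original command.
print(shlex.join(cmd))
```

The list form can be handed to subprocess.run() unchanged if you prefer to run mandoc from a script rather than from the shell.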
> > As far as I understand, proper TOC creation depends on a CSS file?
> No.  The TOC (in the sense Jan and i explained in earlier messages)
> does not need CSS.
> Then again, i guess what you meant here probably was "tagging depends
> on a CSS file".  That statement would be mostly misleading but arguably
> somewhat true to a lesser extent.  To understand what i mean, look at
> this HTML code (make sure to disable HTML in your mail user agent if you
> have it enabled):
>   <h1 class="Sh" id="DESCRIPTION">
>     <a class="permalink" href="#DESCRIPTION">DESCRIPTION</a>
>   </h1>
> Tagging involves four aspects:
>  1. The id= attribute of the h1 element shown above.
>     The value of that attribute, "DESCRIPTION", is called
>     the "tag" in mandoc(1), less(1), and ctags(1) parlance,
>     admittedly a bit unfortunately as "h1" is called a tag
>     in HTML parlance.  The mandoc/less/ctags tag is generated
>     if and only if the mdoc(7) or man(7) parser finds a section
>     title that looks sufficiently alphabetic.  It's intentional
>     that i use an imprecise wording here because what
>     "sufficiently alphabetic" means is the technical detail
>     that we are considering changing right now.  This tag
>     and id=-attribute is generated even if you do not use a CSS
>     file.
>  2. The 'a class="permalink"' element.  As long as a tag was
>     generated in no. 1 above, that element is also generated no
>     matter whether you use a CSS file or not.
>  3. Formatting of the h1 element depends on the stylesheet.
>     The following CSS properties are absent when you fail to
>     use a stylesheet:
>         margin-top: 1.2em;
>         margin-bottom: 0.6em;
>         margin-left: -3.2em;
>     Also, no tooltip is shown when you hover your mouse over
>     the h1 element unless you use the CSS file.
>     Arguably, none of this no. 3 is related to tagging.
>  4. Formatting of the "a" element depends on the stylesheet.
>     The following CSS properties are absent when you fail to
>     use a stylesheet:
>         color: inherit;
>         font: inherit;
>         text-decoration: inherit;
>         border-bottom: thin dotted;
>     Arguably, this no. 4 is related to tagging because these
>     properties determine how the presence of the tag is
>     indicated by the rendering.  Then again, without using
>     the stylesheet, the presence of the tag is usually also
>     indicated in whatever way is the default for the browser,
>     for example by a big blue font with a solid underline.
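
Aspects 1 and 2 in the list above are present in the HTML itself, independent of any stylesheet. A small sketch can make that visible by reading the id= attribute and the permalink target straight from the quoted snippet. The class name TagCollector is hypothetical; it relies only on Python's standard html.parser module and is not part of mandoc.

```python
from html.parser import HTMLParser

# The snippet quoted in the message above, unchanged.
SNIPPET = """
<h1 class="Sh" id="DESCRIPTION">
  <a class="permalink" href="#DESCRIPTION">DESCRIPTION</a>
</h1>
"""


class TagCollector(HTMLParser):
    """Collect h1 ids (aspect 1) and permalink targets (aspect 2)."""

    def __init__(self):
        super().__init__()
        self.heading_ids = []
        self.permalinks = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Aspect 1: the id= attribute of the h1 element is the "tag".
        if tag == "h1" and "id" in attributes:
            self.heading_ids.append(attributes["id"])
        # Aspect 2: the a element with class="permalink".
        if tag == "a" and attributes.get("class") == "permalink":
            self.permalinks.append(attributes.get("href"))


collector = TagCollector()
collector.feed(SNIPPET)
print(collector.heading_ids)
print(collector.permalinks)
```

No CSS is consulted anywhere in this sketch, which matches the point that only the rendering (aspects 3 and 4) depends on the stylesheet.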
> >> Without "man", you get no hyperlinks from .Xr macros.
> > It's not about hyperlinks, this works at least in Archmanweb, and on
> > my local machine I don't need such links
> I have no idea how Archmanweb might create hyperlinks for manual
> page references unless you pass the -O man= option to mandoc.
> Well, Archmanweb might perhaps tinker around with the generated
> HTML code after the fact, using some crude heuristics.  I don't
> know what Archmanweb does.
> >> Such mistranslations obviously not only happen for reserved words
> >> like "SYNOPSIS", but also in the main text of manual pages.
> >> That's why i hate translated manual pages so much.  Reading German
> >> manual pages, i usually find them pretty unintelligible.
> > OK, if you hate German man pages anyway, why get upset...?
> For several reasons.
> First and foremost, i really care about good documentation, so bad
> documentation bothers me.
> Secondly, once in a while machines maintained by other people
> (not my own machines, of course) show me German error messages
> and/or German documentation even if i don't ask for that, and
> having to do extra configuration work just to get an intelligible
> user interface on some random machine feels annoying to me.
> Finally, i am interested in questions of languages (formal and
> living) in general, even though i'm not a specialist for language
> theory (neither for formal nor for living languages).
> My former teacher in theoretical physics, Prof. Dahmen (whom i greatly
> respected in other matters) always made a point of strongly insisting that
> a thesis ought to be written in German because developing professional and
> technical terminology in all possible fields is crucial (in his opinion)
> to keep a living language alive.  Even though i always found the idea
> intriguing, i never managed to make up my mind whether that opinion is
> true or false, or rather: to which degree it is reasonable.
> But for technical terms in computer science, i fear German already is
> a dead language (in Prof. Dahmen's sense) whether we like it or not.
> Firmly established translations do exist for many technical terms in
> computer science (for example input = Eingabe), even more technical terms
> are firmly established as loanwords in German (for example hyperlink =
> Hyperlink, patch = Patch), but huge numbers of technical terms do not
> have a generally accepted and used translation to German.  In such cases,
> people sometimes simply use the English word when talking in German (for
> example diff = Diff), which may sometimes indicate that the establishment
> of a new loanword is in progress.  In many cases, a translation does not
> really exist.  A striking example from the example manual page you picked
> is the English technical term "unified diff".  The (admittedly meager)
> German Wikipedia page https://de.wikipedia.org/wiki/Diff works around
> the gap in the language by using the somewhat lengthy wording "Das
> sogenannte vereinheitlichte Format (unified diff)".  This solution feels
> completely adequate to me: it is easy to understand by both professionals
> and beginners, and the wording is also elegant from the perspective of
> the language.
> https://manpages.debian.org/bullseye/manpages-de/diff.1.de.html
> says, by contrast:
>   -u, -U ANZAHL, --unified[=ANZAHL]
>     ANZAHL Zeilen (Vorgabe 3) des vereinheitlichten Kontexts ausgeben
> That's completely unintelligible for a German native speaker unless they
> are also fluent in English *and* already know what the technical meanings
> of "unified" *and* of "context" are in this particular context.  The only
> way to understand this particular German wording is to translate the word
> "vereinheitlicht" back to English and then recognize that "unified" and
> "context" here both function as highly specialized technical terms
> and *neither* of them is to be interpreted in the everyday sense of the
> plain English words "unified" and "context".
This is an example of a German term found after some discussion
between translators, without developers involved. See below. Also keep
in mind that translators often try to translate as many terms as possible.
It happens very often that my reviewers complain about »Denglish«
terms. I remember a discussion some years ago where someone really
asked for »Herunterladung« instead of »Download« …

> I discuss this here in so much detail because i do care about such
> matters and because i think such considerations do have some bearing
> on the question which functionality matters to which degree in a
> formatting program for technical documentation.
> You cannot design a program well without considering how it should
> and how it should better not be used.
> >> So while i'm not aggressively trying to *not* support translated manual
> >> pages, i don't think translated manual pages are particularly relevant
> >> either.
> > OK, I understand. I don't expect any further efforts to get a better
> > TOC creation from your side. Maybe I can discuss this with the
> > Archmanweb developers.
> I fear you misunderstand.  I didn't mean to say, "fuck you, go to hell".
> I'd like to apologize if it sounded like that to you.
No, the problem was that I hadn't seen the rest of your mail because
it was truncated by the mail web interface. See my follow-up.

> I regularly consider features for implementation even when i consider
> them "not particularly relevant".  If something is not particularly
> relevant and causes huge effort or disruption, it is likely to be
> rejected.  But if something is easy to do, it might be worthwhile
> even if it only provides marginal benefit.
> No feature is implemented without carefully scrutinizing the design,
> though.
> Besides, i may be missing something and it might emerge that the defect
> you are talking about causes more trouble than i so far think, and
> the feature you are proposing provides more benefit than i so far
> recognize.  In another mail, i said "i feel like sitting on the
> fence."
> And finally, while the questions of how the formatter should handle a
> translated manual page and how translations can be improved to actually
> become useable are clearly somewhat related, in the following sense, they
> are at the same time close to orthogonal: *if* formatters get better at
> handling translated manuals, that also helps to make translated manuals
> better, no matter how the latter may be achieved in the text itself.
> Maybe not all hope is lost for reviving at least some of the most widely
> used native languages for this particular technical field, for example
> German, Spanish, and Japanese.  As i said, i'm not sure whether that
> is desirable, feasible, and if so, how.
Becoming a translator is one of the first steps for lots of users who
think about giving something back to the community. »Oh, I speak
reasonably good English, and I speak German, so maybe it's a good idea to
translate something«. But lots of such translations are sent to
developers without any review by an experienced translator, and older,
but still valid translations have never been reviewed, although the
appropriate .po file is under maintenance of a translation team for

And besides that, people with programming skills usually don't bother
with translations. So there's actually no intersection between
developers and translators: terms expected and needed by developers and
advanced users are not covered by the translations written by
non-developers. Moreover, often developers don't use a locale matching
their native language, so they don't see at all what's wrong.

Of course, I understand your complaints about translation quality.
Especially in the case of man pages we have another problem: For some
languages, old textual translations are shipped by the
distributions again and again, drifting further and further away from the
English versions. In manpages-l10n, we use Po4a to ease the pain of
keeping the translated versions up-to-date, but some teams, like the
Japanese team, don't do so. The current bash.1 is from GNU Bash 5.1,
but the Japanese version is from Bash 4.2, twelve years old. That's
why translated man pages still have a bad reputation regarding their
age, although the manpages-l10n versions will be released every three

But as long as we don't get enough feedback from the target audience,
we can't improve that much. The translation of »SYNOPSIS« to
»ÜBERSICHT« is probably the result of a discussion between involved
translators years ago, or even written by the first translator ever
and thoughtlessly taken over by all others. As long as no one complains
about it, we keep it.

Best Regards,