From: Ingo Schwarze <schwarze@usta.de>
To: "Dag-Erling Smørgrav" <des@des.no>
Cc: discuss@mdocml.bsd.lv
Subject: Re: Mdocdate
Date: Sat, 14 Nov 2015 02:22:01 +0100
Message-ID: <20151114012201.GL7344@athene.usta.de>
In-Reply-To: <86r3jua1ja.fsf@desk.des.no>

Hi Dag-Erling,

it looks like you are missing one of the chief design goals of
mandoc(1):  Compatibility.  That means, among other aspects,
byte-by-byte identical output with groff where that is possible
with reasonable effort and not manifestly absurd.  In maybe a dozen
cases, mandoc(1) deliberately deviates from groff output, but all
these are cases of blatantly invalid input syntax, where groff
produces completely garbled output and mandoc can easily produce
something less broken, of course providing prominent error messages
when run with -Werror, -Wall, or -Tlint.  For input that is not
blatantly invalid, we do not want to become incompatible with groff.

We also do not want to encourage sloppy input that may work with
mandoc(1) - if you have a version that is new enough - but won't
work with other formatters.

It is quite frustrating for document authors when they find out
much later that their pages, even though they looked good with
mandoc(1) and mandoc -Tlint didn't report any warnings, are just
plain broken elsewhere.

I think what you really want to do is generate .Dd lines with SVN.
In that case, it doesn't help you to change the rules of the game
in mandoc only.  You need to fix *all* formatters, or you
generate non-portable output.  Actually, it might be possible with
a bit of effort to get similar patches into groff and Heirloom;
Werner and Carsten are usually very helpful - but i'm not convinced
the format you propose makes much sense.  We really don't need
seconds and timezones in .Dd, and i wouldn't know how to explain
the need to Werner and Carsten.  Rather than trying to change the
world, i think it makes more sense to teach SVN to generate .Dd
lines that everyone will understand.

Dag-Erling Smoergrav wrote on Fri, Nov 13, 2015 at 06:52:57AM +0100:
> Ingo Schwarze <schwarze@usta.de> writes:
>> Dag-Erling Smoergrav <des@des.no> writes:

>>> Accept trailing text after a successful match (e.g. "$").

>> Why?  That seems like a step backwards, weakening syntax validation.

> Postel's law.

Exactly.  Authoring manuals is *sending* information to other people.
So the tools for writing documentation should help authors to be
conservative with that.

Note that mandoc(1) already is liberal with respect to the *accepting*
part.  If the date found in a document is invalid, no matter in which
way, it is shown to the end-user verbatim.  Mandoc already is more
liberal than groff in that respect.

What you are proposing is weakening the -Wall / -Tlint warnings
that authors see.  That will make end-users suffer on systems
that differ from the author's system.
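
To make the difference concrete, here is a minimal sketch in C of
strict versus lenient matching.  It uses the ISO-style form only for
brevity.  The names iso_date_strict() and iso_date_lenient() are made
up for this illustration and are not mandoc's actual code, and neither
function range-checks months or days.

```c
#include <stdio.h>

/*
 * Hypothetical helpers for illustration only.
 * Both try to match YYYY-MM-DD at the start of s.
 */

/* Strict: the date must be the entire string. */
int
iso_date_strict(const char *s)
{
	int	 y, m, d, n = 0;

	/* %n records how many characters were consumed. */
	if (sscanf(s, "%4d-%2d-%2d%n", &y, &m, &d, &n) != 3)
		return 0;
	return s[n] == '\0';
}

/* Lenient: anything after the date is silently ignored. */
int
iso_date_lenient(const char *s)
{
	int	 y, m, d;

	return sscanf(s, "%4d-%2d-%2d", &y, &m, &d) == 3;
}
```

Given a Subversion-style string such as "2036-02-07 06:28:16Z", the
strict variant fails, so a linter built on it can still warn the
author, while the lenient variant accepts the string without comment.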

>>> If the first character after a successful match is 'Z', use UTC
>>> instead of the local time zone.

>> That makes no sense.  Neither .TH nor .Dd syntax allows time formats
>> more fine-grained than days, so it makes no sense to worry about
>> time zones.

> It can make a difference of 1 day if a) the commit date was close to
> midnight or b) you are relatively close to the date line.  Subversion
> timestamps are always in UTC and include the time followed by a Z.

Yes, but so what?  Including that information will make groff display
*the current date* (the date the manual is formatted or displayed)
because the language definition doesn't allow such input.  Giving a sizeable
fraction of users something completely wrong all the time because
you want to prevent occasional subtle off-by-one errors in the
last-change date is not a reasonable tradeoff.
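
For reference, the off-by-one effect mentioned in the quoted text is
plain arithmetic.  The following is only a sketch: local_day_shift()
is a hypothetical helper, and the offsets are given as minutes east
of UTC rather than taken from real time zone data.

```c
/*
 * Hypothetical helper: given a UTC time of day and a zone offset
 * in minutes east of UTC, return by how many calendar days the
 * local date differs from the UTC date (-1, 0, or +1).
 */
int
local_day_shift(int utc_hour, int utc_min, int offset_min)
{
	int	 local = utc_hour * 60 + utc_min + offset_min;

	if (local < 0)
		return -1;		/* still the previous day locally */
	if (local >= 24 * 60)
		return 1;		/* already the next day locally */
	return 0;
}
```

For a commit at 06:28 UTC, a reader at UTC-8 gets a shift of -1 and
a reader at UTC+1 gets 0; a commit at 23:30 UTC gives +1 at UTC+1.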

>>>  - in mandoc_normdate():
>>>    Fix several logic errors (inverted tests),

>> Err, what?  Can you be more specific?  I don't see any logic errors
>> in there.  But your changes break several features.

> Please explain how
>  if (valid_date(str))
>     /* we didn't find a valid date */ ;
> is a feature.

You misread the code.  Maybe the new version (which i revised
using your suggestions, but without functional change) is clearer:

	/* Do not warn about the legacy man(7) format. */
	if ( ! a2time(&t, "%Y-%m-%d", in))
		mandoc_msg(MANDOCERR_DATE_BAD, parse, ln, pos, in);
	/* Use any non-mdoc(7) date verbatim. */
	return mandoc_strdup(in);

So the point is, a legacy man(7) format date is returned verbatim
and *NOT* converted to mdoc(7) format.  Groff behaves the same way
for man(7) documents, and that's where these dates occur in practice.
Note that in mdoc(7) documents, groff does *NOT* accept the man(7)
format at all, causing the current date to be displayed.  That's why
i should probably warn about man(7) style dates in mdoc(7) documents.
I haven't got round to doing that yet; it's a bit tricky to implement.

>>> add a test for the Subversion %d format,

>> That is not valid mdoc(7) .Dd syntax, so it can't be added.

> Says who?  Once again: Postel's law.

Admittedly, there is no mdoc(7) RFC.  But groff-mdoc(7) has been
the de-facto standard for two decades and still behaves that way.
mandoc(1) is becoming the de-facto standard as we speak and takes
great care to remain compatible with groff.  Cynthia Livingston's
original mdoc(7) v3 also behaves that way, see


and so does Heirloom roff.

>>> - in the man page:
>>>   Document how to use Mdocdate with Subversion.

>> No.  Please don't introduce yet another format, use something like
>>   Mdocdate="%b %d %Y "
>> or whatever the subversion syntax is for svn:keywords.

> That's not possible.  Mdocdate=%d is the closest you will get,
> otherwise I wouldn't have needed this patch.

Oh.  That's bad.  I assumed SVN would take a format string...

It doesn't buy you much to let SVN put in stuff that almost
no mdoc(7) parser understands.  If you want to autogenerate
mdoc(7) code from SVN, you will probably have to teach SVN
how to do it properly.  We did the same for CVS long ago.
I hope it won't be that hard...

>>> The current code (1.13.3) produces the following output:
>>> % while read d ; do echo ".Dd $d" | mandoc | tail -1 ; done <tests
>>>                                 February 7 2036
>>>                                 February 7 2036
>>>                                February 07 2036
>>>                                February 7, 2036
>>>                                February 7, 2036
>>>                                February 7, 2036
>>>                                   Feb 7 2036
>>>                                   Feb 7 2036
>>>                                   Feb 07 2036
>>>                                February 7, 2036
>>>                                February 7, 2036
>>>                                February 7, 2036
>>>                                   2036-02-07
>>>                                    2036-02-7
>>>                                    2036-2-07
>>>                                    2036-2-7
>>>                                February 7, 2036
>>>                        $Mdocdate: 2036-02-07 06:28:16Z $

>> Yes, that is correct behaviour.

> Not according to what you wrote earlier (the %Y-%m-%d forms should have
> been accepted but were not due to the logic bugs I mentioned which you
> claim are not bugs)

You are confusing *validity* and *output*.

If something is valid, that doesn't imply that it will be converted
to some other format.  It means that -Wall / -Tlint will produce
no warning.

The input is used verbatim both for valid legacy man(7) format
and for invalid input.
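
As a rough sketch of that distinction, covering only the fall-through
case and assuming a simplified whole-string check in place of the real
a2time() call:  the hypothetical function below never changes its
input; validity only decides whether a warning would be issued.  All
names are invented for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a whole-string "%Y-%m-%d" match. */
int
is_legacy_man_date(const char *in)
{
	int	 y, m, d, n = 0;

	return sscanf(in, "%4d-%2d-%2d%n", &y, &m, &d, &n) == 3 &&
	    in[n] == '\0';
}

/*
 * Hypothetical sketch: return a newly allocated verbatim copy of
 * the input (NULL on allocation failure).  *warn is set to 1 if a
 * linter should complain, 0 for the legacy man(7) form.
 */
char *
date_verbatim(const char *in, int *warn)
{
	char	*out;

	*warn = !is_legacy_man_date(in);
	if ((out = malloc(strlen(in) + 1)) != NULL)
		strcpy(out, in);
	return out;
}
```

Both "2036-02-07" and "2036-02-07 06:28:16Z" come back unchanged;
only the second one sets the warning flag.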

> I must say I'm really surprised by your attitude.  You seem to value
> rigid adherence to forty-year-old specifications over flexibility, ease
> of use and compatibility with third-party systems, even when the latter
> can be achieved without sacrificing the former.  That's laudable when
> you're implementing TLS or a message parser in a SOLAS system, but not
> when you're writing a text preparation system.  You'll just end up
> driving even more people away from mdoc and towards DocBook or Doxygen
> or whatever the flavor of the month is, for absolutely no benefit.

No, that's not what i'm trying to do.  I'm trying to make sure that
authors can find out whether what they write will be portable.
And i'm also trying to make sure that even when authors screwed up,
end-users see something reasonable.

You propose to make it impossible for authors to figure out whether
what they wrote might work elsewhere, and to show output to end-users
that is gratuitously different from what other formatters would
show them.  We have to solve your task in some better way than with
your original patch.
