discuss@mandoc.bsd.lv
* Re: .Li hiccup (and \*(Pu issue)
       [not found]   ` <20100525141839.GG8074@bramka.kerhand.co.uk>
@ 2010-05-25 20:42     ` Kristaps Dzonsons
  2010-05-25 22:38       ` Jason McIntyre
  2010-05-25 23:28       ` Ingo Schwarze
  0 siblings, 2 replies; 5+ messages in thread
From: Kristaps Dzonsons @ 2010-05-25 20:42 UTC (permalink / raw)
  To: Jason McIntyre; +Cc: discuss

Note that this is being posted to the discuss list.  For newcomers: it 
reads in the standard bottom-posted way (jmc first, then me, then jmc).

>>> hi. found a blip whereby Li seems to be doing something weird. from
>>> expr.1:
>>>
>>> 	.It Ar expr1 Li : Ar expr2
>>>
>>> 	groff:	expr1 : expr2
>>> 	mandoc: expr1: expr2
>>>
>>> to be honest, Li is treating the colon as punctuation, and i'd kind of
>>> expect that. but groff does not. so i'm not sure what way you want to
>>> go.
>>
>> Seems this can be generalised: in-line macros treat closing punctuation
>> differently when it's used as the first term; namely, they treat it like
>> a normal word.  So
>>
>>   .Fl . a b
>>
>> becomes
>>
>>  -. -a -b
>>
>> while
>>
>>  .Fl a . b c
>>
>> becomes
>>
>>  -a. -b -c
>>
>> I've come to have a rule of thumb:  if we can't explain something in a
>> single mdoc.7 sentence, we shouldn't be following it.  I have no idea
>> how to explain this, so I think it should go into CAVEATS as crappy
>> behaviour we don't emulate.
>>
>> If you think otherwise, please let me know; I know where this behaviour
>> can be "fixed" in mdoc_macro.c.  Until then, I note that it already
>> exists in Ingo's TODO:
>>
>> http://mdocml.bsd.lv/cgi-bin/cvsweb/TODO?cvsroot=mdocml
> 
> well, i'm not sure. i can see reasons for and against. nothing is
> documented in mdoc.samples.7 really. it just says that macros can handle
> punctuation. but given that this would be an exception, and that the
> behaviour does not conform with groff, i'd be tempted to say we'd be
> better off emulating it. if not, i guess we need to document it clearly.
> 
> i'm not too fussed though. i can change the page if you think it better
> to keep current behaviour.

I respectfully suggest, in this case, that the invocations be fixed.  I 
think that people have confused `Li' as meaning "what follows is 
interpreted in literal mode", instead of "is rendered in a fixed font", 
as this behaviour is only used for `Li' and not for any other macro, to 
wit (on OpenBSD):

% egrep 'Li [:.;] ' `cat manuals.txt `
usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local ,
usr.sbin/bgpd/bgpd.conf.5:.Ar subtype Ar as-number Ns Li : Ns Ar local
usr.sbin/bgpd/bgpd.conf.5:.Ar subtype Ar IP Ns Li : Ns Ar local
usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local
usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local ,
usr.sbin/bgpd/bgpd.conf.5:.Ar subtype Ar as-number Ns Li : Ns Ar local
usr.sbin/bgpd/bgpd.conf.5:.Ar subtype Ar IP Ns Li : Ns Ar local
usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local .
usr.sbin/bgpd/bgpd.conf.5:.Ar IP Ns Li : Ns Ar local .
bin/expr/expr.1:.It Ar expr1 Li : Ar expr2
usr.bin/login/login.1:.Li : Ns Va style
usr.bin/login/login.1:.Ar user Ns Li : Ns Va style ) .

Except for expr.1, these are actually trying to work around the extra 
space that groff inserts!  So expr.1 is the only one that actually 
depends on this crappy behaviour.
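
The sorting of the egrep hits above can be sketched in code.  This is a minimal Python sketch, not anything from mandoc: the function name depends_on_spacing and the rule "an `Ns' right after the punctuation marks a spacing workaround" are assumptions read off the output, and the sample lines are copied from it.

```python
import re

# Same pattern as the egrep call: `Li', one of : . ; and a trailing space.
LI_PUNCT = re.compile(r"Li [:.;] ")


def depends_on_spacing(line):
    """Return True if `Li <punct>' is NOT followed by `Ns'.

    Assumption: a following `Ns' means the author was suppressing the
    extra space rather than relying on it.
    """
    match = LI_PUNCT.search(line)
    if match is None:
        return False
    rest = line[match.end():].split()
    return not rest or rest[0] != "Ns"


# A few of the invocations from the egrep output (file names dropped).
SAMPLE = [
    ".Ar as-number Ns Li : Ns Ar local ,",
    ".Ar subtype Ar IP Ns Li : Ns Ar local",
    ".It Ar expr1 Li : Ar expr2",
    ".Li : Ns Va style",
    ".Ar user Ns Li : Ns Va style ) .",
]

flagged = [line for line in SAMPLE if depends_on_spacing(line)]
print(flagged)
```

Under that assumption only the expr.1 line is flagged; every other hit follows the punctuation with `Ns'.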

> oh, and while looking at this, i see another issue - we now have no
> \*(Pu sequence for characters considered punctuation. at least
> mdoc.samples.7 uses this (probably nothing else). new groff does not
> recognise it, and its current equivalent man page, groff_mdoc.7 simply
> lists the punctuation characters. i'm guessing you won't see a need to
> add the sequence, so should i just adjust our page?

We can temporarily define a chars.in entry for \*(Pu, but only the 
mdoc.samples.7 pages in NetBSD and OpenBSD use it (NB, I don't know where 
FreeBSD's mdoc.samples.7 is to look at it, but a grep over all the source 
can't find the invocation).

Once our mdoc.7 replaces mdoc.samples.7, this won't be an issue.  I 
suggest having a downstream chars.in patch for it, as I don't want to 
put it in upstream and have people assume that it exists.  Then once 
mdoc.7 takes over mdoc.samples.7, it can be removed.

Thoughts?

Kristaps
--
 To unsubscribe send an email to discuss+unsubscribe@mdocml.bsd.lv

* Re: .Li hiccup (and \*(Pu issue)
  2010-05-25 20:42     ` .Li hiccup (and \*(Pu issue) Kristaps Dzonsons
@ 2010-05-25 22:38       ` Jason McIntyre
  2010-05-25 23:28       ` Ingo Schwarze
  1 sibling, 0 replies; 5+ messages in thread
From: Jason McIntyre @ 2010-05-25 22:38 UTC (permalink / raw)
  To: discuss

On Tue, May 25, 2010 at 10:42:33PM +0200, Kristaps Dzonsons wrote:
> 
> I respectfully suggest, in this case, that the invocations be fixed.  I 
> think that people have confused `Li' as meaning "what follows is 
> interpreted in literal mode", instead of "is rendered in a fixed font", 
> as this behaviour is only used for `Li' and not for any other macro, to 
> wit (on OpenBSD):
> 
> % egrep 'Li [:.;] ' `cat manuals.txt `
> usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local ,
> usr.sbin/bgpd/bgpd.conf.5:.Ar subtype Ar as-number Ns Li : Ns Ar local
> usr.sbin/bgpd/bgpd.conf.5:.Ar subtype Ar IP Ns Li : Ns Ar local
> usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local
> usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local ,
> usr.sbin/bgpd/bgpd.conf.5:.Ar subtype Ar as-number Ns Li : Ns Ar local
> usr.sbin/bgpd/bgpd.conf.5:.Ar subtype Ar IP Ns Li : Ns Ar local
> usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local .
> usr.sbin/bgpd/bgpd.conf.5:.Ar IP Ns Li : Ns Ar local .
> bin/expr/expr.1:.It Ar expr1 Li : Ar expr2
> usr.bin/login/login.1:.Li : Ns Va style
> usr.bin/login/login.1:.Ar user Ns Li : Ns Va style ) .
> 
> Except for expr.1, these are actually trying to fix groff's behaviour of 
> the extra space!  So expr.1 is the only one that actually depends on 
> this crappy behaviour.
> 

so i will change expr.1, if ingo agrees.

> >oh, and while looking at this, i see another issue - we now have no
> >\*(Pu sequence for characters considered punctuation. at least
> >mdoc.samples.7 uses this (probably nothing else). new groff does not
> >recognise it, and its current equivalent man page, groff_mdoc.7 simply
> >lists the punctuation characters. i'm guessing you won't see a need to
> >add the sequence, so should i just adjust our page?
> 
> We can temporarily define a chars.in for \*(Pu, but only mdoc.samples.7 
> in NetBSD and OpenBSD use this (NB, I don't know where FreeBSD's 
> mdoc.samples.7 is to look at it, but a grep over all the source can't 
> find the invocation).
> 
> Once our mdoc.7 replaces mdoc.samples.7, this won't be an issue.  I 
> suggest having a downstream chars.in patch for it, as I don't want to 
> put it in upstream and have people assume that it exists.  Then once 
> mdoc.7 takes over mdoc.samples.7, it can be removed.
> 
> Thoughts?
> 

i will change mdoc.samples.7 too, if ingo agrees.

jmc

* Re: .Li hiccup (and \*(Pu issue)
  2010-05-25 20:42     ` .Li hiccup (and \*(Pu issue) Kristaps Dzonsons
  2010-05-25 22:38       ` Jason McIntyre
@ 2010-05-25 23:28       ` Ingo Schwarze
  2010-05-26  0:49         ` Kristaps Dzonsons
  2010-05-26  6:56         ` Jason McIntyre
  1 sibling, 2 replies; 5+ messages in thread
From: Ingo Schwarze @ 2010-05-25 23:28 UTC (permalink / raw)
  To: discuss

Hi Kristaps,

Kristaps Dzonsons wrote on Tue, May 25, 2010 at 10:42:33PM +0200:
> Jason wrote:
>> Kristaps wrote:
>>> Jason wrote:

>>>> found a blip whereby Li seems to be doing something weird.
>>>> from expr.1:
>>>>
>>>>	.It Ar expr1 Li : Ar expr2
>>>>
>>>>	groff:	expr1 : expr2
>>>>	mandoc: expr1: expr2
>>>>
>>>> to be honest, Li is treating the colon as punctuation, and i'd kind of
>>>> expect that. but groff does not. so i'm not sure what way you want to
>>>> go.

>>> Seems this can be generalised: in-line macros treat closing punctuation
>>> differently when it's used as the first term; namely, they treat it like
>>> a normal word.  So
>>>
>>>  .Fl . a b
>>>
>>> becomes
>>>
>>> -. -a -b

I guess you misunderstand what is happening here.
The result is not Fl(.) Fl(a) but Fl() text(.) Fl(a).
Compare to

  .Ar : a

which gives

  file ...: a

where the dots are underlined, but the colon is not, so the colon is
outside the macro scope - of course, you can see that with .Ar, but
you can't see that with .Fl or .Li.

So, in an in-line macro, leading *opening* (and middle) punctuation will
delay macro opening, but leading *closing* punctuation will not.
That is, the logic must be:

  macro scope closed
  start argument loop
    if !open && (text || (closing punctuation && never open)):
      open macro scope
    if open && punctuation:
      close macro scope
    put text element
  end argument loop
  if open:
    close macro scope

Look at this:

  .Ar [ ( : ( :

It gives:

  [(file ...: (:

That is text([) text(() Ar() text(:) text(() (*) text(:);
note that Ar() is not again opened at (*) - so, closing
punctuation opens the macro only once, whereas text opens
it every time.
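
The logic above can be tried out with a small simulation.  This is only a Python sketch of the pseudocode, not mandoc's parser: parse_in_line and show are made-up names, and the three delimiter sets are an assumption about which characters count as opening, middle and closing punctuation.

```python
# Assumed delimiter classes; not taken from any real parser source.
OPENING = {"(", "["}
MIDDLE = {"|"}
CLOSING = {".", ",", ":", ";", ")", "]", "?", "!"}


def parse_in_line(macro, args):
    """Split the arguments of one in-line macro into scopes and plain text."""
    nodes = []
    scope = None          # words of the currently open scope, or None
    ever_open = False     # has any scope been opened on this line yet?
    for arg in args:
        is_closing = arg in CLOSING
        is_punct = is_closing or arg in OPENING or arg in MIDDLE
        # Text always opens a scope; closing punctuation only the first time.
        if scope is None and (not is_punct or (is_closing and not ever_open)):
            scope = []
            ever_open = True
        # Any punctuation closes an open scope (possibly an empty one).
        if scope is not None and is_punct:
            nodes.append((macro, scope))
            scope = None
        # Put the text element, inside the scope if one is open.
        if scope is not None:
            scope.append(arg)
        else:
            nodes.append(("text", [arg]))
    if scope is not None:
        nodes.append((macro, scope))
    return nodes


def show(nodes):
    """Format nodes in the Ar(...) text(...) notation used in this mail."""
    return " ".join(f"{name}({' '.join(words)})" for name, words in nodes)


print(show(parse_in_line("Ar", "[ ( : ( :".split())))
print(show(parse_in_line("Ar", [":", "a"])))
```

The first call prints text([) text(() Ar() text(:) text(() text(:), and the second prints Ar() text(:) Ar(a), which is the empty scope followed by the colon outside it.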

>>> I've come to have a rule of thumb:  if we can't explain something
>>> in a single mdoc.7 sentence, we shouldn't be following it.

I don't think we need to explain much about this, it is easy.
Perhaps just:

   In-line
     Opening of these macros is delayed by one or more leading "["
     and "(" characters.  These macros are interrupted by Reserved
     Characters, but reopen afterwards in case additional arguments
     follow.  They are closed by subsequent macros, after the maximum
     number of arguments is reached, or at the end of the line.

>> well, i'm not sure. i can see reasons for and against. nothing is
>> documented in mdoc.samples.7 really. it just says that macros can handle
>> punctuation. but given that this would be an exception,

I don't see an exception.
Isn't it playing by the rules?

>> and that the behaviour does not conform with groff, i'd be tempted
>> to say we'd be better off emulating it.

Unless i'm missing something, i tend to agree.

> I respectfully suggest, in this case, that the invocations be fixed.
> I think that people have confused `Li' as meaning "what follows is
> interpreted in literal mode", instead of "is rendered in a fixed
> font", as this behaviour is only used for `Li' and not for any other
> macro, to wit (on OpenBSD):
> 
> % egrep 'Li [:.;] ' `cat manuals.txt `
> usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local ,

Cargo-cult alert:  Why not just

  .Ar as-number : Ns Ar local ,

That seems to work with all our formatters, and i don't see why
it shouldn't work, or why we should put any more macros in there.

> usr.sbin/bgpd/bgpd.conf.5:.Ar subtype Ar as-number Ns Li : Ns Ar local
> usr.sbin/bgpd/bgpd.conf.5:.Ar subtype Ar IP Ns Li : Ns Ar local
> usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local
> usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local ,
> usr.sbin/bgpd/bgpd.conf.5:.Ar subtype Ar as-number Ns Li : Ns Ar local
> usr.sbin/bgpd/bgpd.conf.5:.Ar subtype Ar IP Ns Li : Ns Ar local
> usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local .
> usr.sbin/bgpd/bgpd.conf.5:.Ar IP Ns Li : Ns Ar local .
> bin/expr/expr.1:.It Ar expr1 Li : Ar expr2

This ought to parse as Ar(expr1) Li() text(:) Ar(expr2).
But currently we parse it as Ar(expr1) text(:) Ar(expr2).

I guess we shouldn't drop the empty .Li.
A macro following a macro will assert a space,
then the Li will find itself empty and not print anything,
then punctuation following a macro will use TERMP_NOSPACE
and not assert a second space before printing the colon.
I admit, not tested, but that is how it should work.
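
To make the spacing argument concrete, here is a toy model in Python.  It is not the terminal formatter: render is a made-up function, the node lists are written by hand, and the rule set (a macro or word asserts one space, closing punctuation asserts none) is an assumption that paraphrases the paragraph above.

```python
# Assumed set of closing punctuation; not taken from any real source file.
CLOSING = {".", ",", ":", ";", ")", "]", "?", "!"}


def render(nodes):
    """Render (name, words) nodes using the simplified spacing rules."""
    out = ""
    for name, words in nodes:
        if name == "text" and words[0] in CLOSING:
            # Closing punctuation is glued to whatever is already there.
            out += words[0]
            continue
        # A macro (even an empty one) or a plain word asserts one space.
        if out and not out.endswith(" "):
            out += " "
        out += " ".join(words)
    return out


with_li = [("Ar", ["expr1"]), ("Li", []), ("text", [":"]), ("Ar", ["expr2"])]
without_li = [("Ar", ["expr1"]), ("text", [":"]), ("Ar", ["expr2"])]

print(render(with_li))
print(render(without_li))
```

Keeping the empty Li in the tree gives "expr1 : expr2", dropping it gives "expr1: expr2", which are the two outputs quoted at the top of the thread.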

That said, i think

  .Ar expr1 No : Ar expr2

is easier to understand and might look better in variable-width
output - of course, it is currently broken in the same way as
in the .Li, .Fl and .Ar cases, the empty .No is missing from
the syntax tree just like the other ones.

Yours,
  Ingo

* Re: .Li hiccup (and \*(Pu issue)
  2010-05-25 23:28       ` Ingo Schwarze
@ 2010-05-26  0:49         ` Kristaps Dzonsons
  2010-05-26  6:56         ` Jason McIntyre
  1 sibling, 0 replies; 5+ messages in thread
From: Kristaps Dzonsons @ 2010-05-26  0:49 UTC (permalink / raw)
  To: discuss

[-- Attachment #1: Type: text/plain, Size: 346 bytes --]

This patch seems to fix these issues.  Ingo, let's cross over to tech@ 
if you have comments (it's lacking regression tests and documentation).

Note that `Ar' behaves differently between versions of groff: old groff 
doesn't print out the "file ..." when invoked as `.Ar [ ( : ( :'.  I 
vaguely remember this being discussed long ago.

Kristaps

[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 3739 bytes --]

? config.h
? config.log
? foo.1
? mandoc
? patch.txt
? regress/output
Index: mdoc_action.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mdoc_action.c,v
retrieving revision 1.61
diff -u -r1.61 mdoc_action.c
--- mdoc_action.c	24 May 2010 11:59:37 -0000	1.61
+++ mdoc_action.c	26 May 2010 00:48:56 -0000
@@ -57,6 +57,7 @@
 static	int	  post_display(POST_ARGS);
 static	int	  post_dt(POST_ARGS);
 static	int	  post_lb(POST_ARGS);
+static	int	  post_li(POST_ARGS);
 static	int	  post_nm(POST_ARGS);
 static	int	  post_os(POST_ARGS);
 static	int	  post_pa(POST_ARGS);
@@ -102,7 +103,7 @@
 	{ NULL, NULL }, /* Ft */ 
 	{ NULL, NULL }, /* Ic */ 
 	{ NULL, NULL }, /* In */ 
-	{ NULL, NULL }, /* Li */
+	{ NULL, post_li }, /* Li */
 	{ NULL, NULL }, /* Nd */ 
 	{ NULL, post_nm }, /* Nm */ 
 	{ NULL, NULL }, /* Op */
@@ -309,6 +310,28 @@
 	m->next = MDOC_NEXT_CHILD;
 
 	if ( ! mdoc_word_alloc(m, n->line, n->pos, m->meta.name))
+		return(0);
+	m->last = nn;
+	return(1);
+}
+
+
+/*
+ * The `Li' macro gets a mysterious space when invoked without any
+ * arguments.  Happens, e.g., `Ar expr Li : expr2'.
+ */
+static int
+post_li(POST_ARGS)
+{
+	struct mdoc_node *nn;
+	
+	if (n->child)
+		return(1);
+
+	nn = n;
+	m->next = MDOC_NEXT_CHILD;
+
+	if ( ! mdoc_word_alloc(m, n->line, n->pos, ""))
 		return(0);
 	m->last = nn;
 	return(1);
Index: mdoc_macro.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mdoc_macro.c,v
retrieving revision 1.68
diff -u -r1.68 mdoc_macro.c
--- mdoc_macro.c	17 May 2010 22:11:42 -0000	1.68
+++ mdoc_macro.c	26 May 2010 00:48:56 -0000
@@ -754,7 +754,7 @@
 static int
 in_line(MACRO_PROT_ARGS)
 {
-	int		 la, lastpunct, cnt, nc, nl;
+	int		 la, scope, cnt, nc, nl;
 	enum margverr	 av;
 	enum mdoct	 ntok;
 	enum margserr	 ac;
@@ -805,7 +805,7 @@
 		return(0);
 	}
 
-	for (cnt = 0, lastpunct = 1;; ) {
+	for (cnt = scope = 0;; ) {
 		la = *pos;
 		ac = mdoc_args(m, line, pos, buf, tok, &p);
 
@@ -826,7 +826,7 @@
 		 */
 
 		if (MDOC_MAX != ntok) {
-			if (0 == lastpunct && ! rew_elem(m, tok))
+			if (scope && ! rew_elem(m, tok))
 				return(0);
 			if (nc && 0 == cnt) {
 				if ( ! mdoc_elem_alloc(m, line, ppos, tok, arg))
@@ -853,14 +853,33 @@
 
 		d = ARGS_QWORD == ac ? DELIM_NONE : mdoc_isdelim(p);
 
-		if (ARGS_QWORD != ac && DELIM_NONE != d) {
-			if (0 == lastpunct && ! rew_elem(m, tok))
+		if (DELIM_NONE != d) {
+			/*
+			 * If we encounter closing punctuation, no word
+			 * has been emitted, no scope is open, and we're
+			 * allowed to have an empty element, then start
+			 * a new scope.  `Ar' and `Li', mysteriously,
+			 * only do this once per invocation.
+			 */
+			if (0 == cnt && (nc || MDOC_Li == tok) && 
+					DELIM_CLOSE == d && ! scope) {
+				if ( ! mdoc_elem_alloc(m, line, ppos, tok, arg))
+					return(0);
+				if (MDOC_Ar == tok || MDOC_Li == tok)
+					cnt++;
+				scope = 1;
+			}
+			/*
+			 * Close out our scope, if one is open, before
+			 * any punctuation.
+			 */
+			if (scope && ! rew_elem(m, tok))
 				return(0);
-			lastpunct = 1;
-		} else if (lastpunct) {
+			scope = 0;
+		} else if ( ! scope) {
 			if ( ! mdoc_elem_alloc(m, line, ppos, tok, arg))
 				return(0);
-			lastpunct = 0;
+			scope = 1;
 		}
 
 		if (DELIM_NONE == d)
@@ -873,14 +892,14 @@
 		 * word so that the `-' can be added to each one without
 		 * having to parse out spaces.
 		 */
-		if (0 == lastpunct && MDOC_Fl == tok) {
+		if (scope && MDOC_Fl == tok) {
 			if ( ! rew_elem(m, tok))
 				return(0);
-			lastpunct = 1;
+			scope = 0;
 		}
 	}
 
-	if (0 == lastpunct && ! rew_elem(m, tok))
+	if (scope && ! rew_elem(m, tok))
 		return(0);
 
 	/*

* Re: .Li hiccup (and \*(Pu issue)
  2010-05-25 23:28       ` Ingo Schwarze
  2010-05-26  0:49         ` Kristaps Dzonsons
@ 2010-05-26  6:56         ` Jason McIntyre
  1 sibling, 0 replies; 5+ messages in thread
From: Jason McIntyre @ 2010-05-26  6:56 UTC (permalink / raw)
  To: discuss

On Wed, May 26, 2010 at 01:28:18AM +0200, Ingo Schwarze wrote:
> > 
> > % egrep 'Li [:.;] ' `cat manuals.txt `
> > usr.sbin/bgpd/bgpd.conf.5:.Ar as-number Ns Li : Ns Ar local ,
> 
> Cargo-cult alert:  Why not just
> 
>   .Ar as-number : Ns Ar local ,
> 
> That seems to work with all our formatters, and i don't see why
> it shouldn't work, or why we should put any more macros in there.
> 

well, my reading of that example would be the author wants `:' in a
literal font. but after reading your mail, i'm not sure whether you
think that should be the result or not.

that is, i don't know whether you think i should change the expr.1
example and the \*(Pu example, or whether you think mandoc itself should
be changed. can you clarify this please? (then if it's a mandoc change,
i can bow out of the discussion ;)

but getting back to your example, no matter what was intended, i'd say
that your example without Li is much better anyway. Li is a colossal
waste of space.

> 
> That said, i think
> 
>   .Ar expr1 No : Ar expr2
> 
> is easier to understand and might look better in variable-width
> output - of course, it is currently broken in the same way as
> in the .Li, .Fl and .Ar cases, the empty .No is missing from
> the syntax tree just like the other ones.
> 

i think the No is needless here too.

jmc

Thread overview: 5+ messages
     [not found] <20100525065543.GB8074@bramka.kerhand.co.uk>
     [not found] ` <4BFBCEBB.4070205@bsd.lv>
     [not found]   ` <20100525141839.GG8074@bramka.kerhand.co.uk>
2010-05-25 20:42     ` .Li hiccup (and \*(Pu issue) Kristaps Dzonsons
2010-05-25 22:38       ` Jason McIntyre
2010-05-25 23:28       ` Ingo Schwarze
2010-05-26  0:49         ` Kristaps Dzonsons
2010-05-26  6:56         ` Jason McIntyre