discuss@mandoc.bsd.lv
* Sx quoting and Header
@ 2010-12-25 15:22 Thomas Klausner
  2010-12-26 23:30 ` Ingo Schwarze
  2010-12-27  0:08 ` Ingo Schwarze
  0 siblings, 2 replies; 7+ messages in thread
From: Thomas Klausner @ 2010-12-25 15:22 UTC
  To: discuss; +Cc: Nicolas Joly

[-- Attachment #1: Type: text/plain, Size: 820 bytes --]

Hi!

Two issues reported by Nicolas Joly:

1. When the argument for .Sx is not quoted and contains a space, it
breaks the link. Example attached. The .Sx links differ as follows:
the one without the quotes has "20" in a place where the other has
"x20x". Shouldn't the argument of .Sx be taken as one item regardless
of whether it has quotes or not?

2. The header for man pages generated with mandoc doesn't have
"NetBSD" in the center. Example:
njoly@petaure [~]> nroff -mandoc /usr/share/man/man4/multicast.4 | head -n 1
MULTICAST(4)            NetBSD Kernel Interfaces Manual           MULTICAST(4)
njoly@petaure [~]> mandoc -Tascii /usr/share/man/man4/multicast.4 | head -n 1
MULTICAST(4)               Kernel Interfaces Manual               MULTICAST(4)
Not a biggy, but why not leave it as before? :)

Cheers,
 Thomas

[-- Attachment #2: test.1 --]
[-- Type: text/plain, Size: 160 bytes --]

.Dd December 25, 2010
.Dt TEST 1
.Os
.Sh NAME
.Nm test
.Nd test page
.Sh DESCRIPTION
.Ss Programming Guide
See
.Sx "Programming Guide"
or
.Sx Programming Guide


* Re: Sx quoting and Header
  2010-12-25 15:22 Sx quoting and Header Thomas Klausner
@ 2010-12-26 23:30 ` Ingo Schwarze
  2010-12-27  0:04   ` Kristaps Dzonsons
       [not found]   ` <20101227132623.GA285025@medusa.sis.pasteur.fr>
  2010-12-27  0:08 ` Ingo Schwarze
  1 sibling, 2 replies; 7+ messages in thread
From: Ingo Schwarze @ 2010-12-26 23:30 UTC
  To: discuss; +Cc: Nicolas Joly, Thomas Klausner

Hi Thomas and Nicolas,

gah, you are making *me* fix -Thtml bugs...  :)

Thomas Klausner wrote on Sat, Dec 25, 2010 at 04:22:04PM +0100:

> 1. When the argument for .Sx is not quoted and contains a space, it
> breaks the link. Example attached. The .Sx links differ as follows:
> the one without the quotes has "20" in a place where the other has
> "x20x". Shouldn't the argument of .Sx be taken as one item regardless
> of whether it has quotes or not?

Well, whether it is handled as one AST node internally or not
is not that important here (using -Ttree, you see that it is
indeed not the same, which does make sense in other contexts),
but the bug here is that the ID generator ought to handle
two words in the same way as one word with a blank in the middle.

OK to commit the following patch?

It makes sure the "x" protecting the beginning of the ID
only gets added before the first word, not before each word.
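The difference between the two behaviours can be sketched outside of mandoc. The following is a minimal illustration, not the html.c code itself: `hex_append`, `idcat_old` and `idcat_new` are hypothetical names, and the buffer handling is simplified to `size_t` lengths.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append src to dst as two hex digits per byte. */
static void
hex_append(char *dst, size_t dstsz, const char *src)
{
	size_t	 len = strlen(dst);

	/* Stop while there is still room for two digits and the NUL. */
	for ( ; *src != '\0' && len + 2 < dstsz; src++, len += 2)
		snprintf(dst + len, dstsz - len, "%.2x",
		    (unsigned int)(unsigned char)*src);
}

/* Old behaviour: every call prepends the protective 'x'. */
static void
idcat_old(char *dst, size_t dstsz, const char *src)
{
	size_t	 len = strlen(dst);

	if (len + 1 < dstsz) {
		dst[len] = 'x';
		dst[len + 1] = '\0';
	}
	hex_append(dst, dstsz, src);
}

/* Patched behaviour: only an empty buffer gets the 'x'. */
static void
idcat_new(char *dst, size_t dstsz, const char *src)
{
	if (dst[0] == '\0' && dstsz > 1) {
		dst[0] = 'x';
		dst[1] = '\0';
	}
	hex_append(dst, dstsz, src);
}
```

Feeding "A", " " and "b" one at a time through `idcat_old` gives "x41x20x62", while `idcat_new` gives "x412062", the same string as a single call with "A b".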

Thanks for reporting!

Yours,
  Ingo


> .Dd December 25, 2010
> .Dt TEST 1
> .Os
> .Sh NAME
> .Nm test
> .Nd test page
> .Sh DESCRIPTION
> .Ss Programming Guide
> See
> .Sx "Programming Guide"
> or
> .Sx Programming Guide


Index: html.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/html.c,v
retrieving revision 1.123
diff -d -u -p -r1.123 html.c
--- html.c	24 Dec 2010 14:14:00 -0000	1.123
+++ html.c	26 Dec 2010 23:10:43 -0000
@@ -771,20 +771,20 @@ html_idcat(char *dst, const char *src, i
 {
 	int		 ssz;
 
-	assert(sz);
+	assert(sz > 2);
 
 	/* Cf. <http://www.w3.org/TR/html4/types.html#h-6.2>. */
 
-	for ( ; *dst != '\0' && sz; dst++, sz--)
-		/* Jump to end. */ ;
-
-	assert(sz > 2);
-
 	/* We can't start with a number (bah). */
 
-	*dst++ = 'x';
-	*dst = '\0';
-	sz--;
+	if ('\0' == *dst) {
+		*dst++ = 'x';
+		*dst = '\0';
+		sz--;
+	}
+
+	for ( ; *dst != '\0' && sz; dst++, sz--)
+		/* Jump to end. */ ;
 
 	for ( ; *src != '\0' && sz > 1; src++) {
 		ssz = snprintf(dst, (size_t)sz, "%.2x", *src);
--
 To unsubscribe send an email to discuss+unsubscribe@mdocml.bsd.lv


* Re: Sx quoting and Header
  2010-12-26 23:30 ` Ingo Schwarze
@ 2010-12-27  0:04   ` Kristaps Dzonsons
  2010-12-27  0:36     ` Ingo Schwarze
       [not found]   ` <20101227132623.GA285025@medusa.sis.pasteur.fr>
  1 sibling, 1 reply; 7+ messages in thread
From: Kristaps Dzonsons @ 2010-12-27  0:04 UTC
  To: discuss; +Cc: Ingo Schwarze, Nicolas Joly, Thomas Klausner

On 27/12/2010 01:30, Ingo Schwarze wrote:
> Hi Thomas and Nicolas,
>
> gah, you are making *me* fix -Thtml bugs...  :)
>
> Thomas Klausner wrote on Sat, Dec 25, 2010 at 04:22:04PM +0100:
>
>> 1. When the argument for .Sx is not quoted and contains a space, it
>> breaks the link. Example attached. The .Sx links differ as follows:
>> the one without the quotes has "20" in a place where the other has
>> "x20x". Shouldn't the argument of .Sx be taken as one item regardless
>> of whether it has quotes or not?
>
> Well, whether it is handled as one AST node internally or not
> is not that important here (using -Ttree, you see that it is
> indeed not the same, which does make sense in other contexts),
> but the bug here is that the ID generator ought to handle
> two words in the same way as one word with a blank in the middle.
>
> OK to commit the following patch?
>
> It makes sure the "x" protecting the beginning of the ID
> only gets added before the first word, not before each word.
>
> Thanks for reporting!
>
> Yours,
>    Ingo
>
>
>> .Dd December 25, 2010
>> .Dt TEST 1
>> .Os
>> .Sh NAME
>> .Nm test
>> .Nd test page
>> .Sh DESCRIPTION
>> .Ss Programming Guide
>> See
>> .Sx "Programming Guide"
>> or
>> .Sx Programming Guide
>
>
> Index: html.c
> ===================================================================
> RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/html.c,v
> retrieving revision 1.123
> diff -d -u -p -r1.123 html.c
> --- html.c	24 Dec 2010 14:14:00 -0000	1.123
> +++ html.c	26 Dec 2010 23:10:43 -0000
> @@ -771,20 +771,20 @@ html_idcat(char *dst, const char *src, i
>   {
>   	int		 ssz;
>
> -	assert(sz);
> +	assert(sz > 2);
>
>   	/* Cf. <http://www.w3.org/TR/html4/types.html#h-6.2>. */
>
> -	for ( ; *dst != '\0' && sz; dst++, sz--)
> -		/* Jump to end. */ ;
> -
> -	assert(sz > 2);
> -
>   	/* We can't start with a number (bah). */
>
> -	*dst++ = 'x';
> -	*dst = '\0';
> -	sz--;
> +	if ('\0' == *dst) {
> +		*dst++ = 'x';
> +		*dst = '\0';
> +		sz--;
> +	}
> +
> +	for ( ; *dst != '\0' && sz; dst++, sz--)
> +		/* Jump to end. */ ;
>
>   	for ( ; *src != '\0' && sz > 1; src++) {
>   		ssz = snprintf(dst, (size_t)sz, "%.2x", *src);

Ingo,

Great!

To assuage the trauma of dealing with -T[x]html, I've a patch on hand 
that stashes the mdoc_validate.c concatenated `Sh' string (see 
post_sh()) into a mdoc_sh structure.  This buffer is then pumped right 
into html_idcat(), removing all those stupid loops over children.

The neat part about this is that, if I add a sorted list among these, I 
can do the same in `Sx' and add a check that makes sure `Sx' links 
actually go somewhere AND that sections/subsections/etc. aren't 
duplicates.  It's only little bits of code, but since this isn't checked 
by nroff, I'll post it here for a relevancy check before it goes in.

Note, if I haven't ever explained myself, that the hex encoding is just 
to get around the fact that IDs are case-insensitive, while -mdoc is 
case-sensitive (NAME != name)...
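A small sketch of why the hex encoding helps, assuming the consumer compares IDs without regard to case. `hex_id` and `same_ignoring_case` are hypothetical helpers written for this illustration, not mandoc functions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write 'x' plus two hex digits per byte of src. */
static void
hex_id(char *dst, size_t dstsz, const char *src)
{
	size_t	 len = 0;

	if (dstsz < 2) {
		if (dstsz > 0)
			dst[0] = '\0';
		return;
	}
	dst[len++] = 'x';
	dst[len] = '\0';
	for ( ; *src != '\0' && len + 2 < dstsz; src++, len += 2)
		snprintf(dst + len, dstsz - len, "%.2x",
		    (unsigned int)(unsigned char)*src);
}

/* Stand-in for a case-insensitive ID comparison. */
static int
same_ignoring_case(const char *a, const char *b)
{
	for ( ; *a != '\0' && *b != '\0'; a++, b++)
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
	return *a == *b;
}
```

"NAME" and "name" collide under `same_ignoring_case`, but their encodings "x4e414d45" and "x6e616d65" differ in their digits, so they stay distinct even when case is ignored.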

Kristaps

* Re: Sx quoting and Header
  2010-12-25 15:22 Sx quoting and Header Thomas Klausner
  2010-12-26 23:30 ` Ingo Schwarze
@ 2010-12-27  0:08 ` Ingo Schwarze
  1 sibling, 0 replies; 7+ messages in thread
From: Ingo Schwarze @ 2010-12-27  0:08 UTC
  To: discuss; +Cc: Thomas Klausner, Nicolas Joly

Hi Thomas, hi Nicolas,

Thomas Klausner wrote on Sat, Dec 25, 2010 at 04:22:04PM +0100:

> 2. The header for man pages generated with mandoc doesn't have
> "NetBSD" in the center. Example:
> njoly@petaure [~]> nroff -mandoc /usr/share/man/man4/multicast.4 | head -n 1
> MULTICAST(4)            NetBSD Kernel Interfaces Manual           MULTICAST(4)
> njoly@petaure [~]> mandoc -Tascii /usr/share/man/man4/multicast.4 | head -n 1
> MULTICAST(4)               Kernel Interfaces Manual               MULTICAST(4)
> Not a biggy, but why not leave it as before? :)

The newest groff (1.20.1) puts "BSD" into that place, see the
following line in upstream groff-1.20.1/tmac/doc-common:

  .ds doc-volume-operating-system BSD

When porting groff-1.20.1 to OpenBSD, i had to apply this local patch
to groff:

  -.ds doc-volume-operating-system BSD
  +.ds doc-volume-operating-system OpenBSD

Judging from the line

  .ds doc-volume-operating-system NetBSD

in

http://cvsweb.netbsd.org/bsdweb.cgi/src/gnu/usr.bin/groff/tmac/mdoc.local?rev=1.1

you basically did the same to NetBSD in 2003.

In mandoc, the way to do this adjustment is to locally customize
the file msec.in.  You can just prepend "NetBSD" to the relevant
strings in that file and commit to the NetBSD repository.
That file is specifically intended for operating system customization.
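As a rough sketch of what such a customization amounts to, assume msec.in consists of entries of the form LINE(section, title) that are expanded by a macro in the including C file. The `LINE` definition and the `section_title` function below are assumptions made for this illustration; check the actual msec.in and its consumer before relying on the exact shape.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed shape of an entry: LINE(section, title). */
#define LINE(sec, title) \
	if (strcmp(p, sec) == 0) return title;

/*
 * Hypothetical lookup.  The single entry below stands in for a
 * locally edited msec.in with "NetBSD" prepended to the title.
 */
static const char *
section_title(const char *p)
{
	LINE("4", "NetBSD Kernel Interfaces Manual")
	return NULL;
}
```

With this entry, a lookup for section "4" yields the title that groff prints on NetBSD, and unknown sections yield NULL.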

Yours,
  Ingo

* Re: Sx quoting and Header
  2010-12-27  0:04   ` Kristaps Dzonsons
@ 2010-12-27  0:36     ` Ingo Schwarze
  2010-12-27  8:39       ` Kristaps Dzonsons
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Schwarze @ 2010-12-27  0:36 UTC
  To: discuss

Hi Kristaps,

Kristaps Dzonsons wrote on Mon, Dec 27, 2010 at 02:04:09AM +0200:

> To assuage the trauma of dealing with -T[x]html, I've a patch on hand

OK, so i'm not committing.

> that stashes the mdoc_validate.c concatenated `Sh' string (see
> post_sh()) into a mdoc_sh structure.  This buffer is then pumped
> right into html_idcat(), removing all those stupid loops over
> children.
> 
> The neat part about this is that, if I add a sorted list among
> these, I can do the same in `Sx' and add a check that makes sure
> `Sx' links actually go somewhere AND that sections/subsections/etc.
> aren't duplicates.  It's only little bits of code, but since this
> isn't checked by nroff, I'll post it here for a relevancy check
> before it goes in.

I think that plan does make sense.
Of course, all this should only cause warnings, not errors.

Not sure whether a duplicate section or subsection, or a subsection
with the same name as a section, warrants a warning, though.
I guess a warning is only needed when a reference is ambiguous.
Thus, the list nodes should probably contain a flag with the values
OK, DUPE, WARNED, and when a DUPE gets referenced, a warning should
be issued and the flag advanced to WARNED.
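The flag idea can be sketched as follows. This is only an illustration under simplifying assumptions: it uses a fixed-size array with linear search instead of a sorted list, and all names (`sec_state`, `sec_node`, `sec_add`, `sec_ref`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sec_state { SEC_OK, SEC_DUPE, SEC_WARNED };

struct sec_node {
	const char	*title;
	enum sec_state	 state;
};

/*
 * Record a section title.  A repeated title marks the existing
 * node as a duplicate.  Returns 0 only if the table is full.
 */
static int
sec_add(struct sec_node *tab, size_t *n, size_t max, const char *title)
{
	size_t	 i;

	for (i = 0; i < *n; i++)
		if (strcmp(tab[i].title, title) == 0) {
			if (tab[i].state == SEC_OK)
				tab[i].state = SEC_DUPE;
			return 1;
		}
	if (*n == max)
		return 0;
	tab[*n].title = title;
	tab[*n].state = SEC_OK;
	(*n)++;
	return 1;
}

/*
 * Resolve a reference.  Returns 1 if an "ambiguous reference"
 * warning is due now, 0 if nothing needs reporting, and -1 if
 * the target does not exist.
 */
static int
sec_ref(struct sec_node *tab, size_t n, const char *title)
{
	size_t	 i;

	for (i = 0; i < n; i++)
		if (strcmp(tab[i].title, title) == 0) {
			if (tab[i].state == SEC_DUPE) {
				tab[i].state = SEC_WARNED;
				return 1;
			}
			return 0;
		}
	return -1;
}
```

A duplicate that is never referenced stays silent, and a referenced duplicate is reported once. The comparison is case-sensitive, matching the NAME != name remark quoted below.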

> Note, if I haven't ever explained myself, that the hex encoding is
> just to get around the fact that IDs are case-insensitive, while
> -mdoc is case-sensitive (NAME != name)...

Hm.  In mdoc(7), it could actually happen that we might get
both a `ds' and a `Ds' section.  So yes, properly dealing with
case appears to be relevant.

Yours,
  Ingo

* Re: Sx quoting and Header
  2010-12-27  0:36     ` Ingo Schwarze
@ 2010-12-27  8:39       ` Kristaps Dzonsons
  0 siblings, 0 replies; 7+ messages in thread
From: Kristaps Dzonsons @ 2010-12-27  8:39 UTC
  To: discuss; +Cc: Ingo Schwarze

On 27/12/2010 02:36, Ingo Schwarze wrote:
> Hi Kristaps,
>
> Kristaps Dzonsons wrote on Mon, Dec 27, 2010 at 02:04:09AM +0200:
>
>> To assuage the trauma of dealing with -T[x]html, I've a patch on hand
>
> OK, so i'm not committing.

Ingo,

These patches don't conflict, so feel free to do so (I don't touch 
html.c, only mdoc_validate.c and mdoc_html.c).

Thanks,

Kristaps

* Re: Sx quoting and Header
       [not found]   ` <20101227132623.GA285025@medusa.sis.pasteur.fr>
@ 2010-12-27 21:57     ` Ingo Schwarze
  0 siblings, 0 replies; 7+ messages in thread
From: Ingo Schwarze @ 2010-12-27 21:57 UTC
  To: Nicolas Joly; +Cc: discuss, Thomas Klausner

Hi Nicolas,

Nicolas Joly wrote on Mon, Dec 27, 2010 at 02:26:23PM +0100:

> This does not work for hrefs ... it does not add the "x" protection
> char because, in that case, *dst = '#' and not '\0'.

Oh indeed.  That was sloppy.
Thanks for catching the regression!
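The regression can be reproduced in a small standalone sketch, assuming the caller seeds the buffer with "#" before building an href. `hex_append`, `idcat_first` and `idcat_href` are hypothetical names; the second variant mirrors the idea of looking past a leading '#', not the exact html.c code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append src to dst as two hex digits per byte. */
static void
hex_append(char *dst, size_t dstsz, const char *src)
{
	size_t	 len = strlen(dst);

	for ( ; *src != '\0' && len + 2 < dstsz; src++, len += 2)
		snprintf(dst + len, dstsz - len, "%.2x",
		    (unsigned int)(unsigned char)*src);
}

/* First attempt: add 'x' only when the buffer is empty. */
static void
idcat_first(char *dst, size_t dstsz, const char *src)
{
	if (dst[0] == '\0' && dstsz > 1) {
		dst[0] = 'x';
		dst[1] = '\0';
	}
	hex_append(dst, dstsz, src);
}

/* Variant that skips a leading '#' before testing for emptiness. */
static void
idcat_href(char *dst, size_t dstsz, const char *src)
{
	char	*p = dst;
	size_t	 sz = dstsz;

	if (*p == '#') {
		p++;
		sz--;
	}
	if (*p == '\0' && sz > 1) {
		p[0] = 'x';
		p[1] = '\0';
	}
	hex_append(dst, dstsz, src);
}
```

Starting from "#", `idcat_first` with "A" produces "#41" and so loses the protective 'x', whereas `idcat_href` produces "#x41" and continues to "#x4120" when a blank is appended.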

> What about letting the caller do the protection instead ?

Hm, your patch is nice too, in particular it's a bit shorter.

On the other hand, doing things required for structural reasons
in one central place is not a bad idea either, it reduces the
risk of forgetting it at one of the several places, in particular
when adding code at a later time.

What finally decided the question is that Kristaps currently
has mdoc_html.c kind of locked, he is working on that very code,
and it will probably change.  Interfering with that seemed like
a bad idea; making html_idcat() a bit safer in parallel, on the
other hand, is not a problem.

Thanks again,
  Ingo


P.S.
Here is what i just committed:

Index: html.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/html.c,v
retrieving revision 1.123
diff -u -p -r1.123 html.c
--- html.c	24 Dec 2010 14:14:00 -0000	1.123
+++ html.c	27 Dec 2010 21:29:35 -0000
@@ -771,20 +771,24 @@ html_idcat(char *dst, const char *src, i
 {
 	int		 ssz;
 
-	assert(sz);
+	assert(sz > 2);
 
 	/* Cf. <http://www.w3.org/TR/html4/types.html#h-6.2>. */
 
-	for ( ; *dst != '\0' && sz; dst++, sz--)
-		/* Jump to end. */ ;
-
-	assert(sz > 2);
-
 	/* We can't start with a number (bah). */
 
-	*dst++ = 'x';
-	*dst = '\0';
-	sz--;
+	if ('#' == *dst) {
+		dst++;
+		sz--;
+	}
+	if ('\0' == *dst) {
+		*dst++ = 'x';
+		*dst = '\0';
+		sz--;
+	}
+
+	for ( ; *dst != '\0' && sz; dst++, sz--)
+		/* Jump to end. */ ;
 
 	for ( ; *src != '\0' && sz > 1; src++) {
 		ssz = snprintf(dst, (size_t)sz, "%.2x", *src);