9fans - fans of the OS Plan 9 from Bell Labs
 help / color / mirror / Atom feed
* [9fans] Re: Plan9 [v]seprint(2) API design bug
       [not found] ` <4fbbf814-c51e-4e92-a8e2-e77afac4cb3b@posixcafe.org>
@ 2025-02-09 23:05   ` Alejandro Colomar via 9fans
  2025-02-09 23:17     ` Alejandro Colomar via 9fans
  2025-02-10  0:21     ` Jacob Moody
  0 siblings, 2 replies; 4+ messages in thread
From: Alejandro Colomar via 9fans @ 2025-02-09 23:05 UTC (permalink / raw)
  To: Jacob Moody
  Cc: Rob Pike, Ken Thompson, cinap_lenrek, Ori Bernstein,
	Lennart Jablonka, Sigrid Solveig Haflínudóttir,
	Dan Cross, Russ Cox, onf, branden, Douglas McIlroy,
	Christopher Bazley, 9fans


[-- Attachment #1.1: Type: text/plain, Size: 10199 bytes --]

[CC += 9fans@]

Hi Jacob,

On Sun, Feb 09, 2025 at 03:51:13PM -0600, Jacob Moody wrote:
> On 2/9/25 14:59, Alejandro Colomar wrote:
> > Hi all,
> > 
> > I found a few addresses inspecting both the 9front and plan9port git
> > repositories.  If you think anyone else --or any mailing list-- should
> > be CCed, feel free to add them.  I also added Rob and Ken, which I
> > believe to be the authors of this API.
> > 
> > Some time ago, I discussed a design bug in Plan9's [v]seprint(2) with
> > Doug.  ISTR that we agreed on the bug, although I couldn't find where to
> > report it.  I hope this time I found the right people.
> 
> In general 9front specific comments should go to the 9front list 9front@9front.org
> more general questions regarding any sort of plan 9 may be best sent to 9fans@9fans.net
> There is enough overlap that it should be seen by the right people.

Thanks!  Are those lists public?  That is, can I post to those lists
without being subscribed?

> > The issue is both with the API, and its documentation.
> > 
> > So, I was writing a function for myself, for being able to chain
> > snprintf(3) calls, and ended up re-inventing Plan9's seprint(2) --I
> > didn't know it existed back then--.  Having invented it from scratch, I
> > was able to compare my design with seprint(2), and see some obvious
> > flaws in it.
> > 
> > The documentation for [v]seprint(2) isn't clear about what happens on
> > truncation.  Does the function return NULL?  Reading the implementation
> > seems to say no.  Does the function return a special value that allows
> > detecting truncation?  Reading the implementation seems to say no.
> > Hummm.  But then, it returns as it the input string had been short
> > enough to not truncate, thus silently truncating?  The documentation
> > seems to say so, and the implementation seems to confirm it.  This is
> > unfortunate.
> > 
> > There are two ways you could detect truncation:
> > 
> > -  Return the 'e' value on truncation.
> > -  Return a null pointer on truncation.
> 
> You can check, it's just a bit fuzzy. You can check if the return value
> points to the last valid byte of the input. Now this will not let you
> differentiate between the buffer being filled exactly up to the brim
> and if the input was truncated. However you could assume the worst case
> and be safe all the time in those scenarios. It's not great but it gets
> the job done.

Hmmm, yeah, so you can workaround it by wasting 1 byte.  Not a great
thing.  Not a great thing.  :|

Iff we can't fix seprint(2) for historic reasons, I think it would be
better to invent a new seprint2(2) and consider slowly migrating all
code to that API.  Existing code using seprint(2) might have many bugs,
and needs to be audited anyway.

> > The latter seems to be more appropriate.  My implementation has tried
> > both, and I'm finally settling on the null pointer.  That's the most
> > easy-to-use API, and is also safer (if one accidentally uses that null
> > pointer, it will easily abort the program, instead of overrunning
> > something).
> > 
> > I've called my function seprintf().  Here's how it's used:
> > 
> > 	#define endof(a)  (a + countof(a))
> > 
> > 	p = buf;
> > 	p = seprintf(p, endof(buf), "...");
> > 	p = seprintf(p, endof(buf), "...");
> > 	p = seprintf(p, endof(buf), "...");
> > 	p = seprintf(p, endof(buf), "...");
> > 	p = seprintf(p, endof(buf), "...");
> > 	p = seprintf(p, endof(buf), "...");
> > 	if (p == NULL) {
> > 		if (errno == E2BIG)
> > 			goto truncated;
> > 		goto failed;
> > 	}
> > 
> > If you search through 9front's source code, you'll find some weird calls
> > to seprint(2):
> > 
> > sys/src/libc/9sys/idn.c:214:		if((dp = seprint(dp, de, "%.*S", nr, rb)) == nil)
> > sys/src/libc/9sys/idn.c-215-			return -1;
> > sys/src/libc/9sys/idn.c-216-		if(dp >= de)
> > sys/src/libc/9sys/idn.c-217-			return -1;
> > 
> > It looks like it expects dp to possibly be a null pointer, which
> > shouldn't be possible except for programmer bugs.  Let's check the
> > implementation:
> > 
> > 	$ grepc -tfd seprint .
> > 	./src/lib9/fmt/seprint.c:char*
> > 	seprint(char *buf, char *e, char *fmt, ...)
> > 	{
> > 		char *p;
> > 		va_list args;
> > 
> > 		va_start(args, fmt);
> > 		p = vseprint(buf, e, fmt, args);
> > 		va_end(args);
> > 		return p;
> > 	}
> > 	$ grepc -tfd vseprint .
> > 	./src/lib9/fmt/vseprint.c:char*
> > 	vseprint(char *buf, char *e, char *fmt, va_list args)
> > 	{
> > 		Fmt f;
> > 
> > 		if(e <= buf)
> > 			return nil;
> > 		f.runes = 0;
> > 		f.start = buf;
> > 		f.to = buf;
> > 		f.stop = e - 1;
> > 		f.flush = 0;
> > 		f.farg = nil;
> > 		f.nfmt = 0;
> > 		VA_COPY(f.args,args);
> > 		fmtlocaleinit(&f, nil, nil, nil);
> > 		dofmt(&f, fmt);
> > 		VA_END(f.args);
> > 		*(char*)f.to = '\0';
> > 		return (char*)f.to;
> > 	}
> > 
> > The only way to return a null pointer is if buf>=e.  But how could that
> > happen?  Also, the test in <sys/src/libc/9sys/idn.c:216> seems really
> > impossible.  seprint(2) silently truncates, and returns a pointer to the
> > terminating NUL byte.  That can never be >=e, since 'e' is supposed to
> > be a pointer to one-after-the-last element, that is, one after the
> > terminating NUL byte.
> 
> We manually increment the returned value from seprint later in that function:
> /sys/src/libc/9sys/idn.c:220
> So buf can be >= e in this specific case, further investigation with other
> 9front folks reviewed the code and determined it is working as intended.
> Now it is not exactly clear,

But the test is performed before the increment.  How can it be possible?
The loop is:

200-	for(;;){
201-		nc = nr = 0;
202-		while(cp[nc] != 0){
203-			n = chartorune(&r, cp+nc);
204-			if(r == '.')
205-				break;
206-			if(nr >= nelem(rb))
207-				return -1;
208-			rb[nr++] = r;
209-			nc += n;
210-		}
211-		if(cistrncmp(cp, "xn--", 4) == 0)
212-			if((nr = punydecode(nc-4, cp+4, nelem(rb), rb)) < 0)
213-				return -1;
214:		if((dp = seprint(dp, de, "%.*S", nr, rb)) == nil)
215-			return -1;
216-		if(dp >= de)
217-			return -1;
218-		if(cp[nc] == 0)
219-			break;
220-		*dp++ = '.';
221-		cp += nc+1;
222-	}

In line 214, seprint(2) cannot possibly return anything >=de.  And
there's no increment between line 214 and line 216, so line 216 must be
dead code, isn't it?  The value of dp might be incremented later in line
220, but that's already after the test has happened, and thus cannot
make it non-dead code posthumously, can it?

> and relying on this behavior is not great
> so we are exploring perhaps changing this to make things more clear.
> http://okturing.com/src/23249/body is the current draft.

This looks better, but I'm not happy doing a transformation like that
claiming that nothing changes, where the current code clearly reads as
dead code.  I'd first make sure that the commit message really tells the
full story.

> > I suspect that we could fix seprint(2) by making it return a null
> > pointer on truncation, and it would most likely fix existing code, which
> > does expect a way to detect truncation.
> > 
> > Here's how I implemented my seprintf() for Linux.  It may need some
> > adaptation to Plan9, so I welcome your feedback about it.  But I think
> > the main ideas could be exported to Plan9.
> > 
> > 	$ grepc -htfd seprintf .
> > 	inline char *
> > 	seprintf(char *dst, char *end, const char *restrict fmt, ...)
> > 	{
> > 		char     *p;
> > 		va_list  ap;
> > 
> > 		va_start(ap, fmt);
> > 		p = vseprintf(dst, end, fmt, ap);
> > 		va_end(ap);
> > 
> > 		return p;
> > 	}
> > 	$ grepc -htfd vseprintf .
> > 	inline char *
> > 	vseprintf(char *dst, char *end, const char *restrict fmt, va_list ap)
> > 	{
> > 		int        len;
> > 		ptrdiff_t  size;
> > 
> > 		if (dst == NULL)
> > 			return NULL;
> > 
> > 		size = end - dst;
> > 		len = vstprintf(dst, size, fmt, ap);
> > 		if (len == -1)
> > 			return NULL;
> > 
> > 		return dst + len;
> > 	}
> > 	$ grepc -htfd vstprintf .
> > 	inline int
> > 	vstprintf(char *restrict s, int size, const char *restrict fmt, va_list ap)
> > 	{
> > 		int  len;
> > 
> > 		len = vsnprintf(s, size, fmt, ap);
> > 		if (len == -1)
> > 			return -1;
> > 		if (len >= size && size != 0) {
> > 			errno = E2BIG;
> > 			return -1;
> > 		}
> > 
> > 		return len;
> > 	}
> > 
> > What do you think?
> 
> In my personal opinion the issue is not so great that it would be worth changing
> the existing code (and all the callers). More "core" functions like this is used not
> only by internal code but by plenty of external code.

I suspect most of that external code is bogus, and probably depends on
some behavior that differs from the actual behavior of seprint(2), just
like the code we discussed above has dead code attempting to check for
truncation with invalid tests.

Here's one that's more obviously wrong, I think:

	sys/src/cmd/nusb/lib/dump.c:75:		s = seprint(s, e, " %.2ux", b[i]);
	sys/src/cmd/nusb/lib/dump.c-76-	if(s == e)
	sys/src/cmd/nusb/lib/dump.c-77-		fprint(2, "%s: usb/lib: hexdump: bug: small buffer\n", argv0);
	sys/src/cmd/nusb/lib/dump.c-78-	return dbuff;
	sys/src/cmd/nusb/lib/dump.c-79-}

s==e cannot be possibly true in line 76 right after the call to
seprint(2) in line 75.  That's necessarily dead code, or I'm getting
lost in the implementation of vseprint(2).

After checking many call sites, I think I agree with you that the API
cannot be fixed without breaking some code (code that is already broken
in other ways, but).  Maybe I'd add a new API and slowly move to it,
deprecating seprint(2).  Maybe call the new one seprint2(2)?

> These days I think we like to
> have a fairly good reason to grow the list of 9front idiosyncrasies compared to the
> original plan 9 for core interfaces like this.

I'd say having this bad API being code is in itself a problem, and a
better API would probably fix some bugs, and promote better code.


Have a lovely night!
Alex

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 228 bytes --]


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T90be9e76a4dae77d-M014d4797afb3ec6fcdb450ff
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [9fans] Re: Plan9 [v]seprint(2) API design bug
  2025-02-09 23:05   ` [9fans] Re: Plan9 [v]seprint(2) API design bug Alejandro Colomar via 9fans
@ 2025-02-09 23:17     ` Alejandro Colomar via 9fans
  2025-02-10  0:21     ` Jacob Moody
  1 sibling, 0 replies; 4+ messages in thread
From: Alejandro Colomar via 9fans @ 2025-02-09 23:17 UTC (permalink / raw)
  To: Jacob Moody
  Cc: Rob Pike, Ken Thompson, cinap_lenrek, Ori Bernstein,
	Lennart Jablonka, Sigrid Solveig Haflínudóttir,
	Dan Cross, Russ Cox, onf, branden, Douglas McIlroy,
	Christopher Bazley, 9fans


[-- Attachment #1.1: Type: text/plain, Size: 459 bytes --]

On Mon, Feb 10, 2025 at 12:05:58AM +0100, Alejandro Colomar wrote:
> > These days I think we like to
> > have a fairly good reason to grow the list of 9front idiosyncrasies compared to the
> > original plan 9 for core interfaces like this.
> 
> I'd say having this bad API being code is in itself a problem, and a

I meant s/code/core/

> better API would probably fix some bugs, and promote better code.

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 228 bytes --]


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T90be9e76a4dae77d-M7db37bb5a004b56a1bf2b482
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [9fans] Re: Plan9 [v]seprint(2) API design bug
  2025-02-09 23:05   ` [9fans] Re: Plan9 [v]seprint(2) API design bug Alejandro Colomar via 9fans
  2025-02-09 23:17     ` Alejandro Colomar via 9fans
@ 2025-02-10  0:21     ` Jacob Moody
  2025-02-10 11:37       ` Alejandro Colomar via 9fans
  1 sibling, 1 reply; 4+ messages in thread
From: Jacob Moody @ 2025-02-10  0:21 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Rob Pike, Ken Thompson, cinap_lenrek, Ori Bernstein,
	Lennart Jablonka, Sigrid Solveig Haflínudóttir,
	Dan Cross, Russ Cox, onf, branden, Douglas McIlroy,
	Christopher Bazley, 9fans

On 2/9/25 17:05, Alejandro Colomar wrote:
> [CC += 9fans@]
> 
> Hi Jacob,
> 
> On Sun, Feb 09, 2025 at 03:51:13PM -0600, Jacob Moody wrote:
>> On 2/9/25 14:59, Alejandro Colomar wrote:
>>> Hi all,
>>>
>>> I found a few addresses inspecting both the 9front and plan9port git
>>> repositories.  If you think anyone else --or any mailing list-- should
>>> be CCed, feel free to add them.  I also added Rob and Ken, which I
>>> believe to be the authors of this API.
>>>
>>> Some time ago, I discussed a design bug in Plan9's [v]seprint(2) with
>>> Doug.  ISTR that we agreed on the bug, although I couldn't find where to
>>> report it.  I hope this time I found the right people.
>>
>> In general 9front specific comments should go to the 9front list 9front@9front.org
>> more general questions regarding any sort of plan 9 may be best sent to 9fans@9fans.net
>> There is enough overlap that it should be seen by the right people.
> 
> Thanks!  Are those lists public?  That is, can I post to those lists
> without being subscribed?

9fans is, 9front is not.

>>> The issue is both with the API, and its documentation.
>>>
>>> So, I was writing a function for myself, for being able to chain
>>> snprintf(3) calls, and ended up re-inventing Plan9's seprint(2) --I
>>> didn't know it existed back then--.  Having invented it from scratch, I
>>> was able to compare my design with seprint(2), and see some obvious
>>> flaws in it.
>>>
>>> The documentation for [v]seprint(2) isn't clear about what happens on
>>> truncation.  Does the function return NULL?  Reading the implementation
>>> seems to say no.  Does the function return a special value that allows
>>> detecting truncation?  Reading the implementation seems to say no.
>>> Hummm.  But then, it returns as it the input string had been short
>>> enough to not truncate, thus silently truncating?  The documentation
>>> seems to say so, and the implementation seems to confirm it.  This is
>>> unfortunate.
>>>
>>> There are two ways you could detect truncation:
>>>
>>> -  Return the 'e' value on truncation.
>>> -  Return a null pointer on truncation.
>>
>> You can check, it's just a bit fuzzy. You can check if the return value
>> points to the last valid byte of the input. Now this will not let you
>> differentiate between the buffer being filled exactly up to the brim
>> and if the input was truncated. However you could assume the worst case
>> and be safe all the time in those scenarios. It's not great but it gets
>> the job done.
> 
> Hmmm, yeah, so you can workaround it by wasting 1 byte.  Not a great
> thing.  Not a great thing.  :|
> 
> Iff we can't fix seprint(2) for historic reasons, I think it would be
> better to invent a new seprint2(2) and consider slowly migrating all
> code to that API.  Existing code using seprint(2) might have many bugs,
> and needs to be audited anyway.

Sure let's audit existing code and fix things, I am very much interested
in making things better, but I don't think we should point at a couple of bugs
and decide we need to rip up the floor boards. More of the reasoning for why
I feel this way is later on in this mail.

>>> The latter seems to be more appropriate.  My implementation has tried
>>> both, and I'm finally settling on the null pointer.  That's the most
>>> easy-to-use API, and is also safer (if one accidentally uses that null
>>> pointer, it will easily abort the program, instead of overrunning
>>> something).
>>>
>>> I've called my function seprintf().  Here's how it's used:
>>>
>>>     #define endof(a)  (a + countof(a))
>>>
>>>     p = buf;
>>>     p = seprintf(p, endof(buf), "...");
>>>     p = seprintf(p, endof(buf), "...");
>>>     p = seprintf(p, endof(buf), "...");
>>>     p = seprintf(p, endof(buf), "...");
>>>     p = seprintf(p, endof(buf), "...");
>>>     p = seprintf(p, endof(buf), "...");
>>>     if (p == NULL) {
>>>             if (errno == E2BIG)
>>>                     goto truncated;
>>>             goto failed;
>>>     }
>>>
>>> If you search through 9front's source code, you'll find some weird calls
>>> to seprint(2):
>>>
>>> sys/src/libc/9sys/idn.c:214:                if((dp = seprint(dp, de, "%.*S", nr, rb)) == nil)
>>> sys/src/libc/9sys/idn.c-215-                        return -1;
>>> sys/src/libc/9sys/idn.c-216-                if(dp >= de)
>>> sys/src/libc/9sys/idn.c-217-                        return -1;
>>>
>>> It looks like it expects dp to possibly be a null pointer, which
>>> shouldn't be possible except for programmer bugs.  Let's check the
>>> implementation:
>>>
>>>     $ grepc -tfd seprint .
>>>     ./src/lib9/fmt/seprint.c:char*
>>>     seprint(char *buf, char *e, char *fmt, ...)
>>>     {
>>>             char *p;
>>>             va_list args;
>>>
>>>             va_start(args, fmt);
>>>             p = vseprint(buf, e, fmt, args);
>>>             va_end(args);
>>>             return p;
>>>     }
>>>     $ grepc -tfd vseprint .
>>>     ./src/lib9/fmt/vseprint.c:char*
>>>     vseprint(char *buf, char *e, char *fmt, va_list args)
>>>     {
>>>             Fmt f;
>>>
>>>             if(e <= buf)
>>>                     return nil;
>>>             f.runes = 0;
>>>             f.start = buf;
>>>             f.to = buf;
>>>             f.stop = e - 1;
>>>             f.flush = 0;
>>>             f.farg = nil;
>>>             f.nfmt = 0;
>>>             VA_COPY(f.args,args);
>>>             fmtlocaleinit(&f, nil, nil, nil);
>>>             dofmt(&f, fmt);
>>>             VA_END(f.args);
>>>             *(char*)f.to = '\0';
>>>             return (char*)f.to;
>>>     }
>>>
>>> The only way to return a null pointer is if buf>=e.  But how could that
>>> happen?  Also, the test in <sys/src/libc/9sys/idn.c:216> seems really
>>> impossible.  seprint(2) silently truncates, and returns a pointer to the
>>> terminating NUL byte.  That can never be >=e, since 'e' is supposed to
>>> be a pointer to one-after-the-last element, that is, one after the
>>> terminating NUL byte.
>>
>> We manually increment the returned value from seprint later in that function:
>> /sys/src/libc/9sys/idn.c:220
>> So buf can be >= e in this specific case, further investigation with other
>> 9front folks reviewed the code and determined it is working as intended.
>> Now it is not exactly clear,
> 
> But the test is performed before the increment.  How can it be possible?
> The loop is:
> 
> 200-  for(;;){
> 201-          nc = nr = 0;
> 202-          while(cp[nc] != 0){
> 203-                  n = chartorune(&r, cp+nc);
> 204-                  if(r == '.')
> 205-                          break;
> 206-                  if(nr >= nelem(rb))
> 207-                          return -1;
> 208-                  rb[nr++] = r;
> 209-                  nc += n;
> 210-          }
> 211-          if(cistrncmp(cp, "xn--", 4) == 0)
> 212-                  if((nr = punydecode(nc-4, cp+4, nelem(rb), rb)) < 0)
> 213-                          return -1;
> 214:          if((dp = seprint(dp, de, "%.*S", nr, rb)) == nil)
> 215-                  return -1;
> 216-          if(dp >= de)
> 217-                  return -1;
> 218-          if(cp[nc] == 0)
> 219-                  break;
> 220-          *dp++ = '.';
> 221-          cp += nc+1;
> 222-  }
> 
> In line 214, seprint(2) cannot possibly return anything >=de.  And
> there's no increment between line 214 and line 216, so line 216 must be
> dead code, isn't it?  The value of dp might be incremented later in line
> 220, but that's already after the test has happened, and thus cannot
> make it non-dead code posthumously, can it?

Yes line 216 is dead code, which is why it was removed in the new code.
Your original comment was that the check on line 214 was impossible,
I was commenting that no that check on line 214 is intentional.
Sorry, I should have been more specific.

>> and relying on this behavior is not great
>> so we are exploring perhaps changing this to make things more clear.
>> http://okturing.com/src/23249/body is the current draft.
> 
> This looks better, but I'm not happy doing a transformation like that
> claiming that nothing changes, where the current code clearly reads as
> dead code.  I'd first make sure that the commit message really tells the
> full story.

We just pushed a change that we're happy with.
Thanks for the bug report.

>>> I suspect that we could fix seprint(2) by making it return a null
>>> pointer on truncation, and it would most likely fix existing code, which
>>> does expect a way to detect truncation.
>>>
>>> Here's how I implemented my seprintf() for Linux.  It may need some
>>> adaptation to Plan9, so I welcome your feedback about it.  But I think
>>> the main ideas could be exported to Plan9.
>>>
>>>     $ grepc -htfd seprintf .
>>>     inline char *
>>>     seprintf(char *dst, char *end, const char *restrict fmt, ...)
>>>     {
>>>             char     *p;
>>>             va_list  ap;
>>>
>>>             va_start(ap, fmt);
>>>             p = vseprintf(dst, end, fmt, ap);
>>>             va_end(ap);
>>>
>>>             return p;
>>>     }
>>>     $ grepc -htfd vseprintf .
>>>     inline char *
>>>     vseprintf(char *dst, char *end, const char *restrict fmt, va_list ap)
>>>     {
>>>             int        len;
>>>             ptrdiff_t  size;
>>>
>>>             if (dst == NULL)
>>>                     return NULL;
>>>
>>>             size = end - dst;
>>>             len = vstprintf(dst, size, fmt, ap);
>>>             if (len == -1)
>>>                     return NULL;
>>>
>>>             return dst + len;
>>>     }
>>>     $ grepc -htfd vstprintf .
>>>     inline int
>>>     vstprintf(char *restrict s, int size, const char *restrict fmt, va_list ap)
>>>     {
>>>             int  len;
>>>
>>>             len = vsnprintf(s, size, fmt, ap);
>>>             if (len == -1)
>>>                     return -1;
>>>             if (len >= size && size != 0) {
>>>                     errno = E2BIG;
>>>                     return -1;
>>>             }
>>>
>>>             return len;
>>>     }
>>>
>>> What do you think?
>>
>> In my personal opinion the issue is not so great that it would be worth changing
>> the existing code (and all the callers). More "core" functions like this is used not
>> only by internal code but by plenty of external code.
> 
> I suspect most of that external code is bogus, and probably depends on
> some behavior that differs from the actual behavior of seprint(2), just
> like the code we discussed above has dead code attempting to check for
> truncation with invalid tests.
> 
> Here's one that's more obviously wrong, I think:
> 
>       sys/src/cmd/nusb/lib/dump.c:75:         s = seprint(s, e, " %.2ux", b[i]);
>       sys/src/cmd/nusb/lib/dump.c-76- if(s == e)
>       sys/src/cmd/nusb/lib/dump.c-77-         fprint(2, "%s: usb/lib: hexdump: bug: small buffer\n", argv0);
>       sys/src/cmd/nusb/lib/dump.c-78- return dbuff;
>       sys/src/cmd/nusb/lib/dump.c-79-}
> 
> s==e cannot be possibly true in line 76 right after the call to
> seprint(2) in line 75.  That's necessarily dead code, or I'm getting
> lost in the implementation of vseprint(2).

You're correct in your interpretation, this bug is also fixed now.
Thanks for the reports, if you find any more I'd be very happy to
have them sent our way.

> 
> After checking many call sites, I think I agree with you that the API
> cannot be fixed without breaking some code (code that is already broken
> in other ways, but).  Maybe I'd add a new API and slowly move to it,
> deprecating seprint(2).  Maybe call the new one seprint2(2)?

Maybe, we had talked about doing something like this after your first email.
There was a lack of consensus on if we'd like to do this at the moment.

>> These days I think we like to
>> have a fairly good reason to grow the list of 9front idiosyncrasies compared to the
>> original plan 9 for core interfaces like this.
> 
> I'd say having this bad API being code is in itself a problem, and a
> better API would probably fix some bugs, and promote better code.
> 

It's a little funky and perhaps a little sharp, but that is not something uncommon in C.
I can agree with you that simpler semantics would have been nice, but I think you're missing
some context of how the plan 9 community works at this point. We're really quite small and the
churn rate has historically been really low (9front has picked up some steam in comparison but
is still not anywhere near most other mainstream operating systems). We have people that use
quite old code all the time, or have their own existing code that might be going on a couple
decades old at this point. This is in part because "core" APIs don't tend to change or churn that
much. We have community members who are active that still do a lot of work on "original"
plan 9 with minimal patches. Things have drifted throughout the years, some drift more warranted
then others. We still like to share code around and I am bit hesitant to make things harder
to move back and forth.

Anyway, all this is to just give some context about some of our hesitation to make some changes
for an issue on paper that is annoying but is still possible to avoid.

I appreciate your review on the 9front code, always happy to have discussions like this :).


Thanks,
Jacob Moody

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T90be9e76a4dae77d-M051da777c7078e3f496c1aeb
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [9fans] Re: Plan9 [v]seprint(2) API design bug
  2025-02-10  0:21     ` Jacob Moody
@ 2025-02-10 11:37       ` Alejandro Colomar via 9fans
  0 siblings, 0 replies; 4+ messages in thread
From: Alejandro Colomar via 9fans @ 2025-02-10 11:37 UTC (permalink / raw)
  To: Jacob Moody
  Cc: Rob Pike, Ken Thompson, cinap_lenrek, Ori Bernstein,
	Lennart Jablonka, Sigrid Solveig Haflínudóttir,
	Dan Cross, Russ Cox, branden, Douglas McIlroy,
	Christopher Bazley, 9fans


[-- Attachment #1.1: Type: text/plain, Size: 7980 bytes --]

[Dropped onf@ per their request; updated Ken's address]

Hi Jacob,

On Sun, Feb 09, 2025 at 06:21:42PM -0600, Jacob Moody wrote:
> >> In general 9front specific comments should go to the 9front list 9front@9front.org
> >> more general questions regarding any sort of plan 9 may be best sent to 9fans@9fans.net
> >> There is enough overlap that it should be seen by the right people.
> > 
> > Thanks!  Are those lists public?  That is, can I post to those lists
> > without being subscribed?
> 
> 9fans is, 9front is not.

Ok.

> > Iff we can't fix seprint(2) for historic reasons, I think it would be
> > better to invent a new seprint2(2) and consider slowly migrating all
> > code to that API.  Existing code using seprint(2) might have many bugs,
> > and needs to be audited anyway.
> 
> Sure let's audit existing code and fix things, I am very much interested
> in making things better, but I don't think we should point at a couple of bugs
> and decide we need to rip up the floor boards. More of the reasoning for why
> I feel this way is later on in this mail.

[...]

> > 200-	for(;;){
> > 201-		nc = nr = 0;
> > 202-		while(cp[nc] != 0){
> > 203-			n = chartorune(&r, cp+nc);
> > 204-			if(r == '.')
> > 205-				break;
> > 206-			if(nr >= nelem(rb))
> > 207-				return -1;
> > 208-			rb[nr++] = r;
> > 209-			nc += n;
> > 210-		}
> > 211-		if(cistrncmp(cp, "xn--", 4) == 0)
> > 212-			if((nr = punydecode(nc-4, cp+4, nelem(rb), rb)) < 0)
> > 213-				return -1;
> > 214:		if((dp = seprint(dp, de, "%.*S", nr, rb)) == nil)
> > 215-			return -1;
> > 216-		if(dp >= de)
> > 217-			return -1;
> > 218-		if(cp[nc] == 0)
> > 219-			break;
> > 220-		*dp++ = '.';
> > 221-		cp += nc+1;
> > 222-	}
> > 
> > In line 214, seprint(2) cannot possibly return anything >=de.  And
> > there's no increment between line 214 and line 216, so line 216 must be
> > dead code, isn't it?  The value of dp might be incremented later in line
> > 220, but that's already after the test has happened, and thus cannot
> > make it non-dead code posthumously, can it?
> 
> Yes line 216 is dead code, which is why it was removed in the new code.
> Your original comment was that the check on line 214 was impossible,
> I was commenting that no that check on line 214 is intentional.
> Sorry, I should have been more specific.

Hmmm, makes sense.

> >> and relying on this behavior is not great
> >> so we are exploring perhaps changing this to make things more clear.
> >> http://okturing.com/src/23249/body is the current draft.
> > 
> > This looks better, but I'm not happy doing a transformation like that
> > claiming that nothing changes, where the current code clearly reads as
> > dead code.  I'd first make sure that the commit message really tells the
> > full story.
> 
> We just pushed a change that we're happy with.
> Thanks for the bug report.

You're welcome!  I've fetched the changes, and after your clarification,
they look good to me (I would have appreciated a mention about the
report(s) in the commit message, though:).  Thanks for fixing them!

> >> In my personal opinion the issue is not so great that it would be worth changing
> >> the existing code (and all the callers). More "core" functions like this is used not
> >> only by internal code but by plenty of external code.
> > 
> > I suspect most of that external code is bogus, and probably depends on
> > some behavior that differs from the actual behavior of seprint(2), just
> > like the code we discussed above has dead code attempting to check for
> > truncation with invalid tests.
> > 
> > Here's one that's more obviously wrong, I think:
> > 
> > 	sys/src/cmd/nusb/lib/dump.c:75:		s = seprint(s, e, " %.2ux", b[i]);
> > 	sys/src/cmd/nusb/lib/dump.c-76-	if(s == e)
> > 	sys/src/cmd/nusb/lib/dump.c-77-		fprint(2, "%s: usb/lib: hexdump: bug: small buffer\n", argv0);
> > 	sys/src/cmd/nusb/lib/dump.c-78-	return dbuff;
> > 	sys/src/cmd/nusb/lib/dump.c-79-}
> > 
> > s==e cannot be possibly true in line 76 right after the call to
> > seprint(2) in line 75.  That's necessarily dead code, or I'm getting
> > lost in the implementation of vseprint(2).
> 
> You're correct in your interpretation, this bug is also fixed now.
> Thanks for the reports, if you find any more I'd be very happy to
> have them sent our way.

Nice!  :)

> > After checking many call sites, I think I agree with you that the API
> > cannot be fixed without breaking some code (code that is already broken
> > in other ways, but).  Maybe I'd add a new API and slowly move to it,
> > deprecating seprint(2).  Maybe call the new one seprint2(2)?
> 
> Maybe, we had talked about doing something like this after your first email.
> There was a lack of consensus on if we'd like to do this at the moment.

Feel free to CC me on those emails.  I enjoy discussing these things . :)

> >> These days I think we like to
> >> have a fairly good reason to grow the list of 9front idiosyncrasies compared to the
> >> original plan 9 for core interfaces like this.
> > 
> > I'd say having this bad API being code is in itself a problem, and a
> > better API would probably fix some bugs, and promote better code.
> 
> It's a little funky and perhaps a little sharp, but that is not something uncommon in C.

Yup.

> I can agree with you that simpler semantics would have been nice, but I think you're missing
> some context of how the plan 9 community works at this point. We're really quite small and the
> churn rate has historically been really low (9front has picked up some steam in comparison but
> is still not anywhere near most other mainstream operating systems). We have people that use
> quite old code all the time, or have their own existing code that might be going on a couple
> decades old at this point. This is in part because "core" APIs don't tend to change or churn that
> much. We have community members who are active that still do a lot of work on "original"
> plan 9 with minimal patches. Things have drifted throughout the years, some drift more warranted
> then others. We still like to share code around and I am bit hesitant to make things harder
> to move back and forth.

Heh, a little bit ironic that the OS that departed from Unix to not
depend on 15-year-old mistakes and be free to innovate does now depend
on 40-year-old mistakes.  :D

I understand, though.

> Anyway, all this is to just give some context about some of our hesitation to make some changes
> for an issue on paper that is annoying but is still possible to avoid.
> 
> I appreciate your review on the 9front code, always happy to have discussions like this :).

We'll probably have more.  :)

The thing is, I've been researching on the C string APIs (<string.h>
and <strings.h>) for a few years already, auditing the shadow-utils
source code in Linux, and developing newer APIs that allow us to write
better code, free of bugs, while keeping the spirit of the string APIs.
That's how I ended up writing my own seprint(2)-like API.  (I guess Rob
and Ken had similar itches back then when they wrote it.)  That API has
already made me find and fix several bugs in shadow-utils.

I'm preparing to publish a string library, libs, with all these APIs,
and was checking existing code that looks similar, in case anything can
be done better.  Since I would like to take over the name (or actually,
a similar one), by naming my API seprintf() (it currently is named
stpeprintf()), it would be interesting if both had as-close-as-possible
semantics, to avoid bugs when switching from one to the other.

I will probably continue reading your source code every now and then,
and might report similar bugs in string-handling code.  And especially,
design bugs in APIs.  ;)

(BTW, strecpy(2) has similar design issues.)


Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 228 bytes --]


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T90be9e76a4dae77d-Ma6284b475fc32805cb9547fa
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-02-10 13:55 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <oqmwt5tnpd7g5ckeba76zak7vzkjentj4hpymi3ngft4oowsld@svyrfpl2bh5z>
     [not found] ` <4fbbf814-c51e-4e92-a8e2-e77afac4cb3b@posixcafe.org>
2025-02-09 23:05   ` [9fans] Re: Plan9 [v]seprint(2) API design bug Alejandro Colomar via 9fans
2025-02-09 23:17     ` Alejandro Colomar via 9fans
2025-02-10  0:21     ` Jacob Moody
2025-02-10 11:37       ` Alejandro Colomar via 9fans

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).