mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Joakim Sindholt <opensource@zhasha.com>
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] pack posix_spawn file actions into extraneous padding
Date: Fri, 6 May 2022 19:07:16 +0200	[thread overview]
Message-ID: <20220506190716.0d0483778c4bf15daddc22a3@zhasha.com> (raw)
In-Reply-To: <20220504201010.GZ7074@brightrain.aerifal.cx>

[-- Attachment #1: Type: text/plain, Size: 3207 bytes --]

On Wed, 4 May 2022 16:10:10 -0400, Rich Felker <dalias@libc.org> wrote:
> On Fri, Feb 04, 2022 at 08:09:13PM +0100, Joakim Sindholt wrote:
> > On Sat, Jan 29, 2022 at 02:43:28PM +0100, Joakim Sindholt wrote:
> > > I've written this patch that packs posix_spawn file actions into the
> > > posix_spawn_file_actions struct padding. I'm not sure if the way I've
> > > split the code up is desirable. Maybe the faexpand() function needs a
> > > different name and to be in its own .c file; or maybe a different
> > > approach is preferable.
> 
> (I noted on IRC that we could also solve the problem of the "density"
> of the coding space by using the 32 bits of __pad0[1] to store type
> information for the 16 slots, but this does not seem necessary.)

Since close and dup2 are as densely packed as possible without getting
into bit packing, and those two are by far the most common cases, and
since open, the only remaining action that you might reasonably want to
call many times, also requires malloc every time (for the path), I don't
see a need to add complexity to the reader.

We should also consider the complexity of adding new actions.

> One change I think I would like to make is not using the names __pad0
> and __pad directly in code. It's ugly. Instead, fdop.h could do:
> 
> #define fa_cnt __pad0[0]
> #define fa_ops __pad
> 
> It might also make sense to define a macro for the fa_ops array limit
> 
> #define FA_OPS_MAX_CNT ...
> 
> using sizeof on a compound literal as appropriate.

I agree and I have integrated all of these into this new patch. On
multiple occasions I had to hunt down bugs related to confusing __pad
and __pad0 but I wasn't sure if this solution was acceptable.

> We could also for consistency (not using __ junk names from the public
> header directly) do
> 
> #define fa_chain __actions
> 
> Not sure if that's the best name for it but it seems to make things
> make more sense than writing __actions in the code that's no longer
> using the pointed-to data as the actions list.

Visually it's nicer just to be consistent with the other 2 fields being
fa_*.

Furthermore I have split faexpand into __faexpand in its own .c file to
avoid increasing the .text size of all the add* functions.

> I think I'm happy at this point with just doing "strace tests", but
> sometime between now and next release I would like to get a better set
> of tests into libc-test that cover usage of sufficiently many ops to
> require allocations. Something like a test function that takes a list
> of operations and builds both the file actions object and a shell
> command to evaluate that the file actions were executed correctly,
> using:
> 
> 	read x <&%d && test "$x" = %s
> 
> and:
> 
> 	! 2>/dev/null <&%d
> 
> to assert that each fd is either closed or reads the expected content
> (expected content could be temp files with different content in each).

Testing manually with strace is still a fair bit of work if you're
trying to hit the weird corner cases. I think it makes sense to just
make the tests now. It might take a little while before I get to it
though.

This patch still appears to work fine. I have also attached the skeleton
program I manually modify to test it.

[-- Attachment #2: 0001-pack-posix_spawn-file-actions-into-extraneous-paddin.patch --]
[-- Type: application/octet-stream, Size: 12766 bytes --]

From b25d5c8305409c34a874b527001996219e1337c7 Mon Sep 17 00:00:00 2001
From: Joakim Sindholt <opensource@zhasha.com>
Date: Fri, 4 Feb 2022 19:42:14 +0100
Subject: [PATCH] pack posix_spawn file actions into extraneous padding

The posix_spawn_file_actions struct contains 2 int paddings of 2 and 16
ints respectively, making them ideal for packing file actions into.
The encoding used is simple: __pad0[0] contains a count and __pad
contains the actuals elements. The most important actions, dup2 and
close, are encoded as 2 and 1 int respectively.

If the first element of a sequence is negative, then it's a close and
processing stops there.
If the second element of a sequence is positive, then it's a dup2 and
processing stops there.
If the second element is negative, however, then it contains a command
and requires per-command specific processing.
In total there are 5 file actions. Their encodings are:

close: -fd-1
dup2: fd, srcfd
fchdir: fd, -FDOP_FCHDIR
chdir: INT_MAX, -FDOP_CHDIR
open: fd, -FDOP_OPEN, flags, mode

Some file actions need a string. When a string is needed it's allocated
on the tail end of a new file_actions struct. When a string is needed or
an action requires more space than is available then a new struct is
allocated and __actions is set to point to it. All the elements of the
first file_actions struct are copied to the new one. The new
file_actions struct, being the second in the list, always points to the
last file_actions struct in the list. From then on the structs all point
back in reverse order until they reach the second one again, creating a
loop. This rube-goldberg machine exists because there's nowhere to store
a last pointer for appending new actions efficiently.
As a result of this backwards allocation, the actions requiring strings
aren't stored in the same struct as the strings themselves. Rather the
string gets stored in the logically previous struct when reading.

This packing is far from optimal but it is very simple and it allows for
up to 16 close, 8 dup2, or any combination of them, without needing to
call malloc. At worst it calls malloc as much as before.
---
 src/process/__faexpand.c                      | 17 +++++
 src/process/fdop.h                            | 18 +++--
 src/process/posix_spawn.c                     | 66 ++++++++++++-------
 .../posix_spawn_file_actions_addchdir.c       | 15 ++---
 .../posix_spawn_file_actions_addclose.c       | 11 +---
 .../posix_spawn_file_actions_adddup2.c        | 13 ++--
 .../posix_spawn_file_actions_addfchdir.c      | 13 ++--
 .../posix_spawn_file_actions_addopen.c        | 19 +++---
 .../posix_spawn_file_actions_destroy.c        | 11 ++--
 src/process/posix_spawn_file_actions_init.c   |  4 +-
 10 files changed, 106 insertions(+), 81 deletions(-)
 create mode 100644 src/process/__faexpand.c

diff --git a/src/process/__faexpand.c b/src/process/__faexpand.c
new file mode 100644
index 00000000..f1d8c6fc
--- /dev/null
+++ b/src/process/__faexpand.c
@@ -0,0 +1,17 @@
+#include <spawn.h>
+#include <stdlib.h>
+#include "fdop.h"
+
+int __faexpand(posix_spawn_file_actions_t *fa, size_t str)
+{
+	posix_spawn_file_actions_t *n, *blk;
+	blk = malloc(sizeof *blk + str);
+	if (!blk) return -1;
+	*blk = *fa;
+	n = fa->fa_chain;
+	fa->fa_chain = blk;
+	blk->fa_chain = n->fa_chain;
+	n->fa_chain = blk;
+	fa->fa_cnt = 0;
+	return 0;
+}
diff --git a/src/process/fdop.h b/src/process/fdop.h
index 7cf733b2..5696e4ab 100644
--- a/src/process/fdop.h
+++ b/src/process/fdop.h
@@ -1,17 +1,21 @@
+#include <stddef.h>
+
 #define FDOP_CLOSE 1
 #define FDOP_DUP2 2
 #define FDOP_OPEN 3
 #define FDOP_CHDIR 4
 #define FDOP_FCHDIR 5
 
-struct fdop {
-	struct fdop *next, *prev;
-	int cmd, fd, srcfd, oflag;
-	mode_t mode;
-	char path[];
-};
-
 #define malloc __libc_malloc
 #define calloc __libc_calloc
 #define realloc undef
 #define free __libc_free
+
+#define fa_chain __actions
+#define fa_ops __pad
+#define fa_cnt __pad0[0]
+#define FA_CNT_MAX \
+	(sizeof((posix_spawn_file_actions_t){0}.fa_ops) \
+	/sizeof((posix_spawn_file_actions_t){0}.fa_ops[0]))
+
+int __faexpand(posix_spawn_file_actions_t *, size_t);
diff --git a/src/process/posix_spawn.c b/src/process/posix_spawn.c
index 728551b3..eab09191 100644
--- a/src/process/posix_spawn.c
+++ b/src/process/posix_spawn.c
@@ -81,33 +81,53 @@ static int child(void *args_vp)
 		    (ret=__syscall(SYS_setuid, __syscall(SYS_getuid))) )
 			goto fail;
 
-	if (fa && fa->__actions) {
-		struct fdop *op;
-		int fd;
-		for (op = fa->__actions; op->next; op = op->next);
-		for (; op; op = op->prev) {
+	if (fa) {
+		const posix_spawn_file_actions_t *first = fa;
+		const char *path;
+		int fd, srcfd, i = 0;
+
+		/* The list of actions, in reverse order, looks like this:
+		 * [0]->[1]<-[2]<-[3]
+		 *        \_______/^
+		 * so to get to the first action we to go to [0]->next->next
+		 * and to get all the way back we have to special case the
+		 * transition from [1]->[0] */
+		fa = fa->fa_chain;
+		fa = fa->fa_chain;
+		while (!(fa == first && i == fa->fa_cnt)) {
+			if (i == fa->fa_cnt) {
+				path = (char *)(fa+1);
+				fa = fa==first->fa_chain ? first : fa->fa_chain;
+				i = 0;
+				continue;
+			}
+			fd = fa->fa_ops[i++];
+
 			/* It's possible that a file operation would clobber
 			 * the pipe fd used for synchronizing with the
 			 * parent. To avoid that, we dup the pipe onto
 			 * an unoccupied fd. */
-			if (op->fd == p) {
+			if (fd == p) {
 				ret = __syscall(SYS_dup, p);
 				if (ret < 0) goto fail;
 				__syscall(SYS_close, p);
 				p = ret;
 			}
-			switch(op->cmd) {
-			case FDOP_CLOSE:
-				__syscall(SYS_close, op->fd);
-				break;
-			case FDOP_DUP2:
-				fd = op->srcfd;
-				if (fd == p) {
+
+			if (fd < 0) { /* FDOP_CLOSE */
+				if (-fd-1 == p) {
+					ret = -EBADF;
+					goto fail;
+				}
+				__syscall(SYS_close, -fd-1);
+			} else switch (-(srcfd = fa->fa_ops[i++])) {
+			default: /* FDOP_DUP2 */
+				if (srcfd == p) {
 					ret = -EBADF;
 					goto fail;
 				}
-				if (fd != op->fd) {
-					if ((ret=__sys_dup2(fd, op->fd))<0)
+				if (srcfd != fd) {
+					if ((ret=__sys_dup2(srcfd, fd))<0)
 						goto fail;
 				} else {
 					ret = __syscall(SYS_fcntl, fd, F_GETFD);
@@ -118,20 +138,22 @@ static int child(void *args_vp)
 				}
 				break;
 			case FDOP_OPEN:
-				fd = __sys_open(op->path, op->oflag, op->mode);
-				if ((ret=fd) < 0) goto fail;
-				if (fd != op->fd) {
-					if ((ret=__sys_dup2(fd, op->fd))<0)
+				srcfd = __sys_open(path, fa->fa_ops[i],
+				                   fa->fa_ops[i+1]);
+				if ((ret=srcfd) < 0) goto fail;
+				if (srcfd != fd) {
+					if ((ret=__sys_dup2(srcfd, fd))<0)
 						goto fail;
-					__syscall(SYS_close, fd);
+					__syscall(SYS_close, srcfd);
 				}
+				i += 2;
 				break;
 			case FDOP_CHDIR:
-				ret = __syscall(SYS_chdir, op->path);
+				ret = __syscall(SYS_chdir, path);
 				if (ret<0) goto fail;
 				break;
 			case FDOP_FCHDIR:
-				ret = __syscall(SYS_fchdir, op->fd);
+				ret = __syscall(SYS_fchdir, fd);
 				if (ret<0) goto fail;
 				break;
 			}
diff --git a/src/process/posix_spawn_file_actions_addchdir.c b/src/process/posix_spawn_file_actions_addchdir.c
index 7f2590ae..636bfba7 100644
--- a/src/process/posix_spawn_file_actions_addchdir.c
+++ b/src/process/posix_spawn_file_actions_addchdir.c
@@ -1,18 +1,15 @@
 #include <spawn.h>
-#include <stdlib.h>
 #include <string.h>
 #include <errno.h>
+#include <limits.h>
 #include "fdop.h"
 
 int posix_spawn_file_actions_addchdir_np(posix_spawn_file_actions_t *restrict fa, const char *restrict path)
 {
-	struct fdop *op = malloc(sizeof *op + strlen(path) + 1);
-	if (!op) return ENOMEM;
-	op->cmd = FDOP_CHDIR;
-	op->fd = -1;
-	strcpy(op->path, path);
-	if ((op->next = fa->__actions)) op->next->prev = op;
-	op->prev = 0;
-	fa->__actions = op;
+	size_t len = strlen(path);
+	if (__faexpand(fa, len + 1) != 0) return ENOMEM;
+	fa->fa_ops[fa->fa_cnt++] = INT_MAX;
+	fa->fa_ops[fa->fa_cnt++] = -FDOP_CHDIR;
+	memcpy((posix_spawn_file_actions_t *)fa->fa_chain + 1, path, len + 1);
 	return 0;
 }
diff --git a/src/process/posix_spawn_file_actions_addclose.c b/src/process/posix_spawn_file_actions_addclose.c
index 0c2ef8fa..5f496c06 100644
--- a/src/process/posix_spawn_file_actions_addclose.c
+++ b/src/process/posix_spawn_file_actions_addclose.c
@@ -1,17 +1,12 @@
 #include <spawn.h>
-#include <stdlib.h>
 #include <errno.h>
 #include "fdop.h"
 
 int posix_spawn_file_actions_addclose(posix_spawn_file_actions_t *fa, int fd)
 {
 	if (fd < 0) return EBADF;
-	struct fdop *op = malloc(sizeof *op);
-	if (!op) return ENOMEM;
-	op->cmd = FDOP_CLOSE;
-	op->fd = fd;
-	if ((op->next = fa->__actions)) op->next->prev = op;
-	op->prev = 0;
-	fa->__actions = op;
+	if (FA_CNT_MAX - fa->fa_cnt < 1)
+		if (__faexpand(fa, 0) != 0) return ENOMEM;
+	fa->fa_ops[fa->fa_cnt++] = -fd-1;
 	return 0;
 }
diff --git a/src/process/posix_spawn_file_actions_adddup2.c b/src/process/posix_spawn_file_actions_adddup2.c
index addca4d4..d83a99b0 100644
--- a/src/process/posix_spawn_file_actions_adddup2.c
+++ b/src/process/posix_spawn_file_actions_adddup2.c
@@ -1,18 +1,13 @@
 #include <spawn.h>
-#include <stdlib.h>
 #include <errno.h>
 #include "fdop.h"
 
 int posix_spawn_file_actions_adddup2(posix_spawn_file_actions_t *fa, int srcfd, int fd)
 {
 	if (srcfd < 0 || fd < 0) return EBADF;
-	struct fdop *op = malloc(sizeof *op);
-	if (!op) return ENOMEM;
-	op->cmd = FDOP_DUP2;
-	op->srcfd = srcfd;
-	op->fd = fd;
-	if ((op->next = fa->__actions)) op->next->prev = op;
-	op->prev = 0;
-	fa->__actions = op;
+	if (FA_CNT_MAX - fa->fa_cnt < 2)
+		if (__faexpand(fa, 0) != 0) return ENOMEM;
+	fa->fa_ops[fa->fa_cnt++] = fd;
+	fa->fa_ops[fa->fa_cnt++] = srcfd;
 	return 0;
 }
diff --git a/src/process/posix_spawn_file_actions_addfchdir.c b/src/process/posix_spawn_file_actions_addfchdir.c
index e89ede8c..fe248a5d 100644
--- a/src/process/posix_spawn_file_actions_addfchdir.c
+++ b/src/process/posix_spawn_file_actions_addfchdir.c
@@ -1,18 +1,13 @@
 #include <spawn.h>
-#include <stdlib.h>
-#include <string.h>
 #include <errno.h>
 #include "fdop.h"
 
 int posix_spawn_file_actions_addfchdir_np(posix_spawn_file_actions_t *fa, int fd)
 {
 	if (fd < 0) return EBADF;
-	struct fdop *op = malloc(sizeof *op);
-	if (!op) return ENOMEM;
-	op->cmd = FDOP_FCHDIR;
-	op->fd = fd;
-	if ((op->next = fa->__actions)) op->next->prev = op;
-	op->prev = 0;
-	fa->__actions = op;
+	if (FA_CNT_MAX - fa->fa_cnt < 2)
+		if (__faexpand(fa, 0) != 0) return ENOMEM;
+	fa->fa_ops[fa->fa_cnt++] = fd;
+	fa->fa_ops[fa->fa_cnt++] = -FDOP_FCHDIR;
 	return 0;
 }
diff --git a/src/process/posix_spawn_file_actions_addopen.c b/src/process/posix_spawn_file_actions_addopen.c
index 82bbcec9..8840dbd6 100644
--- a/src/process/posix_spawn_file_actions_addopen.c
+++ b/src/process/posix_spawn_file_actions_addopen.c
@@ -1,21 +1,18 @@
 #include <spawn.h>
-#include <stdlib.h>
 #include <string.h>
 #include <errno.h>
 #include "fdop.h"
 
 int posix_spawn_file_actions_addopen(posix_spawn_file_actions_t *restrict fa, int fd, const char *restrict path, int flags, mode_t mode)
 {
+	size_t len;
 	if (fd < 0) return EBADF;
-	struct fdop *op = malloc(sizeof *op + strlen(path) + 1);
-	if (!op) return ENOMEM;
-	op->cmd = FDOP_OPEN;
-	op->fd = fd;
-	op->oflag = flags;
-	op->mode = mode;
-	strcpy(op->path, path);
-	if ((op->next = fa->__actions)) op->next->prev = op;
-	op->prev = 0;
-	fa->__actions = op;
+	len = strlen(path);
+	if (__faexpand(fa, len + 1) != 0) return ENOMEM;
+	fa->fa_ops[fa->fa_cnt++] = fd;
+	fa->fa_ops[fa->fa_cnt++] = -FDOP_OPEN;
+	fa->fa_ops[fa->fa_cnt++] = flags;
+	fa->fa_ops[fa->fa_cnt++] = mode;
+	memcpy((posix_spawn_file_actions_t *)fa->fa_chain + 1, path, len + 1);
 	return 0;
 }
diff --git a/src/process/posix_spawn_file_actions_destroy.c b/src/process/posix_spawn_file_actions_destroy.c
index 3251babb..0ac48480 100644
--- a/src/process/posix_spawn_file_actions_destroy.c
+++ b/src/process/posix_spawn_file_actions_destroy.c
@@ -4,11 +4,12 @@
 
 int posix_spawn_file_actions_destroy(posix_spawn_file_actions_t *fa)
 {
-	struct fdop *op = fa->__actions, *next;
-	while (op) {
-		next = op->next;
-		free(op);
-		op = next;
+	posix_spawn_file_actions_t *next, *p = fa->fa_chain;
+	for (p = p->fa_chain; p != fa->fa_chain; p = next) {
+		next = p->fa_chain;
+		free(p);
 	}
+	if (p != fa)
+		free(p);
 	return 0;
 }
diff --git a/src/process/posix_spawn_file_actions_init.c b/src/process/posix_spawn_file_actions_init.c
index 89d5e127..a26d6f41 100644
--- a/src/process/posix_spawn_file_actions_init.c
+++ b/src/process/posix_spawn_file_actions_init.c
@@ -1,7 +1,9 @@
 #include <spawn.h>
+#include "fdop.h"
 
 int posix_spawn_file_actions_init(posix_spawn_file_actions_t *fa)
 {
-	fa->__actions = 0;
+	fa->fa_cnt = 0;
+	fa->fa_chain = fa;
 	return 0;
 }
-- 
2.26.3


[-- Attachment #3: test.c --]
[-- Type: text/plain, Size: 1132 bytes --]

#include <spawn.h>
#include <fcntl.h>
#include <stdio.h>
#include <errno.h>

#define A(x) \
    if ((errno = (x)) != 0) \
        perror(#x)
int
main(int argc, char *argv[])
{
    posix_spawn_file_actions_t fa;
    pid_t pid;

    posix_spawn_file_actions_init(&fa);
    A(posix_spawn_file_actions_addclose(&fa, 0));
    A(posix_spawn_file_actions_adddup2(&fa, 2, 3));
    A(posix_spawn_file_actions_addopen(&fa, 5, "test.c", O_RDONLY, 0));
    A(posix_spawn_file_actions_addopen(&fa, 4, "/usr", O_DIRECTORY, 0));
    A(posix_spawn_file_actions_addchdir_np(&fa, "/"));
    A(posix_spawn_file_actions_addfchdir_np(&fa, 4));
    A(posix_spawn_file_actions_addclose(&fa, 5));
    A(posix_spawn_file_actions_adddup2(&fa, 2, 10));
    A(posix_spawn_file_actions_adddup2(&fa, 2, 11));
    A(posix_spawn_file_actions_adddup2(&fa, 2, 12));
    A(posix_spawn_file_actions_adddup2(&fa, 2, 13));
    A(posix_spawn_file_actions_adddup2(&fa, 2, 14));
    A(posix_spawn_file_actions_adddup2(&fa, 2, 15));
    posix_spawn(&pid, "/bin/false", &fa, 0, (char*[]){"false", 0}, (char*[]){0});
    posix_spawn_file_actions_destroy(&fa);
    return 0;
}

      reply	other threads:[~2022-05-06 17:07 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-29 13:43 Joakim Sindholt
2022-02-04 19:09 ` Joakim Sindholt
2022-05-04 20:10   ` Rich Felker
2022-05-06 17:07     ` Joakim Sindholt [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220506190716.0d0483778c4bf15daddc22a3@zhasha.com \
    --to=opensource@zhasha.com \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).