* [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. @ 2022-06-25 3:40 Nick Peng 2022-06-25 12:51 ` Szabolcs Nagy 2022-06-25 12:56 ` Alex Xu (Hello71) 0 siblings, 2 replies; 13+ messages in thread From: Nick Peng @ 2022-06-25 3:40 UTC (permalink / raw) To: musl [-- Attachment #1: Type: text/plain, Size: 2603 bytes --] Description: After vfork, calling functions such as readdir/dirfd may cause deadlock. GNU C is OK. Also tested on x86-64 with musl, no deadlock, but seems never exit, slower than GNU C. Version: latest, musl-1.2.3 OS: debian bullseye 64bit OS. and asus router CPU: raspberrypi aarch64, mips32 Reproduce Code: #include <dirent.h> #include <pthread.h> #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <sys/types.h> #include <sys/wait.h> #include <unistd.h> #include <string.h> pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; pthread_cond_t cond = PTHREAD_COND_INITIALIZER; struct tlog_log *logs = NULL; int do_exit = 0; void *test(void *arg) { int i = 0; for (i = 0; i < 10000000; i++) { char *b = malloc(4096); memset(b, 0, 4096); free(b); } do_exit = 1; return NULL; } void lockfunc() { char path_name[4096]; DIR *dir = NULL; struct dirent *ent; snprintf(path_name, sizeof(path_name), "/proc/self/fd/"); dir = opendir(path_name); if (dir == NULL) { goto errout; } while ((ent = readdir(dir)) != NULL) { } closedir(dir); return; errout: if (dir) { closedir(dir); } return; } void *test_fork(void *arg) { int count = 0; while (do_exit == 0) { printf("test fork count %d\n", count++); int pid = vfork(); if (pid < 0) { return NULL; } else if (pid == 0) { lockfunc(); _exit(0); } int status; waitpid(pid, &status, 0); } return NULL; } int main(int argc, char *argv[]) { pthread_attr_t attr; pthread_t threads[10]; pthread_t fork_test; int i; int ret; pthread_attr_init(&attr); ret = pthread_create(&fork_test, &attr, test_fork, NULL); for (i = 0; i < 10; i++) { ret = pthread_create(&threads[i], &attr, test, NULL); if (ret != 0) { return 1; } } for (i = 0; i < 10; i++) { void *retval = NULL; pthread_join(threads[i], &retval); } void *retval = NULL; pthread_join(fork_test, &retval); printf("exit\n"); getchar(); return 0; } Log: pi@raspberrypi:~/code/tinylog/test $ ./test test fork count 0 test fork count 1 <-- lock here ^C gdb backtrace: x0000000000409524 in __lock () (gdb) bt #0 0x0000000000409524 in __lock () #1 0x0000000000406278 in __libc_malloc_impl () [-- Attachment #2: Type: text/html, Size: 3318 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. 2022-06-25 3:40 [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock Nick Peng @ 2022-06-25 12:51 ` Szabolcs Nagy 2022-06-27 4:05 ` Nick Peng 2022-06-27 7:42 ` Florian Weimer 2022-06-25 12:56 ` Alex Xu (Hello71) 1 sibling, 2 replies; 13+ messages in thread From: Szabolcs Nagy @ 2022-06-25 12:51 UTC (permalink / raw) To: Nick Peng; +Cc: musl * Nick Peng <pymumu@gmail.com> [2022-06-25 11:40:17 +0800]: > Description: After vfork, calling functions such as readdir/dirfd may > cause deadlock. GNU C is OK. why do you think "GNU C is OK"? is this from some real software? opendir after vfork is documented to be invalid in glibc: https://www.gnu.org/software/libc/manual/html_mono/libc.html#Low_002dlevel-Directory-Access the standard is actually much stricter than the glibc manual: posix conforming code must not call any libc api after vfork other than _exit or the exec* familiy of functions. (as-safe is not enough, but opendir is not even as-safe) since the example is multi-threaded even using fork would be invalid, but i think both musl and glibc makes that work (as an extension to the standard). > Also tested on x86-64 with musl, no deadlock, but > seems never exit, slower than GNU C. > Version: latest, musl-1.2.3 > OS: debian bullseye 64bit OS. and asus router > CPU: raspberrypi aarch64, mips32 > Reproduce Code: > > #include <dirent.h> > #include <pthread.h> > #include <stdint.h> > #include <stdio.h> > #include <stdlib.h> > #include <sys/types.h> > #include <sys/wait.h> > #include <unistd.h> > #include <string.h> > > pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; > pthread_cond_t cond = PTHREAD_COND_INITIALIZER; > > struct tlog_log *logs = NULL; > int do_exit = 0; > > void *test(void *arg) > { > int i = 0; > > for (i = 0; i < 10000000; i++) { > char *b = malloc(4096); > memset(b, 0, 4096); > free(b); > } > do_exit = 1; > return NULL; > } > > void lockfunc() > { > char path_name[4096]; > DIR *dir = NULL; > struct dirent *ent; > > snprintf(path_name, sizeof(path_name), "/proc/self/fd/"); > dir = opendir(path_name); > if (dir == NULL) { > goto errout; > } > > while ((ent = readdir(dir)) != NULL) { > } > > closedir(dir); > > return; > errout: > if (dir) { > closedir(dir); > } > > return; > } > > void *test_fork(void *arg) > { > int count = 0; > while (do_exit == 0) { > printf("test fork count %d\n", count++); > int pid = vfork(); > if (pid < 0) { > return NULL; > } else if (pid == 0) { > lockfunc(); > _exit(0); > } > > int status; > waitpid(pid, &status, 0); > } > > return NULL; > } > > int main(int argc, char *argv[]) > { > pthread_attr_t attr; > pthread_t threads[10]; > pthread_t fork_test; > int i; > int ret; > > pthread_attr_init(&attr); > > ret = pthread_create(&fork_test, &attr, test_fork, NULL); > > for (i = 0; i < 10; i++) { > ret = pthread_create(&threads[i], &attr, test, NULL); > if (ret != 0) { > return 1; > } > } > > for (i = 0; i < 10; i++) { > void *retval = NULL; > pthread_join(threads[i], &retval); > } > > void *retval = NULL; > pthread_join(fork_test, &retval); > printf("exit\n"); > getchar(); > return 0; > } > > > Log: > pi@raspberrypi:~/code/tinylog/test $ ./test > test fork count 0 > test fork count 1 <-- lock here > ^C > > gdb backtrace: > x0000000000409524 in __lock () > (gdb) bt > #0 0x0000000000409524 in __lock () > #1 0x0000000000406278 in __libc_malloc_impl () ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. 2022-06-25 12:51 ` Szabolcs Nagy @ 2022-06-27 4:05 ` Nick Peng 2022-06-27 10:38 ` Laurent Bercot ` (2 more replies) 2022-06-27 7:42 ` Florian Weimer 1 sibling, 3 replies; 13+ messages in thread From: Nick Peng @ 2022-06-27 4:05 UTC (permalink / raw) To: Nick Peng, musl [-- Attachment #1: Type: text/plain, Size: 5440 bytes --] The feature I want to achieve is to close all file handles after fork to avoid file handles being inherited to child processes. We have a component which is used in a process that uses too much memory (about 50BG) and uses too many file handles(about 100,000 files). In order to avoid file handles inheriting from child processes, we implemented a API like system,close all file handles after vfork. We can't close all file handles by iterating over all file handles number, because the maximum number of files limit is set very large, the performance of this method is too poor. And we can't use fork, fork will fail because of vm.overcommit_memory. In order to optimize this, the method we used is to read the file handes in /proc/self/fd directory and close the file handles one by one. The component is based on glibc, currently this program runs for about 10 years without deadlock. Recently, this component is used in musl-based embedded systems, and it hangs often. After reading the vfork manual and the guidance you gave, i understand calling readdir after vfork is problematic. and I also learned that I can call getent64 to achieve this but what I want to say is that, based on glibc, after calling readdir after vfork, there is no deadlock problem at present. more information: I googled and found this(https://lwn.net/Articles/789023/), probably the best solution, but it seems that the required kernel version is too new for my program to work with. However, I think my problem is solved anyway, thank you all. On Sun, Jun 26, 2022 at 2:49 AM Szabolcs Nagy <nsz@port70.net> wrote: > * Nick Peng <pymumu@gmail.com> [2022-06-25 11:40:17 +0800]: > > Description: After vfork, calling functions such as readdir/dirfd may > > cause deadlock. GNU C is OK. > > why do you think "GNU C is OK"? is this from some real software? > > opendir after vfork is documented to be invalid in glibc: > > https://www.gnu.org/software/libc/manual/html_mono/libc.html#Low_002dlevel-Directory-Access > > the standard is actually much stricter than the glibc manual: > posix conforming code must not call any libc api after vfork > other than _exit or the exec* familiy of functions. > (as-safe is not enough, but opendir is not even as-safe) > > since the example is multi-threaded even using fork would > be invalid, but i think both musl and glibc makes that work > (as an extension to the standard). > > > > Also tested on x86-64 with musl, no deadlock, but > > seems never exit, slower than GNU C. > > Version: latest, musl-1.2.3 > > OS: debian bullseye 64bit OS. and asus router > > CPU: raspberrypi aarch64, mips32 > > Reproduce Code: > > > > #include <dirent.h> > > #include <pthread.h> > > #include <stdint.h> > > #include <stdio.h> > > #include <stdlib.h> > > #include <sys/types.h> > > #include <sys/wait.h> > > #include <unistd.h> > > #include <string.h> > > > > pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; > > pthread_cond_t cond = PTHREAD_COND_INITIALIZER; > > > > struct tlog_log *logs = NULL; > > int do_exit = 0; > > > > void *test(void *arg) > > { > > int i = 0; > > > > for (i = 0; i < 10000000; i++) { > > char *b = malloc(4096); > > memset(b, 0, 4096); > > free(b); > > } > > do_exit = 1; > > return NULL; > > } > > > > void lockfunc() > > { > > char path_name[4096]; > > DIR *dir = NULL; > > struct dirent *ent; > > > > snprintf(path_name, sizeof(path_name), "/proc/self/fd/"); > > dir = opendir(path_name); > > if (dir == NULL) { > > goto errout; > > } > > > > while ((ent = readdir(dir)) != NULL) { > > } > > > > closedir(dir); > > > > return; > > errout: > > if (dir) { > > closedir(dir); > > } > > > > return; > > } > > > > void *test_fork(void *arg) > > { > > int count = 0; > > while (do_exit == 0) { > > printf("test fork count %d\n", count++); > > int pid = vfork(); > > if (pid < 0) { > > return NULL; > > } else if (pid == 0) { > > lockfunc(); > > _exit(0); > > } > > > > int status; > > waitpid(pid, &status, 0); > > } > > > > return NULL; > > } > > > > int main(int argc, char *argv[]) > > { > > pthread_attr_t attr; > > pthread_t threads[10]; > > pthread_t fork_test; > > int i; > > int ret; > > > > pthread_attr_init(&attr); > > > > ret = pthread_create(&fork_test, &attr, test_fork, NULL); > > > > for (i = 0; i < 10; i++) { > > ret = pthread_create(&threads[i], &attr, test, NULL); > > if (ret != 0) { > > return 1; > > } > > } > > > > for (i = 0; i < 10; i++) { > > void *retval = NULL; > > pthread_join(threads[i], &retval); > > } > > > > void *retval = NULL; > > pthread_join(fork_test, &retval); > > printf("exit\n"); > > getchar(); > > return 0; > > } > > > > > > Log: > > pi@raspberrypi:~/code/tinylog/test $ ./test > > test fork count 0 > > test fork count 1 <-- lock here > > ^C > > > > gdb backtrace: > > x0000000000409524 in __lock () > > (gdb) bt > > #0 0x0000000000409524 in __lock () > > #1 0x0000000000406278 in __libc_malloc_impl () > [-- Attachment #2: Type: text/html, Size: 7191 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. 2022-06-27 4:05 ` Nick Peng @ 2022-06-27 10:38 ` Laurent Bercot 2022-06-27 14:21 ` Rich Felker 2022-06-27 14:29 ` Markus Wichmann 2 siblings, 0 replies; 13+ messages in thread From: Laurent Bercot @ 2022-06-27 10:38 UTC (permalink / raw) To: musl, Nick Peng >We can't close all file handles by iterating over all file handles >number, because the maximum number of files limit is set very large, >the performance of this method is too poor. >And we can't use fork, fork will fail because of vm.overcommit_memory. >In order to optimize this, the method we used is to read the file >handes in /proc/self/fd directory and close the file handles one by >one. This may be a rare instance where close_range(2) is useful, if you don't mind being restricted to Linux (>= 5.9) and FreeBSD. https://man7.org/linux/man-pages/man2/close_range.2.html -- Laurent ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. 2022-06-27 4:05 ` Nick Peng 2022-06-27 10:38 ` Laurent Bercot @ 2022-06-27 14:21 ` Rich Felker 2022-06-27 14:29 ` Markus Wichmann 2 siblings, 0 replies; 13+ messages in thread From: Rich Felker @ 2022-06-27 14:21 UTC (permalink / raw) To: Nick Peng; +Cc: musl On Mon, Jun 27, 2022 at 12:05:41PM +0800, Nick Peng wrote: > The feature I want to achieve is to close all file handles after fork to > avoid file handles being inherited to child processes. That's what opening them O_CLOEXEC is for. Closing *all* file descriptors is a bug and will break correct usage like service supervisors letting a whole process tree inherit a pipe fd so that they can observe the extent of the lifetime of the whole tree. > We have a component which is used in a process that uses too much memory > (about 50BG) and uses too many file handles(about 100,000 files). > In order to avoid file handles inheriting from child processes, we > implemented a API like system,close all file handles after vfork. > We can't close all file handles by iterating over all file handles number, > because the maximum number of files limit is set very large, the > performance of this method is too poor. > And we can't use fork, fork will fail because of vm.overcommit_memory. > In order to optimize this, the method we used is to read the file handes in > /proc/self/fd directory and close the file handles one by one. There's another way that's efficient to do this, though again it's a bad pattern that breaks correct usage as described above: with poll() you can query a very large set of file descriptors without requesting any status bits, and get POLLNVAL for all the ones that are not valid. This lets you enumerate thousands per syscall rather than just one at a time. And it does not make use of any operations that need to allocate memory or other resources. > The component is based on glibc, currently this program runs for about 10 > years without deadlock. > Recently, this component is used in musl-based embedded systems, and it > hangs often. > > After reading the vfork manual and the guidance you gave, i understand > calling readdir after vfork is problematic. and I also learned that I can > call getent64 to achieve this > but what I want to say is that, based on glibc, after calling readdir after > vfork, there is no deadlock problem at present. musl follows the specification for vfork that *nothing* except execve or _exit is supported after vfork. If something works, you got lucky (or unlucky, in the sense that it might actually blow up later on conditions you didn't anticipate, or on ugrade). For the most part, vfork should not be used. In practice, the above-mentioned poll method will poll will probably work, but it is not supported. What you're doing really calls for posix_spawn, and, if you need further actions that it can't represent "between the fork and exec", use of a helper program that performs those actions then execs the final program image. This will give you the advantages of vfork without the unsafety of it. > more information: I googled and found this(https://lwn.net/Articles/789023/), > probably the best solution, but it seems that the required kernel version > is too new for my program to work with. > > However, I think my problem is solved anyway, thank you all. > > On Sun, Jun 26, 2022 at 2:49 AM Szabolcs Nagy <nsz@port70.net> wrote: > > > * Nick Peng <pymumu@gmail.com> [2022-06-25 11:40:17 +0800]: > > > Description: After vfork, calling functions such as readdir/dirfd may > > > cause deadlock. GNU C is OK. > > > > why do you think "GNU C is OK"? is this from some real software? > > > > opendir after vfork is documented to be invalid in glibc: > > > > https://www.gnu.org/software/libc/manual/html_mono/libc.html#Low_002dlevel-Directory-Access > > > > the standard is actually much stricter than the glibc manual: > > posix conforming code must not call any libc api after vfork > > other than _exit or the exec* familiy of functions. > > (as-safe is not enough, but opendir is not even as-safe) > > > > since the example is multi-threaded even using fork would > > be invalid, but i think both musl and glibc makes that work > > (as an extension to the standard). > > > > > > > Also tested on x86-64 with musl, no deadlock, but > > > seems never exit, slower than GNU C. > > > Version: latest, musl-1.2.3 > > > OS: debian bullseye 64bit OS. and asus router > > > CPU: raspberrypi aarch64, mips32 > > > Reproduce Code: > > > > > > #include <dirent.h> > > > #include <pthread.h> > > > #include <stdint.h> > > > #include <stdio.h> > > > #include <stdlib.h> > > > #include <sys/types.h> > > > #include <sys/wait.h> > > > #include <unistd.h> > > > #include <string.h> > > > > > > pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; > > > pthread_cond_t cond = PTHREAD_COND_INITIALIZER; > > > > > > struct tlog_log *logs = NULL; > > > int do_exit = 0; > > > > > > void *test(void *arg) > > > { > > > int i = 0; > > > > > > for (i = 0; i < 10000000; i++) { > > > char *b = malloc(4096); > > > memset(b, 0, 4096); > > > free(b); > > > } > > > do_exit = 1; > > > return NULL; > > > } > > > > > > void lockfunc() > > > { > > > char path_name[4096]; > > > DIR *dir = NULL; > > > struct dirent *ent; > > > > > > snprintf(path_name, sizeof(path_name), "/proc/self/fd/"); > > > dir = opendir(path_name); > > > if (dir == NULL) { > > > goto errout; > > > } > > > > > > while ((ent = readdir(dir)) != NULL) { > > > } > > > > > > closedir(dir); > > > > > > return; > > > errout: > > > if (dir) { > > > closedir(dir); > > > } > > > > > > return; > > > } > > > > > > void *test_fork(void *arg) > > > { > > > int count = 0; > > > while (do_exit == 0) { > > > printf("test fork count %d\n", count++); > > > int pid = vfork(); > > > if (pid < 0) { > > > return NULL; > > > } else if (pid == 0) { > > > lockfunc(); > > > _exit(0); > > > } > > > > > > int status; > > > waitpid(pid, &status, 0); > > > } > > > > > > return NULL; > > > } > > > > > > int main(int argc, char *argv[]) > > > { > > > pthread_attr_t attr; > > > pthread_t threads[10]; > > > pthread_t fork_test; > > > int i; > > > int ret; > > > > > > pthread_attr_init(&attr); > > > > > > ret = pthread_create(&fork_test, &attr, test_fork, NULL); > > > > > > for (i = 0; i < 10; i++) { > > > ret = pthread_create(&threads[i], &attr, test, NULL); > > > if (ret != 0) { > > > return 1; > > > } > > > } > > > > > > for (i = 0; i < 10; i++) { > > > void *retval = NULL; > > > pthread_join(threads[i], &retval); > > > } > > > > > > void *retval = NULL; > > > pthread_join(fork_test, &retval); > > > printf("exit\n"); > > > getchar(); > > > return 0; > > > } > > > > > > > > > Log: > > > pi@raspberrypi:~/code/tinylog/test $ ./test > > > test fork count 0 > > > test fork count 1 <-- lock here > > > ^C > > > > > > gdb backtrace: > > > x0000000000409524 in __lock () > > > (gdb) bt > > > #0 0x0000000000409524 in __lock () > > > #1 0x0000000000406278 in __libc_malloc_impl () > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. 2022-06-27 4:05 ` Nick Peng 2022-06-27 10:38 ` Laurent Bercot 2022-06-27 14:21 ` Rich Felker @ 2022-06-27 14:29 ` Markus Wichmann 2 siblings, 0 replies; 13+ messages in thread From: Markus Wichmann @ 2022-06-27 14:29 UTC (permalink / raw) To: musl; +Cc: Nick Peng On Mon, Jun 27, 2022 at 12:05:41PM +0800, Nick Peng wrote: > The feature I want to achieve is to close all file handles after fork to > avoid file handles being inherited to child processes. > The normal solution to that problem is FD_CLOEXEC. I have started just opening all FDs with FD_CLOEXEC set. If I want to bestow it upon a child process, I can always dup() it (which clears FD_CLOEXEC from the destination FD). > In order to avoid file handles inheriting from child processes, we > implemented a API like system,close all file handles after vfork. That wouldn't be conforming to the vfork() spec either, since close() is neither _exit() nor in the exec* family. > The component is based on glibc, currently this program runs for about 10 > years without deadlock. > Recently, this component is used in musl-based embedded systems, and it > hangs often. > Congratulations, you found a way to make undefined behavior break your setup only a decade after its inception. Such is the nature of the beast. Sometimes it will work for you. Until something changes, that may be out of your control. Ciao, Markus ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. 2022-06-25 12:51 ` Szabolcs Nagy 2022-06-27 4:05 ` Nick Peng @ 2022-06-27 7:42 ` Florian Weimer 2022-06-27 8:37 ` Szabolcs Nagy 1 sibling, 1 reply; 13+ messages in thread From: Florian Weimer @ 2022-06-27 7:42 UTC (permalink / raw) To: Nick Peng; +Cc: musl * Szabolcs Nagy: > * Nick Peng <pymumu@gmail.com> [2022-06-25 11:40:17 +0800]: >> Description: After vfork, calling functions such as readdir/dirfd may >> cause deadlock. GNU C is OK. > > why do you think "GNU C is OK"? is this from some real software? glibc supports opendir/readdir/closedir after vfork as an extension. The JVM depends on it. Of course it's undefined, but I really see no way around keeping support for it for the time being. Thanks, Florian ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. 2022-06-27 7:42 ` Florian Weimer @ 2022-06-27 8:37 ` Szabolcs Nagy 2022-06-27 9:23 ` Florian Weimer 0 siblings, 1 reply; 13+ messages in thread From: Szabolcs Nagy @ 2022-06-27 8:37 UTC (permalink / raw) To: Florian Weimer; +Cc: Nick Peng, musl * Florian Weimer <fweimer@redhat.com> [2022-06-27 09:42:57 +0200]: > * Szabolcs Nagy: > > > * Nick Peng <pymumu@gmail.com> [2022-06-25 11:40:17 +0800]: > >> Description: After vfork, calling functions such as readdir/dirfd may > >> cause deadlock. GNU C is OK. > > > > why do you think "GNU C is OK"? is this from some real software? > > glibc supports opendir/readdir/closedir after vfork as an extension. > The JVM depends on it. how does that work? i think glibc just calls vfork syscall (or clone(CLONE_VM|CLONE_VFORK)) from asm and opendir allocates. so i'd expect a deadlock where the parent waits for the child to exec while holding the malloc lock. > > Of course it's undefined, but I really see no way around keeping support > for it for the time being. > > Thanks, > Florian ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. 2022-06-27 8:37 ` Szabolcs Nagy @ 2022-06-27 9:23 ` Florian Weimer 2022-06-27 17:33 ` Szabolcs Nagy 0 siblings, 1 reply; 13+ messages in thread From: Florian Weimer @ 2022-06-27 9:23 UTC (permalink / raw) To: Nick Peng; +Cc: musl * Szabolcs Nagy: > * Florian Weimer <fweimer@redhat.com> [2022-06-27 09:42:57 +0200]: >> * Szabolcs Nagy: >> >> > * Nick Peng <pymumu@gmail.com> [2022-06-25 11:40:17 +0800]: >> >> Description: After vfork, calling functions such as readdir/dirfd may >> >> cause deadlock. GNU C is OK. >> > >> > why do you think "GNU C is OK"? is this from some real software? >> >> glibc supports opendir/readdir/closedir after vfork as an extension. >> The JVM depends on it. > > how does that work? i think glibc just calls vfork syscall (or > clone(CLONE_VM|CLONE_VFORK)) from asm and opendir allocates. > so i'd expect a deadlock where the parent waits for the child > to exec while holding the malloc lock. vfork stops the thread in the parent and uses its resources. It is the same userspace thread (with the same TCB), only the kernel TID is wrong. glibc's malloc-internal locks do not rely on the TID, so there is no incrased risk of deadlock. The malloc locks are internal, so user code cannot call vfork while they are locked. If another thread has locked them at the point of vfork, that thread will eventually unlock them, unblocking the vfork'ed subprocess. This relies on the shared address space of vfork. Without the shared address space, none of this would work, and for fork, we have complicated code to manage glibc-internal locks (including the malloc locks). Thanks, Florian ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. 2022-06-27 9:23 ` Florian Weimer @ 2022-06-27 17:33 ` Szabolcs Nagy 0 siblings, 0 replies; 13+ messages in thread From: Szabolcs Nagy @ 2022-06-27 17:33 UTC (permalink / raw) To: Florian Weimer; +Cc: Nick Peng, musl * Florian Weimer <fweimer@redhat.com> [2022-06-27 11:23:48 +0200]: > * Szabolcs Nagy: > > > * Florian Weimer <fweimer@redhat.com> [2022-06-27 09:42:57 +0200]: > >> * Szabolcs Nagy: > >> > >> > * Nick Peng <pymumu@gmail.com> [2022-06-25 11:40:17 +0800]: > >> >> Description: After vfork, calling functions such as readdir/dirfd may > >> >> cause deadlock. GNU C is OK. > >> > > >> > why do you think "GNU C is OK"? is this from some real software? > >> > >> glibc supports opendir/readdir/closedir after vfork as an extension. > >> The JVM depends on it. > > > > how does that work? i think glibc just calls vfork syscall (or > > clone(CLONE_VM|CLONE_VFORK)) from asm and opendir allocates. > > so i'd expect a deadlock where the parent waits for the child > > to exec while holding the malloc lock. > > vfork stops the thread in the parent and uses its resources. It is the > same userspace thread (with the same TCB), only the kernel TID is wrong. > glibc's malloc-internal locks do not rely on the TID, so there is no > incrased risk of deadlock. The malloc locks are internal, so user code > cannot call vfork while they are locked. If another thread has locked > them at the point of vfork, that thread will eventually unlock them, > unblocking the vfork'ed subprocess. This relies on the shared address > space of vfork. > > Without the shared address space, none of this would work, and for fork, > we have complicated code to manage glibc-internal locks (including the > malloc locks). ah i thought vfork stops the entire parent process (based on linux man). ok then i think musl should work too in practice.. however on some targets musl just uses a fork syscall instead of vfork. i think that's the reason why this fails on aarch64 and mips. so i think if musl adds src/process/ARCH/vfork.s then it would work. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. 2022-06-25 3:40 [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock Nick Peng 2022-06-25 12:51 ` Szabolcs Nagy @ 2022-06-25 12:56 ` Alex Xu (Hello71) 2022-06-27 7:44 ` Florian Weimer 1 sibling, 1 reply; 13+ messages in thread From: Alex Xu (Hello71) @ 2022-06-25 12:56 UTC (permalink / raw) To: musl Excerpts from Nick Peng's message of June 24, 2022 11:40 pm: > Description: After vfork, calling functions such as readdir/dirfd may > cause deadlock. GNU C is OK. This is not a bug. TFM: DESCRIPTION Standard description (From POSIX.1) The vfork() function has the same effect as fork(2), except that the behavior is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(), or returns from the function in which vfork() was called, or calls any other function before successfully calling _exit(2) or one of the exec(3) family of functions. It may happen to appear to work in some cases on some libcs, or kernels, or architectures, or it may deadlock, corrupt data, or cause demons to fly out your nose. Do not use vfork in this manner. Avoid using fork in this manner either; that may work in musl 1.2.2+, and appear to work in glibc (but sometimes randomly corrupt data), but is not portable to other libcs. Alex. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. 2022-06-25 12:56 ` Alex Xu (Hello71) @ 2022-06-27 7:44 ` Florian Weimer 2022-06-27 12:28 ` Alex Xu (Hello71) 0 siblings, 1 reply; 13+ messages in thread From: Florian Weimer @ 2022-06-27 7:44 UTC (permalink / raw) To: Alex Xu (Hello71); +Cc: musl * Alex Xu: > Excerpts from Nick Peng's message of June 24, 2022 11:40 pm: >> Description: After vfork, calling functions such as readdir/dirfd may >> cause deadlock. GNU C is OK. > > This is not a bug. TFM: > > DESCRIPTION > Standard description > (From POSIX.1) The vfork() function has the same effect as > fork(2), except that the behavior is undefined if the process > created by vfork() either modifies any data other than a variable > of type pid_t used to store the return value from vfork(), or > returns from the function in which vfork() was called, or calls > any other function before successfully calling _exit(2) or one of > the exec(3) family of functions. > > It may happen to appear to work in some cases on some libcs, or kernels, > or architectures, or it may deadlock, corrupt data, or cause demons to > fly out your nose. Do not use vfork in this manner. Avoid using fork in > this manner either; that may work in musl 1.2.2+, and appear to work in > glibc (but sometimes randomly corrupt data), but is not portable to > other libcs. Could you describe the memory corruption you have seen with opendir/readdir on glibc? Thanks, Florian ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock. 2022-06-27 7:44 ` Florian Weimer @ 2022-06-27 12:28 ` Alex Xu (Hello71) 0 siblings, 0 replies; 13+ messages in thread From: Alex Xu (Hello71) @ 2022-06-27 12:28 UTC (permalink / raw) To: Florian Weimer; +Cc: musl Excerpts from Florian Weimer's message of June 27, 2022 3:44 am: > Could you describe the memory corruption you have seen with > opendir/readdir on glibc? > > Thanks, > Florian > > Apologies, I misremembered the glibc fork lock-handling code. It supports dubious application code, but the glibc code itself is not incorrect, or at least not in the way I thought. Cheers, Alex. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2022-06-27 17:33 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-06-25 3:40 [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock Nick Peng 2022-06-25 12:51 ` Szabolcs Nagy 2022-06-27 4:05 ` Nick Peng 2022-06-27 10:38 ` Laurent Bercot 2022-06-27 14:21 ` Rich Felker 2022-06-27 14:29 ` Markus Wichmann 2022-06-27 7:42 ` Florian Weimer 2022-06-27 8:37 ` Szabolcs Nagy 2022-06-27 9:23 ` Florian Weimer 2022-06-27 17:33 ` Szabolcs Nagy 2022-06-25 12:56 ` Alex Xu (Hello71) 2022-06-27 7:44 ` Florian Weimer 2022-06-27 12:28 ` Alex Xu (Hello71)
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/musl/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).