From: Nick Peng
Date: Mon, 27 Jun 2022 12:05:41 +0800
To: Nick Peng, musl@lists.openwall.com
Subject: Re: [musl] BUG: Calling readdir/dirfd after vfork will cause deadlock.

The feature I want to achieve is to close all file descriptors after fork, so that they are not inherited by child processes.

We have a component that runs in a process which uses a very large amount of memory (about 50 GB) and a very large number of file descriptors (about 100,000 open files). To keep those descriptors from being inherited by child processes, we implemented an API similar to system() that closes all file descriptors after vfork. We cannot close them by iterating over every possible descriptor number: the maximum-files limit is set very high, so that approach performs far too poorly. And we cannot use fork, because with our vm.overcommit_memory setting fork fails.

To make this efficient, the method we use is to read the entries of the /proc/self/fd directory and close the listed file descriptors one by one.
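
For reference, this is roughly what a vfork-tolerant version of that loop looks like. It is still not strictly POSIX-conforming after vfork, but it stays on thin syscall wrappers and never touches malloc or stdio, which are the locks that deadlock here. The helper name, the buffer size, and the choice to keep fds 0-2 are mine, not our component's:

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Linux-only: read /proc/self/fd with the raw getdents64 syscall so no
 * libc lock (malloc, stdio) is ever taken in the vfork child. */
struct linux_dirent64 {
    unsigned long long d_ino;
    long long          d_off;
    unsigned short     d_reclen;
    unsigned char      d_type;
    char               d_name[];
};

static void close_inherited_fds(void)
{
    char buf[4096];
    int dfd = open("/proc/self/fd", O_RDONLY | O_DIRECTORY);
    if (dfd < 0)
        return;
    for (;;) {
        long n = syscall(SYS_getdents64, dfd, buf, sizeof(buf));
        if (n <= 0)
            break;
        for (long off = 0; off < n; ) {
            struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + off);
            off += d->d_reclen;
            const char *p = d->d_name;
            int fd = 0;
            if (*p < '0' || *p > '9')
                continue;               /* skips "." and ".." */
            while (*p >= '0' && *p <= '9')
                fd = fd * 10 + (*p++ - '0');
            if (fd > 2 && fd != dfd)    /* keep stdio fds and the dir fd */
                close(fd);
        }
    }
    close(dfd);
}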

The component is glibc-based, and the program has run for about 10 years without deadlocking.
Recently the component was used in musl-based embedded systems, and there it hangs often.

After reading the vfork manual and the guidance you gave, I understand that calling readdir after vfork is problematic, and I also learned that I can call getdents64 directly to achieve the same thing.
What I want to say is only that, based on glibc, calling readdir after vfork has caused no deadlock for us so far.

More information: I googled and found this (https://lwn.net/Articles/789023/), which is probably the best solution, but the required kernel version is too new for my program to rely on.
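
If that article is the close_range() work (which matches the description: the syscall landed in Linux 5.9, and glibc 2.34 later added a wrapper), the whole cleanup reduces to one call. A sketch, assuming kernel headers new enough to define SYS_close_range:

#include <sys/syscall.h>
#include <unistd.h>

/* Close every descriptor from `first` upward in a single syscall.
 * Needs Linux >= 5.9; the raw syscall form avoids needing a libc wrapper. */
static int close_from(unsigned int first)
{
    return syscall(SYS_close_range, first, ~0U, 0);
}

Called as close_from(3) in the vfork child, it would keep only stdin, stdout and stderr open.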

In any case, I consider my problem solved now. Thank you all.

On Sun, Jun 26, 2022 at 2:49 AM Szabolcs Nagy <nsz@port70.net> wrote:

> * Nick Peng <pymumu@gmail.com> [2022-06-25 11:40:17 +0800]:
> > Description: After vfork, calling functions such as readdir/dirfd may
> > cause deadlock. GNU C is OK.
>
> why do you think "GNU C is OK"? is this from some real software?
>
> opendir after vfork is documented to be invalid in glibc:
> https://www.gnu.org/software/libc/manual/html_mono/libc.html#Low_002dlevel-Directory-Access
>
> the standard is actually much stricter than the glibc manual:
> posix conforming code must not call any libc api after vfork
> other than _exit or the exec* family of functions.
> (as-safe is not enough, but opendir is not even as-safe)
>
> since the example is multi-threaded even using fork would
> be invalid, but i think both musl and glibc make that work
> (as an extension to the standard).
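
To make that rule concrete: the only pattern the standard leaves open is vfork followed immediately by exec or _exit, roughly as in this sketch (/bin/true is just a placeholder command):

#include <sys/wait.h>
#include <unistd.h>

static int spawn_true(void)
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: nothing but the exec* family and _exit() is allowed here. */
        execl("/bin/true", "true", (char *)0);
        _exit(127);                     /* reached only if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}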


> > Also tested on x86-64 with musl, no deadlock, but
> > seems never exit, slower than GNU C.
> > Version: latest, musl-1.2.3
> > OS: debian bullseye 64bit OS, and asus router
> > CPU: raspberrypi aarch64, mips32
> > Reproduce Code:
> >
> > #include <dirent.h>
> > #include <pthread.h>
> > #include <stdint.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <sys/types.h>
> > #include <sys/wait.h>
> > #include <unistd.h>
> > #include <string.h>
> >
> > pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
> > pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
> >
> > struct tlog_log *logs = NULL;
> > int do_exit = 0;
> >
> > void *test(void *arg)
> > {
> >     int i = 0;
> >
> >     for (i = 0; i < 10000000; i++) {
> >         char *b = malloc(4096);
> >         memset(b, 0, 4096);
> >         free(b);
> >     }
> >     do_exit = 1;
> >     return NULL;
> > }
> >
> > void lockfunc()
> > {
> >     char path_name[4096];
> >     DIR *dir = NULL;
> >     struct dirent *ent;
> >
> >     snprintf(path_name, sizeof(path_name), "/proc/self/fd/");
> >     dir = opendir(path_name);
> >     if (dir == NULL) {
> >         goto errout;
> >     }
> >
> >     while ((ent = readdir(dir)) != NULL) {
> >     }
> >
> >     closedir(dir);
> >
> >     return;
> > errout:
> >     if (dir) {
> >         closedir(dir);
> >     }
> >
> >     return;
> > }
> >
> > void *test_fork(void *arg)
> > {
> >     int count = 0;
> >     while (do_exit == 0) {
> >         printf("test fork count %d\n", count++);
> >         int pid = vfork();
> >         if (pid < 0) {
> >             return NULL;
> >         } else if (pid == 0) {
> >             lockfunc();
> >             _exit(0);
> >         }
> >
> >         int status;
> >         waitpid(pid, &status, 0);
> >     }
> >
> >     return NULL;
> > }
> >
> > int main(int argc, char *argv[])
> > {
> >     pthread_attr_t attr;
> >     pthread_t threads[10];
> >     pthread_t fork_test;
> >     int i;
> >     int ret;
> >
> >     pthread_attr_init(&attr);
> >
> >     ret = pthread_create(&fork_test, &attr, test_fork, NULL);
> >
> >     for (i = 0; i < 10; i++) {
> >         ret = pthread_create(&threads[i], &attr, test, NULL);
> >         if (ret != 0) {
> >             return 1;
> >         }
> >     }
> >
> >     for (i = 0; i < 10; i++) {
> >         void *retval = NULL;
> >         pthread_join(threads[i], &retval);
> >     }
> >
> >     void *retval = NULL;
> >     pthread_join(fork_test, &retval);
> >     printf("exit\n");
> >     getchar();
> >     return 0;
> > }
> >
> > Log:
> > pi@raspberrypi:~/code/tinylog/test $ ./test
> > test fork count 0
> > test fork count 1   <-- lock here
> > ^C
> >
> > gdb backtrace:
> > 0x0000000000409524 in __lock ()
> > (gdb) bt
> > #0  0x0000000000409524 in __lock ()
> > #1  0x0000000000406278 in __libc_malloc_impl ()