Date: Sun, 27 Feb 2022 19:16:09 -0500
From: Rich Felker
To: naruto canada
Cc: musl@lists.openwall.com
Message-ID: <20220228001609.GD7074@brightrain.aerifal.cx>
Subject: Re: [musl] Re: anyone know how to approach this problem (expect5.x.x hangs)

On Sun, Feb 27, 2022 at 09:29:49PM +0000, naruto canada wrote:
> On 2/27/22, naruto canada wrote:
> > On 2/26/22, Rich Felker wrote:
> >> On Sat, Feb 26, 2022 at 02:46:10AM +0000, naruto canada wrote:
> >>> On 2/25/22, Rich Felker wrote:
> >>> > On Fri, Feb 25, 2022 at 07:35:39PM +0000, naruto canada wrote:
> >>> >> On 2/25/22, naruto canada wrote:
> >>> >> > hi
> >>> >> >
> >>> >> > I'm in the process of porting all my desktop env. over to musl.
> >>> >> > I'm about 70% done. I hit a few minor snags but got over them.
> >>> >> > I had expected a lot more painful experience, but it turned out ok.
> >>> >> > I could not get xserver to compile but will work around using vnc for now.
> >>> >> > I am quite happy I got qemu to compile.
> >>> >> > The last 30% (Browsers !!!), I dare not approach them right now.
> >>> >> >
> >>> >> > Anyway, back to my problem, expect5.x.x hangs,
> >>> >> > no seg fault, so I do not know how to approach this problem.
> >>> >> > normally I do a simple test:
> >>> >> > expect -c "spawn ls" # this always succeeds.
> >>> >> >
> >>> >> > (I use expect to automate password creation)
> >>> >> > VNCRP=123456 # need 6 characters # create ~/.vnc/passwd
> >>> >> > echo '#!/usr/bin/expect
> >>> >> > set timeout -1
> >>> >> > spawn vncpasswd
> >>> >> > expect "Password:"
> >>> >> > send "'$VNCRP'\r"
> >>> >> > expect "Verify:"
> >>> >> > send "'$VNCRP'\r"
> >>> >> > expect "Would you like to enter a view-only password (y/n)?"
> >>> >> > send "n\r"
> >>> >> > interact' > /tmp/p.ex
> >>> >> > expect /tmp/p.ex
> >>> >> > This script works fine under glibc, but hangs under musl.
> >>> >> >
> >>> >> > I've already tried the same version of expect and patches from
> >>> >> > aports-3.15.0/main/expect/*.patch
> >>> >> > I got the same result. (it hangs)
> >>> >> >
> >>> >> > This is not a priority problem for me. I can easily work around it
> >>> >> > without using expect.
> >>> >> > Just wondering if anyone knows how to approach this problem (when
> >>> >> > there is no seg fault)
> >>> >>
> >>> >> I did a quick strace, and compared it with glibc:
> >>> >> GLIBC CASE:
> >>> >> ....
> >>> >> open("/tmp/p.ex", O_RDONLY) = 4
> >>> >> spawn vncpasswd
> >>> >> open("/dev/ptmx", O_RDWR) = 4
> >>> >> open("/etc/group", O_RDONLY|O_CLOEXEC) = 5
> >>> >> open("/dev/pts/18", O_RDWR|O_NOCTTY) = 5
> >>> >> Password:
> >>> >> Verify:
> >>> >> Would you like to enter a view-only password (y/n)? n
> >>> >> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=956, si_status=0, si_utime=0, si_stime=0} ---
> >>> >>
> >>> >> open("/dev/null", O_RDONLY) = 4
> >>> >> open("/dev/null", O_RDONLY) = 3
> >>> >> open("/dev/null", O_RDONLY) = 2
> >>> >> open("/dev/null", O_RDONLY) = 0
> >>> >> +++ exited with 0 +++
> >>> >>
> >>> >> MUSL CASE:
> >>> >> open("/tmp/p.ex", O_RDONLY|O_LARGEFILE) = 7
> >>> >> spawn vncpasswd
> >>> >> open("/dev/ptmx", O_RDWR|O_NOCTTY|O_LARGEFILE) = 7
> >>> >> open("/dev/pts/3", O_RDWR|O_NOCTTY|O_LARGEFILE) = 8
> >>> >> syscall_397(0xffffff9c, 0xb6f624e0, 0, 0x7ff, 0xbe927e48, 0xb6f624e0) = -1 (errno 38)
> >>> >> syscall_397(0x8, 0xb6f58350, 0x1000, 0x7ff, 0xbe927e48, 0xb6f624e0) = -1 (errno 38)
> >>> >> syscall_403(0, 0xbe928258, 0xb6e82de0, 0, 0xbe928334, 0) = -1 (errno 38)
> >>> >> syscall_389(0x10, 0, 0, 0xb6f62170, 0xbe92815c, 0xbe92808c) = -1 (errno 38)
> >>> >>
> >>> >> It seems to block or stop at syscall_389
> >>> >> ( arch/arm/bits/syscall.h.in:#define __NR_membarrier 389 )
> >>> >
> >>> > The syscall has returned, so it's something after that which is
> >>> > hanging. Running under gdb and hitting ^C could show where.
> >>> >
> >>> > Something very wrong is going on here, since the syscalls are failing
> >>> > with ENOSYS but no fallback path has been taken. If it's musl making
> >>> > them, it will not assume these exist but will check for ENOSYS and
> >>> > make an alternate syscall if that happens. So it would seem that
> >>> > either these syscalls are being made directly by the application
> >>> > (expect) or something went very wrong in building musl (weird patches?
> >>> > stale build dir previously used for another arch? ..?) that has the
> >>> > wrong thing happening.
> >>> >
> >>> > What kernel version are you using? There was a recent thread on the
> >>> > list where someone had a badly patched kernel from Google that did
> >>> > something to mess up ENOSYS, and strace hid the bug, so perhaps this
> >>> > is similar.
> >>>
> >>> It is compiled on (and running on) an android phone.
> >>> I have no control over the kernel (3.4.0-perf-g63c3cac) (LG G3).
> >>> (I have already matched kernel header (3.4.0) when compiling the tool
> >>> and the World)
> >>> I used 3.10.x before and ran into problems, so I matched the kernel
> >>> header this time. This is the gdb result (I ran 3 times). It always stopped
> >>> at the same place:
> >>
> >> The kernel headers you use should not matter at all. Do you know where
> >> we can find the sources for LG's kernel? I'm guessing they did
> >> something horrible...
> > Actually I have downloaded the source code, but I do not know how to config
> > and match the device tree. There is no config.gz in /proc.
> > Besides, there are many variants for different countries, there is no way
> > to be certain that the source file is the one that
> > was built for and running on the phone.
> >
> > Ok, here are the results from 2 phones and i686.
> > I will describe the common build script:
> > To eliminate complications I switched back to musl-1.1.24.
> > If there are any time64 issues this should have made them irrelevant.
> > C(XX)FLAGS enforced by specs file: -ggdb -O0
> > libssp enabled in gcc-5.3.0. linux-3.4 headers.
> > I am also monitoring any selinux denied messages from logcat.
> >
> > 0. Linux localhost 4.1.49 #1 SMP Sat Feb 12 17:41:04 UTC 2022 i686 GNU/Linux
> > 1. Linux localhost 3.4.0-perf-g63c3cac #1 SMP PREEMPT Mon Nov 9 19:33:11 KST 2015 armv7l GNU/Linux (LG G3)
> > 2. Linux localhost 3.10.49-g441c924c #1 SMP PREEMPT Mon May 16 19:35:10 CST 2016 armv7l GNU/Linux (Another cheap android phone but slightly newer kernel)
> >
> > i686 CASE:
> > strace -f -ff -o 1 expect /tmp/p.ex
> > tail 1.*
> >
> > ==> 1.14236 <==
> > close(0) = 0
> > write(6, "q", 1) = 1
> > close(6) = 0
> > futex(0xbf7fd53c, FUTEX_WAIT_PRIVATE, 2, NULL) = 0
> > futex(0xb76cf838, FUTEX_WAIT_PRIVATE, -2147483632, NULL) = 0
> > futex(0xb74dddac, FUTEX_WAIT_PRIVATE, 1, NULL) = 0
> > futex(0xb77ca568, FUTEX_WAIT, 14240, NULL) = 0
> > munmap(0xb74bb000, 143360) = 0
> > exit_group(130) = ?
> > +++ exited with 130 +++
> >
> > ==> 1.14237 <==
> > set_thread_area({entry_number:6, base_addr:0xb76c2064, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
> > set_tid_address(0xb7781568) = 14237
> > brk(0) = 0x86bf000
> > brk(0x86c4000) = 0x86c4000
> > mkdir("/root/.vnc/", 0777) = -1 EEXIST (File exists)
> > ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B115200 opost isig icanon echo ...}) = 0
> > ioctl(0, SNDCTL_TMR_CONTINUE or SNDRV_TIMER_IOCTL_GPARAMS or TCSETSF, {B115200 opost isig icanon -echo ...}) = 0
> > read(0, 0xb777ffe8, 1024) = -1 EIO (I/O error)
> > --- SIGHUP {si_signo=SIGHUP, si_code=SI_KERNEL} ---
> > +++ killed by SIGHUP +++
> >
> > ==> 1.14238 <==
> > wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 14239
> > rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0xb7787836}, {0x808c34a, [], SA_RESTORER, 0xb7787836}, 8) = 0
> > rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=14239, si_status=0, si_utime=0, si_stime=0} ---
> > wait4(-1, 0xbfb515d4, WNOHANG, NULL) = -1 ECHILD (No child process)
> > sigreturn() (mask []) = 0
> > rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
> > rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> > exit_group(0) = ?
> > +++ exited with 0 +++
> >
> > ==> 1.14239 <==
> > close(4) = 0
> > execve("/bin/stty", ["/bin/stty", "sane"], [/* 40 vars */]) = 0
> > set_thread_area({entry_number:-1 -> 6, base_addr:0x81d55cc, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
> > set_tid_address(0x81d5e30) = 14239
> > getuid32() = 0
> > ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B115200 opost isig icanon echo ...}) = 0
> > ioctl(0, SNDCTL_TMR_STOP or SNDRV_TIMER_IOCTL_GINFO or TCSETSW, {B115200 opost isig icanon echo ...}) = 0
> > ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B115200 opost isig icanon echo ...}) = 0
> > exit_group(0) = ?
> > +++ exited with 0 +++
> >
> > ==> 1.14240 <==
> > read(5, "\0", 1) = 1
> > select(6, [5], [], [], NULL) = 1 (in [5])
> > read(5, "q", 1) = 1
> > close(5) = 0
> > futex(0xbf7fd53c, FUTEX_WAKE_PRIVATE, 1) = 1
> > futex(0xb76cf838, FUTEX_WAKE_PRIVATE, 1) = 1
> > rt_sigprocmask(SIG_BLOCK, ~[RTMIN 33 34], [], 8) = 0
> > futex(0xb74dddac, FUTEX_WAKE_PRIVATE, 1) = 1
> > _exit(0) = ?
> > +++ exited with 130 +++
> >
> > Total of 5 processes:
> > 2 processes exited normally.
> > One was killed by SIGHUP (main expect script I think)
> > Two exited with 130 (they are blocked by kernel till ctrl-c ?)
> > So 3 processes blocked, till ctrl-c was pressed.
> >
> > LG.G3 CASE:
> > strace -f -ff -o 1 expect /tmp/p.ex
> > pstree -w
> > ...
> > | | \-+= 28384 root strace -f -ff -o 1 expect /tmp/p.ex
> > | |   \-+- 28389 root expect /tmp/p.ex
> > | |     \--= 28390 root vncpasswd
> > ...
> >
> > tail 1.*
> > ==> 1.27818 <==
> > fcntl64(2, F_SETFD, FD_CLOEXEC) = 0
> > close(0) = 0
> > close(1) = 0
> > open("/dev/null", O_RDONLY|O_LARGEFILE) = 0
> > fcntl64(0, F_SETFD, FD_CLOEXEC) = 0
> > close(3) = 0
> > close(2) = 0
> > close(0) = 0
> > exit_group(1) = ?
> > +++ exited with 1 +++
> >
> > ==> 1.28389 <==
> > fcntl64(0, F_SETFD, FD_CLOEXEC) = 0
> > close(4) = 0
> > close(3) = 0
> > close(2) = 0
> > close(0) = 0
> > write(6, "q", 1) = 1
> > close(-1) = -1 EBADF (Bad file descriptor)
> > munmap(0xb6c27000, 143360) = 0
> > exit_group(130) = ?
> > +++ exited with 130 +++
> >
> > ==> 1.28390 <==
> > set_tls(0x29268, 0x29268, 0xb6f40c60, 0x29268, 0x200) = 0
> > set_tid_address(0xb6f96e18) = 28390
> > brk(0) = 0x125d000
> > brk(0x1262000) = 0x1262000
> > mkdir("/root/.vnc/", 0777) = -1 EEXIST (File exists)
> > ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
> > ioctl(0, SNDCTL_TMR_CONTINUE or SNDRV_TIMER_IOCTL_GPARAMS or TCSETSF, {B38400 opost isig icanon -echo ...}) = 0
> > read(0, 0xb6f95e44, 1024) = -1 EIO (I/O error)
> > --- SIGHUP {si_signo=SIGHUP, si_code=SI_KERNEL} ---
> > +++ killed by SIGHUP +++
> >
> > ==> 1.28391 <==
> > wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 28392
> > rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0xb6f3229c}, {0x5d550, [], SA_RESTORER, 0xb6f3229c}, 8) = 0
> > rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=28392, si_status=0, si_utime=0, si_stime=0} ---
> > wait4(-1, 0xbece8ebc, WNOHANG, NULL) = -1 ECHILD (No child process)
> > sigreturn() (mask []) = 0
> > rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
> > rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> > exit_group(0) = ?
> > +++ exited with 0 +++
> >
> > ==> 1.28392 <==
> > close(4) = 0
> > execve("/bin/stty", ["/bin/stty", "sane"], [/* 42 vars */]) = 0
> > set_tls(0x20d25c, 0x20d25c, 0x1c1e28, 0x20d25c, 0) = 0
> > set_tid_address(0x20d788) = 28392
> > getuid32() = 0
> > ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
> > ioctl(0, SNDCTL_TMR_STOP or SNDRV_TIMER_IOCTL_GINFO or TCSETSW, {B38400 opost isig icanon echo ...}) = 0
> > ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
> > exit_group(0) = ?
> > +++ exited with 0 +++
> >
> > ==> 1.28393 <==
> > read(5, "\0", 1) = 1
> > select(6, [5], [], [], NULL) = ? ERESTARTNOHAND (To be restarted if no handler)
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=28390, si_status=SIGHUP, si_utime=1, si_stime=1} ---
> > select(6, [5], [], [], NULL) = 1 (in [5])
> > read(5, "q", 1) = 1
> > close(5) = 0
> > rt_sigprocmask(SIG_BLOCK, ~[RTMIN 33 34], [], 8) = 0
> > futex(0xb6c49da0, FUTEX_WAKE_PRIVATE, 1) = 0
> > exit(0) = ?
> > +++ exited with 0 +++
> >
> > With LG G3, I have a total of 6 processes instead of 5.
> > Strange. Anyway, the same result, blocked till ctrl-c is pressed.
> >
> > CHEAPO ANDROID PHONE with linux-3.10.x:
> > still compiling slowly. I am going to guess I will have a similar result.
> > If it turns out differently, I will post another message.
>
> It turns out I was overthinking the problem, thinking
> maybe it is something buried deep. The problem was not "expect" but
> the program "vncpasswd". I manually tried the program and it behaved
> differently when compiled with musl. For some reason, it failed to flush
> the prompt to the user, and "expect" was waiting forever instead.
> I am not sure if this is the correct fix for tigervnc-1.7.1, but it works
> around the missing prompt on my system:
> sed -i.bak "s/\(fputs(prompt, stdout)\)/\1;fflush(stdout)/" unix/vncpasswd/vncpasswd.cxx

Yes, that's the right fix.

Rich
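
For reference, the pattern that the sed one-liner patches in can be
sketched in plain C. A prompt with no trailing newline can sit in stdout's
stdio buffer until something flushes it. As far as I know (this is an
assumption, not something established in this thread), glibc flushes a
line-buffered stdout when the program reads from stdin and musl does not,
which would explain the difference seen here. ISO C does not guarantee that
flush, so portable code flushes explicitly before blocking on input.
read_reply() below is a hypothetical helper, not a function from tigervnc
or musl.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from tigervnc or musl): write a prompt
 * that has no trailing newline, flush it, then read one line of input.
 * Returns 0 on success, -1 on a write/flush error or on EOF.
 */
int read_reply(FILE *out, FILE *in, const char *prompt,
               char *buf, size_t len)
{
    assert(out && in && prompt && buf && len > 0);

    if (fputs(prompt, out) == EOF)
        return -1;

    /* Without this flush the prompt may stay in the stdio buffer while
     * the program blocks in fgets(), so a peer such as expect never
     * sees the text it is waiting for. */
    if (fflush(out) == EOF)
        return -1;

    if (fgets(buf, (int)len, in) == NULL)
        return -1;

    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline, if any */
    return 0;
}
```

A caller would use it as read_reply(stdout, stdin, "Password:", buf,
sizeof buf). Flushing at the point of the prompt keeps the change local;
switching stdout to unbuffered with setvbuf() early in main() would be
another option, at the cost of one write per output call.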