mailing list of musl libc
From: naruto canada <narutocanada@gmail.com>
To: Rich Felker <dalias@libc.org>
Cc: musl@lists.openwall.com
Subject: Re: [musl] Re: anyone know how to approach this problem (expect5.x.x hangs)
Date: Sun, 27 Feb 2022 15:35:09 +0000
Message-ID: <CAKrOiPQ1s-jg=kV9mYRv4+iazwpfbQ15YpBuE49BzhrH1=bSUw@mail.gmail.com>
In-Reply-To: <20220226134800.GB7074@brightrain.aerifal.cx>

On 2/26/22, Rich Felker <dalias@libc.org> wrote:
> On Sat, Feb 26, 2022 at 02:46:10AM +0000, naruto canada wrote:
>> On 2/25/22, Rich Felker <dalias@libc.org> wrote:
>> > On Fri, Feb 25, 2022 at 07:35:39PM +0000, naruto canada wrote:
>> >> On 2/25/22, naruto canada <narutocanada@gmail.com> wrote:
>> >> > hi
>> >> >
>> >> > I'm in the process of porting all my desktop env. over to musl.
>> >> > I'm about 70% done. I hit a few minor snags but got over them.
>> >> > I had expected a lot more painful experience, but it turned out ok.
>> >> > I could not get xserver to compile but will work around using vnc for now.
>> >> > I am quite happy I got qemu to compile.
>> >> > The last 30% (Browsers !!!), I dare not approach them right now.
>> >> >
>> >> > Anyway, back to my problem, expect5.x.x hangs,
>> >> > no seg fault, so I do not know how to approach this problem.
>> >> > normally I do a simple test:
>> >> > expect -c "spawn ls" # this always succeeds.
>> >> >
>> >> > (I use expect to automate password creation)
>> >> > VNCRP=123456 # need 6 characters # create ~/.vnc/passwd
>> >> > echo '#!/usr/bin/expect
>> >> > set timeout -1
>> >> > spawn vncpasswd
>> >> > expect "Password:"
>> >> > send "'$VNCRP'\r"
>> >> > expect "Verify:"
>> >> > send "'$VNCRP'\r"
>> >> > expect "Would you like to enter a view-only password (y/n)?"
>> >> > send "n\r"
>> >> > interact' > /tmp/p.ex
>> >> > expect /tmp/p.ex
>> >> > This script works fine under glibc, but hangs under musl.
>> >> >
>> >> > I've already tried the same version of expect and patches from
>> >> > aports-3.15.0/main/expect/*.patch
>> >> > I got the same result. (it hangs)
>> >> >
>> >> > This is not a priority problem for me. I can easily work around it
>> >> > without using expect.
>> >> > Just wondering if anyone knows how to approach this problem (when there
>> >> > is no seg fault)
>> >>
>> >> I did a quick strace, and compare it with glibc:
>> >> GLIBC CASE:
>> >> ....
>> >> open("/tmp/p.ex", O_RDONLY)             = 4
>> >> spawn vncpasswd
>> >> open("/dev/ptmx", O_RDWR)               = 4
>> >> open("/etc/group", O_RDONLY|O_CLOEXEC)  = 5
>> >> open("/dev/pts/18", O_RDWR|O_NOCTTY)    = 5
>> >> Password:
>> >> Verify:
>> >> Would you like to enter a view-only password (y/n)? n
>> >> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=956,
>> >> si_status=0, si_utime=0, si_stime=0} ---
>> >>
>> >> open("/dev/null", O_RDONLY)             = 4
>> >> open("/dev/null", O_RDONLY)             = 3
>> >> open("/dev/null", O_RDONLY)             = 2
>> >> open("/dev/null", O_RDONLY)             = 0
>> >> +++ exited with 0 +++
>> >>
>> >> MUSL CASE:
>> >> open("/tmp/p.ex", O_RDONLY|O_LARGEFILE) = 7
>> >> spawn vncpasswd
>> >> open("/dev/ptmx", O_RDWR|O_NOCTTY|O_LARGEFILE) = 7
>> >> open("/dev/pts/3", O_RDWR|O_NOCTTY|O_LARGEFILE) = 8
>> >> syscall_397(0xffffff9c, 0xb6f624e0, 0, 0x7ff, 0xbe927e48, 0xb6f624e0) = -1 (errno 38)
>> >> syscall_397(0x8, 0xb6f58350, 0x1000, 0x7ff, 0xbe927e48, 0xb6f624e0) = -1 (errno 38)
>> >> syscall_403(0, 0xbe928258, 0xb6e82de0, 0, 0xbe928334, 0) = -1 (errno 38)
>> >> syscall_389(0x10, 0, 0, 0xb6f62170, 0xbe92815c, 0xbe92808c) = -1 (errno 38)
>> >>
>> >> It seems to block or stop at syscall_389
>> >> ( arch/arm/bits/syscall.h.in:#define __NR_membarrier		389 )
>> >
>> > The syscall has returned, so it's something after that which is
>> > hanging. Running under gdb and hitting ^C could show where.
>> >
>> > Something very wrong is going on here, since the syscalls are failing
>> > with ENOSYS but no fallback path has been taken. If it's musl making
>> > them, it will not assume these exist but will check for ENOSYS and
>> > make an alternate syscall if that happens. So it would seem that
>> > either these syscalls are being made directly by the application
>> > (expect) or something went very wrong in building musl (weird patches?
>> > stale build dir previously used for another arch? ..?) that has the
>> > wrong thing happening.
>> >
>> > What kernel version are you using? There was a recent thread on the
>> > list where someone had a badly patched kernel from Google that did
>> > something to mess up ENOSYS, and strace hid the bug, so perhaps this
>> > is similar.
>>
>> It is compiled on (and running on) an Android phone.
>> I have no control over the kernel (3.4.0-perf-g63c3cac) (LG G3).
>> (I have already matched the kernel headers (3.4.0) when compiling the
>> tools and the world.)
>> I used 3.10.x before and ran into problems, so I matched the kernel headers
>> this time. This is the gdb result (I ran it 3 times). It always stopped
>> at the same place:
>
> The kernel headers you use should not matter at all. Do you know where
> we can find the sources for LG's kernel? I'm guessing they did
> something horrible...
Actually, I have downloaded the source code, but I do not know how to configure
it and match the device tree. There is no config.gz in /proc.
Besides, there are many variants for different countries, so there is no way
to be certain that the source is the one that was built for and is running
on the phone.
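
For reference, a quick way to check whether a running kernel exposes its
build configuration at all. This is only a sketch: the helper name
first_readable and the candidate paths are assumptions, and vendor Android
kernels often provide none of them.

```shell
# Print the first readable file among the given candidate paths.
# Returns non-zero if none of them can be read.
first_readable() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Candidate locations are assumptions, not a complete list.
first_readable /proc/config.gz "/boot/config-$(uname -r)" \
    || echo "no kernel config exposed on this system"
```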

OK, here are the results from 2 phones and i686.
First, the common build setup:
To eliminate complications I switched back to musl-1.1.24.
If there are any time64 issues, this should make them irrelevant.
C(XX)FLAGS enforced by specs file: -ggdb -O0
libssp enabled in gcc-5.3.0. linux-3.4 headers.
I am also monitoring logcat for any SELinux "denied" messages.

0. Linux localhost 4.1.49 #1 SMP Sat Feb 12 17:41:04 UTC 2022 i686 GNU/Linux
1. Linux localhost 3.4.0-perf-g63c3cac #1 SMP PREEMPT Mon Nov 9
19:33:11 KST 2015 armv7l GNU/Linux (LG G3)
2. Linux localhost 3.10.49-g441c924c #1 SMP PREEMPT Mon May 16
19:35:10 CST 2016 armv7l GNU/Linux (Another cheap android phone but
slightly newer kernel)

i686 CASE:
strace -f -ff -o 1 expect /tmp/p.ex
tail 1.*

==> 1.14236 <==
close(0)                                = 0
write(6, "q", 1)                        = 1
close(6)                                = 0
futex(0xbf7fd53c, FUTEX_WAIT_PRIVATE, 2, NULL) = 0
futex(0xb76cf838, FUTEX_WAIT_PRIVATE, -2147483632, NULL) = 0
futex(0xb74dddac, FUTEX_WAIT_PRIVATE, 1, NULL) = 0
futex(0xb77ca568, FUTEX_WAIT, 14240, NULL) = 0
munmap(0xb74bb000, 143360)              = 0
exit_group(130)                         = ?
+++ exited with 130 +++

==> 1.14237 <==
set_thread_area({entry_number:6, base_addr:0xb76c2064, limit:1048575,
seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1,
seg_not_present:0, useable:1}) = 0
set_tid_address(0xb7781568)             = 14237
brk(0)                                  = 0x86bf000
brk(0x86c4000)                          = 0x86c4000
mkdir("/root/.vnc/", 0777)              = -1 EEXIST (File exists)
ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or
TCGETS, {B115200 opost isig icanon echo ...}) = 0
ioctl(0, SNDCTL_TMR_CONTINUE or SNDRV_TIMER_IOCTL_GPARAMS or TCSETSF,
{B115200 opost isig icanon -echo ...}) = 0
read(0, 0xb777ffe8, 1024)               = -1 EIO (I/O error)
--- SIGHUP {si_signo=SIGHUP, si_code=SI_KERNEL} ---
+++ killed by SIGHUP +++

==> 1.14238 <==
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 14239
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0xb7787836},
{0x808c34a, [], SA_RESTORER, 0xb7787836}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=14239,
si_status=0, si_utime=0, si_stime=0} ---
wait4(-1, 0xbfb515d4, WNOHANG, NULL)    = -1 ECHILD (No child process)
sigreturn() (mask [])                   = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
exit_group(0)                           = ?
+++ exited with 0 +++

==> 1.14239 <==
close(4)                                = 0
execve("/bin/stty", ["/bin/stty", "sane"], [/* 40 vars */]) = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0x81d55cc,
limit:1048575, seg_32bit:1, contents:0, read_exec_only:0,
limit_in_pages:1, seg_not_present:0, useable:1}) = 0
set_tid_address(0x81d5e30)              = 14239
getuid32()                              = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or
TCGETS, {B115200 opost isig icanon echo ...}) = 0
ioctl(0, SNDCTL_TMR_STOP or SNDRV_TIMER_IOCTL_GINFO or TCSETSW,
{B115200 opost isig icanon echo ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or
TCGETS, {B115200 opost isig icanon echo ...}) = 0
exit_group(0)                           = ?
+++ exited with 0 +++

==> 1.14240 <==
read(5, "\0", 1)                        = 1
select(6, [5], [], [], NULL)            = 1 (in [5])
read(5, "q", 1)                         = 1
close(5)                                = 0
futex(0xbf7fd53c, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xb76cf838, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigprocmask(SIG_BLOCK, ~[RTMIN 33 34], [], 8) = 0
futex(0xb74dddac, FUTEX_WAKE_PRIVATE, 1) = 1
_exit(0)                                = ?
+++ exited with 130 +++

Total of 5 processes:
2 processes exited normally.
One was killed by SIGHUP (I first took it for the main expect script, but
the mkdir of /root/.vnc/ in its trace suggests it is vncpasswd).
Two exited with 130 (are they blocked by the kernel until ctrl-c?)
So 3 processes were blocked until ctrl-c was pressed.
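
The per-process tally above can be produced mechanically. A minimal sketch,
assuming the trace files were written by strace -ff -o PREFIX (which names
them PREFIX.PID); the function name summarize_traces is made up for
illustration.

```shell
# Print "PID: last line" for every per-process trace file PREFIX.PID,
# so each process's final state (exit code or fatal signal) is visible.
summarize_traces() {
    prefix=$1
    for f in "$prefix".*; do
        [ -f "$f" ] || continue    # skip if the glob matched nothing
        printf '%s: %s\n' "${f##*.}" "$(tail -n 1 "$f")"
    done
}

# Usage, with the traces in the current directory:
# summarize_traces 1
```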

LG.G3 CASE:
strace -f -ff -o 1 expect /tmp/p.ex
pstree -w
...
 |     |           \-+= 28384 root strace -f -ff -o 1 expect /tmp/p.ex
 |     |             \-+- 28389 root expect /tmp/p.ex
 |     |               \--= 28390 root vncpasswd
...

tail 1.*
==> 1.27818 <==
fcntl64(2, F_SETFD, FD_CLOEXEC)         = 0
close(0)                                = 0
close(1)                                = 0
open("/dev/null", O_RDONLY|O_LARGEFILE) = 0
fcntl64(0, F_SETFD, FD_CLOEXEC)         = 0
close(3)                                = 0
close(2)                                = 0
close(0)                                = 0
exit_group(1)                           = ?
+++ exited with 1 +++

==> 1.28389 <==
fcntl64(0, F_SETFD, FD_CLOEXEC)         = 0
close(4)                                = 0
close(3)                                = 0
close(2)                                = 0
close(0)                                = 0
write(6, "q", 1)                        = 1
close(-1)                               = -1 EBADF (Bad file descriptor)
munmap(0xb6c27000, 143360)              = 0
exit_group(130)                         = ?
+++ exited with 130 +++

==> 1.28390 <==
set_tls(0x29268, 0x29268, 0xb6f40c60, 0x29268, 0x200) = 0
set_tid_address(0xb6f96e18)             = 28390
brk(0)                                  = 0x125d000
brk(0x1262000)                          = 0x1262000
mkdir("/root/.vnc/", 0777)              = -1 EEXIST (File exists)
ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or
TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(0, SNDCTL_TMR_CONTINUE or SNDRV_TIMER_IOCTL_GPARAMS or TCSETSF,
{B38400 opost isig icanon -echo ...}) = 0
read(0, 0xb6f95e44, 1024)               = -1 EIO (I/O error)
--- SIGHUP {si_signo=SIGHUP, si_code=SI_KERNEL} ---
+++ killed by SIGHUP +++

==> 1.28391 <==
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 28392
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0xb6f3229c}, {0x5d550,
[], SA_RESTORER, 0xb6f3229c}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=28392,
si_status=0, si_utime=0, si_stime=0} ---
wait4(-1, 0xbece8ebc, WNOHANG, NULL)    = -1 ECHILD (No child process)
sigreturn() (mask [])                   = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
exit_group(0)                           = ?
+++ exited with 0 +++

==> 1.28392 <==
close(4)                                = 0
execve("/bin/stty", ["/bin/stty", "sane"], [/* 42 vars */]) = 0
set_tls(0x20d25c, 0x20d25c, 0x1c1e28, 0x20d25c, 0) = 0
set_tid_address(0x20d788)               = 28392
getuid32()                              = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or
TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(0, SNDCTL_TMR_STOP or SNDRV_TIMER_IOCTL_GINFO or TCSETSW,
{B38400 opost isig icanon echo ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or
TCGETS, {B38400 opost isig icanon echo ...}) = 0
exit_group(0)                           = ?
+++ exited with 0 +++

==> 1.28393 <==
read(5, "\0", 1)                        = 1
select(6, [5], [], [], NULL)            = ? ERESTARTNOHAND (To be
restarted if no handler)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=28390,
si_status=SIGHUP, si_utime=1, si_stime=1} ---
select(6, [5], [], [], NULL)            = 1 (in [5])
read(5, "q", 1)                         = 1
close(5)                                = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN 33 34], [], 8) = 0
futex(0xb6c49da0, FUTEX_WAKE_PRIVATE, 1) = 0
exit(0)                                 = ?
+++ exited with 0 +++

With the LG G3, I have a total of 6 processes instead of 5.
Strange. Anyway, the result is the same: blocked until ctrl-c is pressed.
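
On the 130 seen in both cases: by shell convention a status above 128 is
128 plus a signal number, so 130 is consistent with the processes exiting
in response to the SIGINT from ctrl-c. A small sketch of that decoding; the
function name signal_from_status is hypothetical.

```shell
# Map a shell-style exit status to a signal name, if it encodes one.
# By convention a status above 128 means 128 + signal number.
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        kill -l $((status - 128))    # kill -l N prints the name of signal N
    else
        echo "none"
    fi
}

signal_from_status 130
```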

CHEAPO ANDROID PHONE with linux-3.10.x:
Still compiling slowly. I am going to guess I will get a similar result.
If it turns out differently, I will post another message.

Since the i686 kernel is my daily kernel, with no SELinux enabled,
I am going to guess that this is not kernel related.
I have no way of getting to the bottom of this issue right now,
so I am going to leave it for now.
Maybe it will go away if I use a newer gcc.
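
If I come back to this, one option would be to attach gdb to the hung
process non-interactively and dump every thread's backtrace, instead of
hitting ^C by hand. A rough sketch, assuming gdb is installed on the device;
the function name dump_backtraces and the GDB override variable are my own
inventions.

```shell
# Attach to a running process, print backtraces for all threads, then detach.
# GDB may be set to use a different gdb binary (hypothetical convenience).
dump_backtraces() {
    pid=$1
    "${GDB:-gdb}" -p "$pid" -batch -ex 'thread apply all bt'
}

# Usage, with the PID of the hung expect process taken from pstree:
# dump_backtraces 28389
```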

I am going to test icewm and qemu first, they have only been compiled
but not tested.

>
> Rich
>
