mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: recvmsg/sendmsg broken on mips64
Date: Thu, 21 Apr 2016 11:36:37 -0400
Message-ID: <20160421153637.GY21636@brightrain.aerifal.cx>
In-Reply-To: <57187FA8.8010806@dd-wrt.com>

On Thu, Apr 21, 2016 at 09:22:16AM +0200, Sebastian Gottschall wrote:
> On 21.04.2016 at 03:37, Rich Felker wrote:
> >On Sun, Apr 10, 2016 at 10:35:22PM -0400, Rich Felker wrote:
> >>On Mon, Apr 11, 2016 at 12:33:07AM +0200, Sebastian Gottschall wrote:
> >>>On 11.04.2016 at 00:29, Rich Felker wrote:
> >>>>On Mon, Apr 11, 2016 at 12:24:49AM +0200, Sebastian Gottschall wrote:
> >>>>>>I think what nsz was asking for, and what I'd like to see, is a way to
> >>>>>>reproduce the bug. I'm going to try building iproute2 for mips64 and
> >>>>>>running it on a prebuilt kernel from Aboriginal Linux under
> >>>>>>qemu-system-mips64, but I don't know what specific commands are needed
> >>>>>>to hit the affected code path.
> >>>>>any command since all is netlink based
> >>>>>ip add add 192.168.1.1/24  dev eth0
> >>>>>
> >>>>>you will see that nothing will happen. ip will just return an error
> >>>>>message (i wrote this message already in the first entry on this
> >>>>>mailing list)
> >>>>>"EOF on netlink" is the error which is shown
> >>>>OK, I'll try this.
> >>>>
> >>>>>>>it's all resulting in the same failing recvmsg / sendmsg call.. so
> >>>>>>>yes libnetlink.c does not work with musl on mips64 (it does work on
> >>>>>>>x64 and everything else, just not mips64) unless the hack i offered
> >>>>>>>was applied which again fixed all.
> >>>>>>>before you ask again for a problem description, just read again. it
> >>>>>>>won't change the description if you ask again and just makes people
> >>>>>>>tired on this list.
> >>>>>>Both versions of the struct (musl's and your modified one that matches
> >>>>>>the kernel) have the exact same layout, but due to having a member
> >>>>>>with 64-bit type, yours has 8-byte alignment and musl's only has
> >>>>>>4-byte alignment. This means, at least:
> >>>>>>
> >>>>>>1. When musl's sendmsg.c makes its copy to zero out the padding, the
> >>>>>>    copy may not be correctly aligned for 64-bit writes, and the kernel
> >>>>>>    faults or manually produces an error for this case, causing the
> >>>>>>    whole operation to fail. However, I don't see where iproute2 is
> >>>>>>    actually passing control messages to sendmsg, so while this is a
> >>>>>>    problem, I don't think it's the cause. Maybe I'm missing the
> >>>>>>    affected call point; this is why I'd like steps to reproduce the
> >>>>>>    issue so I can see it.
> >>>>>>
> >>>>>>2. iproute2's libnetlink.c's rtnl_listen function does not properly
> >>>>>>    declare its cmsgbuf with the alignment of cmsghdr; it has type
> >>>>>>    char[] so the compiler is free not to align it at all. This is
> >>>>>>    presumably a bug in iproute2, but I can't find any good
> >>>>>>    documentation (in the standards or Linux-specific) for how you're
> >>>>>>    supposed to allocate this space, so maybe the kernel is able to
> >>>>>>    handle aligning the buffer itself. I don't see any way the
> >>>>>>    alignment of musl's cmsghdr type affects recvmsg though.
> >>>>>>
> >>>>>>Maybe there are other effects I'm missing? I'll follow up again once I
> >>>>>>get a test build/run of iproute2 and let you know whether I can see
> >>>>>>the problem.
> >>>>>okay. if you need remote access to an Octeon system using musl (my
> >>>>>fixed variant), just tell me.
> >>>>That would be really helpful. Something's wrong with the userspace for
> >>>>the Aboriginal mips64 binaries (SIGBUS in init) and debugging that
> >>>>would be a big distraction.
> >>>>
> >>>>BTW do you have gdb and strace available?
> >>>not on the system itself. i'm not sure if strace works on mips64.
> >>>never tried it.
> >>>but you're free to copy any binary to the /tmp dir. it has 2 gb ram.
> >>>so enough space for static binaries if you want to play with.
> >>>i will send you the ssh data in a private email
> >>I haven't been able to reproduce the error on your system. I've tried
> >>building my own static-linked version of the "ip" utility with a
> >>mips64-linux-musl softfloat compiler, and uploading my libc.so and
> >>using it to run both your version of ip and a dynamic-linked one I
> >>just built. They all work fine for adding/removing a 127.0.0.2 address
> >>to the "lo" interface.
> >>
> >>Next I'm going to try to get a minimal testcase that tries to
> >>intentionally misalign the control message buffers. I suspect I'm just
> >>"getting lucky" and my buffer happens to be aligned the way the kernel
> >>wants by chance.
> >I've managed to track down the cause of the breakage. Somehow your
> >iproute2 has been miscompiled. What I did was add debug logic to
> >libc.so to print the contents of the msghdr struct passed in before
> >fixups, after fixups, and after the syscall. The output I got was:
> >
> >msghdr: 0xffffd58e08 12 0xffffd58df8 1 0 0 0 0 0
> >msghdr: 0xffffd58e08 12 0xffffd58df8 0 0 0 0 0 0
> >msghdr: 0xffffd58e08 12 0xffffd58df8 0 0 0 0 0 32
> >
> >The fields (including __pad1 and __pad2) are printed in order. So as
> >you can see, ip passed in a structure with a 1 in __pad1 and a 0 in
> >msg_iovlen. The source (libnetlink.c) stores 1 to msg_iovlen, so my
> >guess is that somehow it ended up getting the wrong-endian version of
> >the structure definition. You could confirm this by adding #error to
> >the little-endian case in arch/mips64/bits/socket.h and recompiling. I
> >suspect it's going to take some additional work to track down the
> >cause, which is likely specific to something in your toolchain (it
> >didn't happen for me when I built my own iproute2).
> i tried that already before i contacted you. the #error is never
> triggered in the little-endian case

Was that when compiling musl or iproute2? The problem is in how
iproute2 was built; your libc.so seems fine.
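
One way to answer that question for a given build is to compile a small probe with exactly the compiler, flags, and include paths used for iproute2, and compare its output with the same probe built by a known-good toolchain. The following is only a sketch under that assumption; the function names are made up for illustration and nothing here comes from iproute2 or musl. As I understand the mips64 header, the big-endian definition puts msg_iovlen in the second 4 bytes of the slot after msg_iov and the little-endian one puts it in the first 4 bytes, so the reported offset should differ between the two.

```c
/* Sketch: report where msg_iovlen sits in struct msghdr as seen by
 * this compiler and these headers. All function names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/socket.h>

/* Sanity check only: the struct must at least hold msg_flags. */
static_assert(sizeof(struct msghdr) >=
              offsetof(struct msghdr, msg_flags) + sizeof(int),
              "struct msghdr is smaller than its own members");

/* Byte offset of msg_iovlen inside struct msghdr. */
size_t iovlen_offset(void)
{
	return offsetof(struct msghdr, msg_iovlen);
}

/* Width in bytes of the msg_iovlen member itself. */
size_t iovlen_size(void)
{
	return sizeof(((struct msghdr *)0)->msg_iovlen);
}

/* Print the layout facts so two builds can be compared side by side. */
void print_msghdr_layout(void)
{
	printf("sizeof(struct msghdr) = %zu\n", sizeof(struct msghdr));
	printf("offsetof(msg_iovlen)  = %zu\n", iovlen_offset());
	printf("sizeof(msg_iovlen)    = %zu\n", iovlen_size());
}
```

Calling print_msghdr_layout() from a main() built once with the iproute2 build environment and once with a plain musl cross compiler should show whether both see the same definition.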

> so your guess doesn't match reality. (i even tried it again right
> now. all is fine. it only uses the big-endian case)

If it's not the endian tests, I don't know what else would have caused
this. I'll get a disassembly dump of the function to show you. Is
there any way I can reproduce your exact toolchain to see if I can get
the same miscompilation to happen?
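
For reference, the symptom in the debug output quoted above (a 1 in __pad1 and a 0 in msg_iovlen) is what a swapped member order would produce. The sketch below uses two hypothetical two-word structs that stand in for the 8-byte slot after msg_iov; they are not the real header definitions, and the helper name is invented. It stores a count through one ordering and reads the same bytes back through the other.

```c
/* Sketch: two hypothetical structs that differ only in the order of a
 * padding word and a length word, standing in for the two variants of
 * the same 8-byte slot in struct msghdr. */
#include <assert.h>
#include <string.h>

struct slot_be { int pad1; int iovlen; };  /* pad first, length second */
struct slot_le { int iovlen; int pad1; };  /* length first, pad second */

static_assert(sizeof(struct slot_be) == sizeof(struct slot_le),
              "both layouts must occupy the same number of bytes");

/* Store n through the length-first layout, then reinterpret the same
 * bytes through the pad-first layout, as code built against the other
 * definition would see them. */
struct slot_be reinterpret_as_be(int n)
{
	struct slot_le written = { .iovlen = n, .pad1 = 0 };
	struct slot_be seen;

	memcpy(&seen, &written, sizeof seen);
	return seen;
}
```

With n = 1 the pad-first reader finds the 1 in pad1 and 0 in iovlen, which matches the first msghdr line in the quoted output. This only illustrates the hypothesis; it does not show that the wrong definition is actually what the affected build picked up.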

Rich

