Date: Thu, 23 Apr 2020 14:22:34 +0200
From: Szabolcs Nagy
To: Paul Sokolovsky
Cc: Rich Felker, musl@lists.openwall.com
Message-ID: <20200423122233.GH23945@port70.net>
References: <20200423022531.502e9d26@zenbook> <20200423023941.GQ11469@brightrain.aerifal.cx> <20200423121627.3e4d3ecd@zenbook>
In-Reply-To: <20200423121627.3e4d3ecd@zenbook>
Subject: Re: [musl] foreign-dlopen: dlopen() from static binary, again (and not the way you think!)

* Paul Sokolovsky [2020-04-23 12:16:26 +0300]:
> Hello,
>
> On Wed, 22 Apr 2020 22:39:41 -0400
> Rich Felker wrote:
>
> []
>
> > > Oh, forgot to say that I'm not looking for a way to load a
> > > particular musl-dynlinked shared library into musl-staticlinked
> > > binary. So, arguments like "but you'll need to carry around musl's
> > > libc.so" don't apply. What I'm looking for is a way to have a
> > > static closed-world application, but let it, at the user's request,
> > > to interface with whatever system may be outside.
>
> []
>
> > > of concept code is at
> > > https://github.com/pfalcon/foreign-dlopen .
> >
> > In your example it looks like you're foreign_dlopen'ing glibc. That
> > simply *can't* work, because part of the interface contract of all
> > glibc functions is that they're called with the thread pointer
> > register (%gs or %fs on i386 or x86_64 respectively) pointing to a
> > glibc TCB, which will not be the case when they're invoked from a
> > musl-linked (or other non-glibc-linked) program.
>
> Thanks for the response and for the word of warning. As I mentioned,
> this is essentially a proof of concept, and so far was tested only by
> calling glibc's printf() from a host app which was either linked with
> glibc itself or -nostdlib and static. And that was already more than
> with any other ELF loader which I tried (which worked for simple
> functions like write(), but crashed in anything more complex like
> printf()).
>
> But it certainly doesn't touch a case you describe, when "foreign" vs
> local libc expect different values of %gs/%fs (so apparently, "foreign
> function call" facility would need to swap them around a call).

yes, libc functions should be called on libc owned
threads, and your code can only run on the same thread
if you follow the same abi (which is more than just
the call convention). swapping the thread pointer
means that the foreign libc has to create the thread
on which you invoke the foreign function (or it has
to be the main thread), since the data structures at
tp are set up at thread creation (or at early libc
init for the main thread).

what's worse is that some process global state also
has to be under the control of libc (e.g. libc internal
signal handlers, global state controlled via prctl, or
libc may want fd 0,1,2 in a particular state), so cross
calling a different libc involves system calls. (e.g.
the go runtime gets this wrong for obvious reasons:
calling c from go would be really slow. this is why
you normally try to avoid using your own libc
independent runtime. go gets away with this because
libc internal signals are rarely relevant and most
process state is per thread on linux, so if you let
the foreign libc create the os threads and take
over the signal handlers and signal masks then things
work.)

> > If you relax to the case where you're not doing that, and instead only
> > opening *pure library* code which has no tie-in to global state or TLS
> > contracts, then it should be able to work.

it's not documented which apis are implemented as pure
library code, and in principle libc code may call other
libc code via the plt, and then lazy binding can happen,
which is not pure. (glibc tries to avoid this of
course, but it does have some runtime loaded
components, e.g. for locale specific char conversions,
so things that may seem pure from the outside can
end up impure.)
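
to make the earlier point about per-thread data concrete, here is a minimal sketch (not taken from foreign-dlopen or from any libc source). it assumes a hosted C11 toolchain with POSIX threads; the names tls_marker, tls_probe, probe_here and probe_new_thread are hypothetical and exist only for this illustration. it only shows that a thread created through the libc's own pthread_create gets a fresh, initialized copy of thread-local data at its own address, which is the kind of setup a foreign libc would also expect to have done itself for any thread it is called on.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread variable. On typical ELF systems each thread's
 * copy is addressed relative to that thread's thread pointer, and the copy
 * is initialized by libc when the thread is created. */
static _Thread_local int tls_marker = 42;

struct tls_probe {
    uintptr_t addr; /* address of the calling thread's copy of tls_marker */
    int initial;    /* value of tls_marker as seen by the calling thread */
};

/* Record where the calling thread's copy lives and what it contains. */
static void probe_here(struct tls_probe *p)
{
    assert(p != NULL);
    p->addr = (uintptr_t)&tls_marker;
    p->initial = tls_marker;
}

static void *probe_thread(void *arg)
{
    probe_here(arg);
    return NULL;
}

/* Run probe_here on a thread created by the local libc.
 * Returns 0 on success, or the pthread error code. */
static int probe_new_thread(struct tls_probe *p)
{
    pthread_t t;
    int err = pthread_create(&t, NULL, probe_thread, p);
    if (err)
        return err;
    return pthread_join(t, NULL);
}
```

if the calling thread first sets tls_marker to some other value and then runs probe_here on itself and probe_new_thread for a second probe, the two recorded addresses differ and the new thread still sees the initializer 42. none of that per-thread state would exist in the form a foreign libc expects if the thread had been created by a different runtime.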