From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: open64 and similar
Date: Wed, 10 Jul 2019 21:58:51 -0400
Message-ID: <20190711015851.GA1506@brightrain.aerifal.cx>
In-Reply-To: <20190710224301.GZ1506@brightrain.aerifal.cx>
To: musl@lists.openwall.com

On Wed, Jul 10, 2019 at 06:43:01PM -0400, Rich Felker wrote:
> I'd like it if we could remove this stuff entirely, except for the
> ABI-compat. Maybe it could be done by getting rid of the actual
> symbols and just putting magic in the dynamic linker to resolve them
> to the non-64 ones.

Since I'd like to actually go forward with this in the next release cycle, an outline for how it would work:

Removal is simple: just ripping out all the _LARGEFILE64_SOURCE stuff from the headers and all instances of weak_alias(x,x64) (and the few exceptions to this pattern) from the source files.

Restoration/preservation of glibc-ABI-compat (and ABI-compat with any musl binaries that might have somehow found a way to produce a reference to one of the *64 symbols) is harder. There are two possible approaches.

One is to add to dynlink.c a special case for symbol lookup failure in libc.so, whereby, for a list of symbol names, the lookup is retried with the 64 removed (or with other transformations as needed).

A second, possibly more graceful, way to do it is to generate as static data an ELF symbol table for all the symbols that we want to offer as ABI-compat only, and add a dummy DSO to the DSO list at dynamic linker startup, just after libc.so, to hook up the symbol table for the existing normal code paths to use.

A third, awful possibility would be using symbol versioning to set them up as non-default (invisible to ld) versioned symbols aliased to the real functions. There are lots of good reasons not to want to do this (and not to want any symver table in libc, even if we do actually want to resolve symbol versions for other libs later).

My leaning is towards the second option since it's rather elegant, non-invasive to the hot code paths, and easy to extend to other "junk" symbols we might want to offer for ABI-compat only.

But I'm also open to simpler ideas. For example, if there's a way to "poison" the symbols to ld so that it refuses to link to them (generating errors when configure scripts try), that should suffice without removing the symbols, and would be easier (and avoid the need for any special dynamic linker work).
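
To make the first approach above concrete, here is a minimal sketch of the name transformation only: given a symbol whose lookup failed, produce the name to retry with. It is not code from dynlink.c; the function name compat64_fallback_name, the table compat64_names, and the handful of entries in it are all assumptions for illustration, and the real list would need to cover every LFS64 symbol and any that need a different transformation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, deliberately short list of ABI-compat-only names.
 * A real list would be generated from the full set of LFS64 symbols. */
static const char *const compat64_names[] = {
    "open64", "openat64", "creat64", "lseek64",
    "fopen64", "ftello64", "readdir64", "readdir64_r",
};

/* If name is on the compat list, write the name with its "64" removed
 * into out and return 1. Return 0 if the name is not on the list or
 * the result (including the terminator) does not fit in outsz bytes. */
static int compat64_fallback_name(const char *name, char *out, size_t outsz)
{
    size_t n = sizeof compat64_names / sizeof *compat64_names;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, compat64_names[i]) != 0)
            continue;
        const char *p = strstr(name, "64");
        if (!p)
            return 0;
        size_t head = (size_t)(p - name);   /* bytes before "64" */
        size_t tail = strlen(p + 2);        /* bytes after "64" */
        if (head + tail + 1 > outsz)
            return 0;
        memcpy(out, name, head);
        memcpy(out + head, p + 2, tail + 1); /* copies the terminator too */
        return 1;
    }
    return 0;
}
```

In a dynamic linker this would only run on the failure path, so ordinary lookups would not pay for the list scan.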

Note that if we do this, we might also want to offer a static liblfs64.a that just redirects all the LFS64 symbols to the standard ones (this is mildly annoying to do for open64, since it's variadic...). This is to allow ABI-compat linking of static (possibly closed-source) libs that were made for use with glibc, to the extent possible, and is not something you'd want to do by default since it would expose the symbols to configure scripts again.

Rich
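
For illustration of the variadic annoyance mentioned above, a forwarding wrapper for open64 might look roughly like the sketch below. The name compat_open64 and the helper needs_mode are hypothetical (a real liblfs64.a would export the symbol as open64), and the sketch assumes a Linux target where mode_t is not narrower than int, so reading it with va_arg is well defined.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdarg.h>
#include <sys/types.h>

/* The third argument is only present (and only meaningful) when the
 * call can create a file. */
static int needs_mode(int flags)
{
#ifdef O_TMPFILE
    /* O_TMPFILE is a multi-bit flag, so compare against the full mask. */
    if ((flags & O_TMPFILE) == O_TMPFILE)
        return 1;
#endif
    return (flags & O_CREAT) != 0;
}

/* Hypothetical stand-in for the open64 a redirect library would export:
 * pull out the optional mode argument and forward to plain open(). */
int compat_open64(const char *path, int flags, ...)
{
    mode_t mode = 0;
    if (needs_mode(flags)) {
        va_list ap;
        va_start(ap, flags);
        mode = va_arg(ap, mode_t);
        va_end(ap);
    }
    return open(path, flags, mode);
}
```

The non-variadic LFS64 functions would not need this; a plain forwarding function or alias per symbol would do.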