From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: time_t progress/findings
Date: Sat, 20 Jul 2019 00:48:40 -0400
Message-ID: <20190720044840.GZ1506@brightrain.aerifal.cx>
In-Reply-To: <20190718205237.GU1506@brightrain.aerifal.cx>

On Thu, Jul 18, 2019 at 04:52:37PM -0400, Rich Felker wrote:
> On Thu, Jul 18, 2019 at 12:37:45PM -0400, Rich Felker wrote:
> > Second bit of progress here: stat.
> > First change can be done before any
> > actual time64 work is done or even decided upon: changing all the
> > stat-family functions to use fstatat (possibly with AT_EMPTY_PATH) as
> > their backend, and changing fstatat to do proper fallbacks if
> > SYS_fstatat is missing. Now there's a single point of potential stat
> > conversion rather than 4 functions.
> >
> > Next, add an internal, arch-provided kstat type and make fstatat
> > translate from this to the public stat type. This eliminates the need
> > for all the mips*/syscall_arch.h hacks.
>
> This step admits a few questions about how to do it best, inspired in
> part by a related question:
>
> What should the new time64 stat structures look like?
>
> There are at least three possible goals:
>
> 1. Make them as clean and uniform as possible, same for all archs.
>
> 2. Avoid increasing the size at all cost so as to maximize
>    memory-safety of mismatched interfaces between libc consumers
>    defined in terms of struct stat.
>
> 3. Make the start of the new struct match the old struct to minimize
>    behavioral errors under mismatched interfaces between libc
>    consumers defined in terms of struct stat.
>
> Choice 2 is pretty much out because I think it's impossible on at
> least one arch, and would impose really ugly constraints (making
> timespec 24-byte, relying on non-64bit-alignment) on others. In many
> ways choice 3 is actually more appealing, because when third-party
> libraries *do* use stat in public interfaces, it's usually understood
> that the same party both allocates and fills it in, and shares the
> contents with the other party.
>
> There are actually 2 subvariants of choice 3: either keep exposing the
> 32-bit time in the old locations so that mismatched consumers just
> work, or fill it in with something like INT_MIN (year~=1902) so that
> breakage is caught quickly.
>
> Now, back to kstat and the above-quoted text. If we go with option 3,
> we don't actually need a kstat struct.
> The existing stat syscalls just
> write into the beginning of the buffer, and then we copy the result to
> the time64 timespecs at the end that make up the new public interface.
> This results in the smallest code, and the least amount of new
> per-arch definitions. But it doesn't clean up the existing mips
> stat-translation hell (currently buried in mips*/syscall_arch.h), and
> it imposes assumptions about the relationship between kernel types and
> public libc types.
>
> On the other hand, if we make archs define a struct kstat and always
> translate everything, the code is a bit larger, but we:
>
> - don't impose any particular choice 1/2/3 above.
> - make it easy to clean up the mips brokenness.
> - facilitate future musl archs/ABIs (e.g. a ".2 ABI") where userspace
>   stat has nothing to do with the legacy kernel stat structs.
>
> So I'm leaning strongly towards just always doing the translation,
> even though I'm also leaning towards choice 3 above that won't require
> it. If nothing else, it allows me to do the prep work that will set
> the stage for time64 transition now, without having to finalize the
> decisions about how time64 will look.

Another data point in favor of choice 3: libc actually has some
functions of its own that pass stat structures to callbacks: ftw and
nftw. With choice 3, these don't need any change; a legacy binary
calling them will get back stat structures it can read (with some
extra 64-bit timespecs afterwards that it's not aware of). With any
other choice, these functions would need painful replacements, and
just wrapping them is not easy because they lack a context argument to
pass through.

Since similar usage is likely common in third-party library code, I
think this is a really strong argument in favor of choice 3.

FWIW the existing glibc proposal looks like option 1, and they weren't
aware of this problem until I reported it just now.

Rich
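
A minimal sketch of the ftw/nftw argument for choice 3, under stated
assumptions. None of these names are musl's: legacy_stat, new_stat,
timespec64, fill_new_stat, walk_one and the policy enum are all
hypothetical, and the field set is cut down to a few members. It only
illustrates that when the old layout sits at the start of the new
struct, a callback compiled against the old layout can still read what
it is handed, and it shows the two subvariants (keep the 32-bit time,
or poison it) as a policy flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the existing 32-bit-time layout. */
struct legacy_timespec { int32_t tv_sec; int32_t tv_nsec; };
struct legacy_stat {
    uint64_t st_ino;
    uint32_t st_mode;
    int64_t  st_size;
    struct legacy_timespec st_mtim;
};

/* Hypothetical 64-bit timespec appended after the legacy prefix. */
struct timespec64 { int64_t tv_sec; int64_t tv_nsec; };

/* "Choice 3": the new struct starts with the old one. */
struct new_stat {
    struct legacy_stat legacy;
    struct timespec64  st_mtim64;
};

static_assert(offsetof(struct new_stat, legacy) == 0,
              "legacy layout must be the prefix of the new struct");

/* The two subvariants of choice 3 for the old time fields. */
enum legacy_time_policy { LEGACY_TIME_KEEP, LEGACY_TIME_POISON };

/* Fill the 64-bit timestamp and, per policy, the legacy 32-bit slot.
 * Handling of seconds that do not fit in 32 bits is left out of this
 * sketch; callers here only pass in-range values. */
void fill_new_stat(struct new_stat *st, int64_t sec, int64_t nsec,
                   enum legacy_time_policy policy)
{
    st->st_mtim64.tv_sec = sec;
    st->st_mtim64.tv_nsec = nsec;
    st->legacy.st_mtim.tv_sec =
        (policy == LEGACY_TIME_POISON) ? INT32_MIN : (int32_t)sec;
    st->legacy.st_mtim.tv_nsec = (int32_t)nsec;
}

/* Callback type as a legacy binary would have declared it
 * (ftw-like: no context argument to smuggle extra data through). */
typedef int (*legacy_cb)(const char *path, const struct legacy_stat *sb);

/* Hand the prefix of the new struct to a legacy callback, no copy. */
int walk_one(const char *path, const struct new_stat *st, legacy_cb fn)
{
    return fn(path, &st->legacy);
}

/* Example legacy callback: only knows the old layout. */
int legacy_size_is_42(const char *path, const struct legacy_stat *sb)
{
    (void)path;
    return sb->st_size == 42;
}
```

With LEGACY_TIME_KEEP a mismatched consumer keeps working until the
32-bit range runs out; with LEGACY_TIME_POISON it sees an obviously
wrong timestamp right away. Which one is preferable is exactly the
open question in the quoted text, and the sketch does not settle it.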