From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: time_t progress/findings
Date: Sat, 20 Jul 2019 17:46:02 -0400
Message-ID: <20190720214602.GA1506@brightrain.aerifal.cx>
In-Reply-To: <20190720044840.GZ1506@brightrain.aerifal.cx>

On Sat, Jul 20, 2019 at 12:48:40AM -0400, Rich Felker wrote:
> On Thu, Jul 18, 2019 at 04:52:37PM -0400, Rich Felker wrote:
> > On Thu, Jul 18, 2019 at 12:37:45PM -0400, Rich Felker wrote:
> > > Second bit of progress here: stat.
> > > First change can be done before any
> > > actual time64 work is done or even decided upon: changing all the
> > > stat-family functions to use fstatat (possibly with AT_EMPTY_PATH) as
> > > their backend, and changing fstatat to do proper fallbacks if
> > > SYS_fstatat is missing. Now there's a single point of potential stat
> > > conversion rather than 4 functions.
> > >
> > > Next, add an internal, arch-provided kstat type and make fstatat
> > > translate from this to the public stat type. This eliminates the need
> > > for all the mips*/syscall_arch.h hacks.
> >
> > This step admits a few questions about how to do it best, inspired in
> > part by a related question:
> >
> > What should the new time64 stat structures look like?
> >
> > There are at least three possible goals:
> >
> > 1. Make them as clean and uniform as possible, same for all archs.
> >
> > 2. Avoid increasing the size at all cost so as to maximize
> >    memory-safety of mismatched interfaces between libc consumers
> >    defined in terms of struct stat.
> >
> > 3. Make the start of the new struct match the old struct to minimize
> >    behavioral errors under mismatched interfaces between libc
> >    consumers defined in terms of struct stat.
> >
> > Choice 2 is pretty much out because I think it's impossible on at
> > least one arch, and would impose really ugly constraints (making
> > timespec 24-byte, relying on non-64bit-alignment) on others. In many
> > ways choice 3 is actually more appealing, because when third-party
> > libraries *do* use stat in public interfaces, it's usually understood
> > that the same party both allocates and fills it in, and shares the
> > contents with the other party.
> >
> > There are actually 2 subvariants of choice 3: either keep exposing the
> > 32-bit time in the old locations so that mismatched consumers just
> > work, or fill it in with something like INT_MIN (year~=1902) so that
> > breakage is caught quickly.
> >
> > Now, back to kstat and the above-quoted text.
> > If we go with option 3,
> > we don't actually need a kstat struct. The existing stat syscalls just
> > write into the beginning of the buffer, and then we copy the result to
> > the time64 timespecs at the end that make up the new public interface.
> > This results in the smallest code, and the least amount of new
> > per-arch definitions. But it doesn't clean up the existing mips
> > stat-translation hell (currently buried in mips*/syscall_arch.h), and
> > it imposes assumptions about the relationship between kernel types and
> > public libc types.
> >
> > On the other hand, if we make archs define a struct kstat and always
> > translate everything, the code is a bit larger, but we:
> >
> > - don't impose any particular choice 1/2/3 above.
> > - make it easy to clean up the mips brokenness.
> > - facilitate future musl archs/ABIs (e.g. a ".2 ABI") where userspace
> >   stat has nothing to do with the legacy kernel stat structs.
> >
> > So I'm leaning strongly towards just always doing the translation,
> > even though I'm also leaning towards choice 3 above that won't require
> > it. If nothing else, it allows me to do the prep work that will set
> > the stage for time64 transition now, without having to finalize the
> > decisions about how time64 will look.
>
> Another data point in favor of choice 3: libc actually has some
> functions of its own that pass stat structures to callbacks: ftw and
> nftw. With choice 3, these don't need any change; a legacy binary
> calling them will get back stat structures it can read (with some
> extra 64-bit timespecs afterwards that it's not aware of). With any
> other choice, these functions would need painful replacements, and
> just wrapping them is not easy because they lack a context argument to
> pass through.
>
> Since similar usage is likely common in third-party library code, I
> think this is a really strong argument in favor of choice 3.
> FWIW the
> existing glibc proposal looks like option 1, and they weren't aware of
> this problem until I reported it just now.

Related find: for struct rusage, we can satisfy both (2) and (3)
simultaneously. This means no new structures or symbols are needed for
getrusage, wait3, and wait4. The existing 32-bit musl structs left 16
slots for extensibility at the end, so we can just put the new 64-bit
time fields there, and still fill in the 32-bit ones too for legacy
callers.

This is basically what I was already planning for utmp, except that
it's less interesting for utmp because the functions are stubs.

Rich
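
For illustration only, here is a rough sketch of what a choice-3 layout
and its fill-in step might look like. Every type, field and function name
below is hypothetical and heavily simplified; none of it is musl's actual
definition. Poisoning the old slot when the time does not fit in 32 bits
is also just an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout that an old (32-bit time) binary was built against. */
struct legacy_ts { int32_t tv_sec, tv_nsec; };
struct legacy_stat {
	uint32_t st_mode;
	uint32_t st_size;
	struct legacy_ts st_mtim;
};

/* Hypothetical choice-3 layout: old prefix unchanged, 64-bit time appended. */
struct ts64 { int64_t tv_sec, tv_nsec; };
struct new_stat {
	uint32_t st_mode;
	uint32_t st_size;
	struct legacy_ts legacy_mtim; /* old location, seen by legacy readers */
	struct ts64 st_mtim;          /* new public field, past the old end */
};

/* The whole point of choice 3: the start of the struct must not move. */
_Static_assert(offsetof(struct new_stat, legacy_mtim) ==
               offsetof(struct legacy_stat, st_mtim),
               "legacy prefix must match the old layout");

/* Fill both views. With poison != 0 (or when the value does not fit in
 * 32 bits) the old slot gets INT32_MIN so stale readers fail loudly;
 * otherwise the old slot keeps exposing the real time. */
static void fill_new_stat(struct new_stat *st, uint32_t mode, uint32_t size,
                          int64_t sec, int32_t nsec, int poison)
{
	assert(nsec >= 0 && nsec < 1000000000);
	st->st_mode = mode;
	st->st_size = size;
	st->st_mtim.tv_sec = sec;
	st->st_mtim.tv_nsec = nsec;
	if (poison || sec < INT32_MIN || sec > INT32_MAX) {
		st->legacy_mtim.tv_sec = INT32_MIN;
		st->legacy_mtim.tv_nsec = 0;
	} else {
		st->legacy_mtim.tv_sec = (int32_t)sec;
		st->legacy_mtim.tv_nsec = nsec;
	}
}

/* Stand-in for a legacy callback (as with ftw/nftw) that only knows the
 * old layout and reads just the first sizeof(struct legacy_stat) bytes. */
static int32_t legacy_read_mtime(const void *buf)
{
	struct legacy_stat old;
	memcpy(&old, buf, sizeof old);
	return old.st_mtim.tv_sec;
}
```

A caller built against the old layout can be handed a struct new_stat
and still read a sane mtime from the prefix, while new callers use
st_mtim. The poison flag models the two subvariants of choice 3
discussed in the quoted text.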