From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/14552 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: Rich Felker Newsgroups: gmane.comp.lib.glibc.alpha,gmane.linux.lib.musl.general Subject: Re: time64 abi choices for glibc and musl Date: Mon, 12 Aug 2019 13:11:01 -0400 Message-ID: <20190812171101.GQ9017@brightrain.aerifal.cx> References: <20190810175808.GA13205@brightrain.aerifal.cx> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="215300"; mail-complaints-to="usenet@blaine.gmane.org" User-Agent: Mutt/1.5.21 (2010-09-15) Cc: GNU C Library , musl@lists.openwall.com To: Zack Weinberg Original-X-From: libc-alpha-return-104319-glibc-alpha=m.gmane.org@sourceware.org Mon Aug 12 19:11:13 2019 Return-path: Envelope-to: glibc-alpha@blaine.gmane.org Original-Received: from server1.sourceware.org ([209.132.180.131] helo=sourceware.org) by blaine.gmane.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1hxDqi-000tuk-N7 for glibc-alpha@blaine.gmane.org; Mon, 12 Aug 2019 19:11:13 +0200 DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:date:from:to:cc:subject:message-id:references :mime-version:content-type:in-reply-to; q=dns; s=default; b=ma7E R/elLrDMFB41Tdm3hQE6IJVSUnutaBzDwPhjX4ubajknWs0/7Ku0fyq9/2ZHx+yW hjnLRruy0LfI4t273GQz7o3GiLrukm5Ywb8V+oywp14utbDBMWQSTHFSxcz319EF 6JDYRX9sn83HNO0OEjfPXv2VcbhPPY93y7huQgQ= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:date:from:to:cc:subject:message-id:references :mime-version:content-type:in-reply-to; s=default; bh=ZlC3nJfK9g MnGJi3SLTOVmm0Pb4=; b=U8SrAJr3PDN3DGk04ja5p/KZoJG+c9uCz59fShoc8Z Spryce6a5R3twl5aMcRfojyw6dVSJEUuhyMWEBAWl3jFoFTjT6O9u8epBR8AoXKU PyxLzLhMBmKU1IfiA8qav7j2Vr8jCNdEET7NqAiKvMA1b3DG6N2PdKqZJDJwoPmz w= Original-Received: (qmail 21888 invoked by alias); 12 Aug 2019 17:11:08 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Original-Sender: libc-alpha-owner@sourceware.org Original-Received: (qmail 21875 invoked by uid 89); 12 Aug 2019 17:11:07 -0000 Authentication-Results: sourceware.org; auth=none X-HELO: brightrain.aerifal.cx Content-Disposition: inline In-Reply-To: Xref: news.gmane.org gmane.comp.lib.glibc.alpha:96674 gmane.linux.lib.musl.general:14552 Archived-At: On Mon, Aug 12, 2019 at 12:29:55PM -0400, Zack Weinberg wrote: > On Sat, Aug 10, 2019 at 1:58 PM Rich Felker wrote: > > As far as I can tell, most time64 work/discussion on the glibc side so > > far has been about implementation mechanisms and symbol-binding ABI > > aspects, and one aspect that doesn't seem to have been addressed is > > making good choices about the actual types involved. Some of them have > > been done: > > > > struct timespec (endian-matching padding) > > struct timeval (64-bit suseconds_t) > > struct itimerspec, itimerval, utimbuf, timeb (obvious definitions) > > > > but I haven't seen a clear proposal with rationale for the choices in > > defining: > > > > struct stat > > struct msqid_ds, semid_ds, shmid_ds (sysvipc) > > struct rusage > > struct timex > > struct utmp[x] > > The sysvipc structures, struct rusage, and struct timex are passed > directly from and to the kernel, so this ship has already sailed -- I > don't see that we have any choice but to match the kernel's layouts of > these structures in time64 mode, whether or not that is convenient for > the C library or for call compatibility between procedures compiled > with _TIME_BITS=32 and _TIME_BITS=64. The kernel does not have a format for the sysvipc structures. It gives you the upper bits of the 64-bit (or, for mips, 48-bit) time_t's packed into various padding members. This *forces* libc to define a type and do some sort of translation; the only question is what it will do. (Strictly speaking, on i386 you can use the kernel high bits in-place, but this only works because i386 (1) i386 is little-endian, and (2) i386 ABI doesn't align 64-bit types to 64-bit boundaries, only to 32-bit boundaries. 32-bit ppc also got lucky and makes it work for a different reason. But the rest fundamentally *can't* work with the kernel types in-place. If you're interested in details I can cite my findings working on this for musl.) Regarding rusage, there is no SYS_getrusage_time64 syscall at present. The only kernel type on the existing 32-bit archs (not including RV32 which will use a 64-bit type I think, but I'm not sure) is the 32-bit one, so some kind of translation is necessary to get a 64-bit struct. Only 3 functions use this structure (and one, wait3, is just a trivial wrapper around wait4), so conversion is no big burden. For timex, it is indeed a kernel interface, and opting to use the 64-bit version (with endian-appropriate padding for the members documented as long but that use kernel int64's) is a very reasonable choice. Note that, due to the padding issue, if the kernel reads and interprets the padding as part of the value, a temp copy of the structure is needed anyway, in which case it's not really any cheaper than just translating the whole thing. So it might make sense to define the libc type without the padding. This is my leaning for musl. > Regarding struct stat, I tend to agree with Paul that having both 32- > and 64-bit time fields _visible_ at the same time will cause more > problems than it solves. What I would suggest is that we add the > 64-bit 'struct timespec's at the end, as you suggest, and continue to > write (possibly truncated) valid data to the legacy 32-bit 'struct > timespec's, but the public header file _immediately_ converts the > legacy fields to __reservedN fields. I'm perfectly fine with this solution, and find it better than exposing the names, even in the __st_* namespace. > This should work except for when > an existing _TIME_BITS=32 binary caller changes the 32-bit fields of a > struct stat, passes it to a _TIME_BITS=64 callee, and expects it to > notice. Is there any practical way for us to find out whether any > such combination actually exists? I don't know a good way right off, but it's known, and inherent, that some things will break under mixing time32 and time64 object files/shared libs in a single program. What you're describing is a weird usage and probably only appears either in LD_PRELOAD stuff that's faking stat results (in which case it has to be aware of all the glibc __xstat stuff and ABI versions, and use of pre-time64 builds of it will not work on time64 binaries at all), or perhaps in archival tools reusing the system's struct stat to move around information about archive contents (as opposed to stat structures built by calling one of the libc functions). > It also occurs to me that this > might mean we can share stat/fstat/lstat for _TIME_BITS=32 and =64. You can't share them entirely because the time64 ones will overflow a time32 caller's buffer. What you can do is make the time32 compat functions thin shims that just do: struct stat st; int r = stat(pathname, &st); if (!r) memcpy(st32, &st, sizeof *st32); return r; In addition, if you want to catch overflows, you could either check that the time64 member values fit in 32 bits, or compare them against the time32 members, and produce EOVERFLOW. > struct utmp[x] describes a format for long-lived files on disk, so not > only do we have to keep the time field where it is, we have to make it > possible for new programs to read and understand old records. And > there is no padding on either side of the ut_tv field, so we would be > stuck ... except that it uses struct timeval, and it's silly to track > login events to sub-second precision. So I'm going to suggest that we > redefine that field as a bare time64_t: > > #define UT_LINESIZE 32 > #define UT_NAMESIZE 32 > #define UT_HOSTSIZE 256 > > struct utmp{,x} > { > uint16_t ut_type; > uint16_t __reserved1; > uint32_t ut_pid; > char ut_line[UT_LINESIZE]; > char ut_id[4]; > char ut_user[UT_NAMESIZE]; > char ut_host[UT_HOSTSIZE]; > struct exit_status { > int16_t e_termination; > int16_t e_exit; > } ut_exit; > #if __32_BIT_UT_SESSION > int32_t ut_session; > #else > int64_t ut_session; > #endif > union { > _Alignas(__old_time_t) uint64_t ut_time64; Per the C standard, _Alignas() does not/cannot produce alignment smaller than the standard required alignment for the type. You'd need a GNUC-specific attribute here instead. > struct { > __old_time_t tv_sec; // historic time_t for this ABI, whatever that was > int32_t tv_usec; > } ut_tv; > }; > #if __PAD_AFTER_UT_TV > int32_t __reserved2; > #endif > int32_t ut_addr_v6[4]; > char __reserved3[20]; > }; > > We could define a new set of ut_type codes to indicate which of ut_tv > and ut_time64 is meaningful. > > Unfortunately no such hack appears to be possible for struct lastlog, > and that has the same problem. Anyone have an idea for that one? Note that struct utmpx is specified by POSIX (XSI option) and required to contain struct timespec ut_tv. So these changes are non-conforming, but the x86_64 definition is already non-conforming (using 32-bit timeval for compat with 32-bit x86 binaries instead of a real timeval), so presumably glibc doesn't care. I'd like to see this fixed but it's a huge pain since the structure is used on-disk. Short of that, any solution glibc adopts is not a significant regression from the current situation in my eyes, so I don't have much of an opinion on this. (FWIW I also hate utmp and consider it at best a creepy stalking tool of 80s/90s era BOFH's, so I'm not the best source of opinions on this stuff.) Rich