From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 23368 invoked from network); 3 May 2023 15:11:28 -0000 Received: from second.openwall.net (193.110.157.125) by inbox.vuxu.org with ESMTPUTF8; 3 May 2023 15:11:28 -0000 Received: (qmail 28453 invoked by uid 550); 3 May 2023 15:11:25 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 28418 invoked from network); 3 May 2023 15:11:24 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=inria.fr; s=dc; h=date:from:to:cc:subject:message-id:in-reply-to: references:mime-version; bh=aybV9VjQ0uweCdHNT4aoz60axvcm+RgrRglW0xRzlH0=; b=Ox1S0RSZHAvlrdP/pBs4oN5J0qrFY0l2D3MmvuVAQssZvoZyVBS+bQLn jjQS1OaJW/SwBn5gqTsR6iro+Ywlg/pyxjyXTaXE8BB+L2pE1cHqxZWZn Jh/7AZCwXzrZKoPcWrgnHRWZsrfN3D5nb7I8/IUQqUrAli88GEJfmAu8g Y=; Authentication-Results: mail2-relais-roc.national.inria.fr; dkim=none (message not signed) header.i=none; spf=SoftFail smtp.mailfrom=jens.gustedt@inria.fr; dmarc=fail (p=none dis=none) d=inria.fr X-IronPort-AV: E=Sophos;i="5.99,247,1677538800"; d="scan'208";a="106079861" Date: Wed, 3 May 2023 17:11:11 +0200 From: =?UTF-8?B?SuKCkeKCmeKCmw==?= Gustedt To: Rich Felker Cc: musl@lists.openwall.com Message-ID: <20230503171111.15092dbb@inria.fr> In-Reply-To: <20230503141619.GW4163@brightrain.aerifal.cx> References: <20230501205037.29e42745@inria.fr> <20230501194121.GS4163@brightrain.aerifal.cx> <20230502085740.23ff20d5@inria.fr> <20230502155903.30bee099@inria.fr> <20230502232009.GT4163@brightrain.aerifal.cx> <20230503000045.GU4163@brightrain.aerifal.cx> <20230503111246.00ba409e@inria.fr> <20230503141619.GW4163@brightrain.aerifal.cx> Organization: inria.fr X-Mailer: Claws Mail 4.0.0 (GTK+ 3.24.33; x86_64-pc-linux-gnu) X-Face: iVBORw0KGgoAAAANSUhEUgAAADAAAAAwBAMAAAClLOS0AAAAAXNSR0IArs4c6QAAACRQTFRFERslNjAsLTE9Ok9wUk9TaUs8iWhSrYZkj42Rz6aD3sGZ MIME-Version: 1.0 Content-Type: multipart/signed; boundary="Sig_/wtlshy4PpvpcFj7JkD/UeJ."; protocol="application/pgp-signature"; micalg=pgp-sha1 Subject: Re: [musl] patches for C23 --Sig_/wtlshy4PpvpcFj7JkD/UeJ. Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Rich, on Wed, 3 May 2023 10:16:19 -0400 you (Rich Felker ) wrote: > On Wed, May 03, 2023 at 11:12:46AM +0200, J=E2=82=91=E2=82=99=E2=82=9B Gu= stedt wrote: > > > Language/compiler baseline for building musl is not going to go > > > up, so this complicates some things, especially implementing the > > > int128 stuff. This will need pop_arg to call out to an > > > arch-provided asm function that bypasses the C type system to get > > > the nonexistent-type argument off the va_list and store it in a > > > pair of uint64_t. =20 > >=20 > > I don't see that. `pop_arg` just uses `va_arg` and that in turn is > > fixed to `__builtin_va_arg`. The proposed patches assume that if > > `__SIZEOF_INT128__` is defined by the compiler that then the > > compiler provides the `__int128` types and knows how to deal with > > them in `__builtin_va_arg`. Is there anything wrong with that > > assumtion? =20 >=20 > Yes. We don't require a compiler that has an __int128. sure, but all the uses are protected by `__SIZEOF_INT128__`. So if the compiler don't has this, they will not see that code when compiling musl. Also application side compilation with a different compiler that has no `__int128` would not see these interfaces, so such application code can never call into the library with `__int128` types. > The feature set of the library is not allowed to vary depending on > the compiler version it was built with. I am not sure that I can follow. The feature set will vary, modelled with feature macros as provided by the compiler. That is what these things are made for, no? Basically that would mean that we can never extend ABIs. This will for sure occur in the future for `_BitInt` types where both gcc and clang have the intention to augment `__BITINT_MAXWIDTH__` from currently 128 to much bigger numbers. So since we have to provide `BITINT_MAXWIDTH` we would not be able to adapt because that would change the feature set. > The only non-UB way to get an __int128 out of a va_list if the > compiler has no idea there's such a thing as __int128 is to write > asm that bypasses the C type system. But nobody will want to do that because that would be completely useless. If the compiler does not implement these types, there is no point or even possibilty of feeding them in or out of the library. (If the library was compiled with int128 support and the actual compiler doesn't provide it, that is just a bit of dead code in the library.) (And relying on `__builtin_va_arg` basically means that, yes, the feature set changes with the compiler that is used to compile the library.) In the other direction, when user code has `__int128` and the library hasn't, for the use of functions that have 128 bit types in their interfaces would fail on linkage. The use of `va_arg` functions in the library (which are `printf` and `scanf` functions) would just have the calls fail, because the library wouldn't know how to handle `w128` format specifiers. If we would like to have the latter fail at compilation or link time, there would certainly be a way to do that with some artificial symbol `__needs_128_bit_types` or so. (Aplication `va_arg` functions are safe because they would use the updated `__builtin_va_arg`.) > There can be a "generic" version of this TU, I guess, for archs > where __int128 has always been part of the arch ABI definition, that > just uses a C function calling va_arg; this would also be suitable > for folks reusing the code in places like wasm where an asm > implementation isn't suitable and where they have more control over > the tooling. > > > As above, strict conformance to outdated versions of the standard > > > is just not a priority. musl's claim/target is conformance to > > > current versions only and sometimes, on a case-by-case basis, > > > partial support for older ones. =20 > >=20 > > Yes. But this here is really something to consider. Legacy > > executables that are linked dynamically may behave semantically > > different with this patch. This might even have security > > implications. E.g within musl itself in inet_aton.c there is a use > > with a base of `0` that could perhaps be abused to do spoofy > > things. =20 >=20 > One thing that could be done here, but I'm not sure if it's useful or > appropriate, is linking an object file defining a symbol named > something like __c23_profile with value 1 into the application or > shared library built in c23 mode. This would override (via > interposition) a definition with the value zero internal to libc, and > could be used to switch on incompatible features like this. I'm > skeptical whether this kind of thing is something we should do or want > to do, but it's at least something we could consider... >=20 > It seems unfortunate that the committee did not consider this > adequately. I tried my best, but this was basically brushed over without much arguments, and with a quite pittyful attitude. > It would have made a lot more sense to leave the behavior > of base=3D=3D0 alone and add new behaviors with base=3D-1 or something. B= ut > FWIW the same kind of incompatible change already happened with > floating point in the past (strtod/scanf %e/f/g consuming hex floats > rather than reading "0x..." as 0) and the world didn't explode. The integer functions are much more widely used, in particular in security critical situations such as translating IP or memory addresses. If someone happens to place 0b numbers in a textual IP address for example and the application parses them with base 0, c23-compiled libraries would get the right address and c17-compiled forgotten legacy platforms would see a default IP address with all 0. And as musl itself shows, it is tempting to have base 0, here, because currently some people use decimal and some may use hexadecimal numbers. J=E2=82=91=E2=82=99=E2=82=9B --=20 :: ICube :::::::::::::::::::::::::::::: deputy director :: :: Universit=C3=A9 de Strasbourg :::::::::::::::::::::: ICPS :: :: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus :: :: :::::::::::::::::::::::::::::::::::: =E2=98=8E +33 368854536 :: :: https://icube-icps.unistra.fr/index.php/Jens_Gustedt :: --Sig_/wtlshy4PpvpcFj7JkD/UeJ. Content-Type: application/pgp-signature Content-Description: OpenPGP digital signature -----BEGIN PGP SIGNATURE----- iF0EARECAB0WIQSN9stI2OFN1pLljN0P0+hp2tU34gUCZFJ5kAAKCRAP0+hp2tU3 4i7+AJ4sXwQgqo9yANUiszMqEtKvjXaFwQCeMq5O8vicml4clIME76/681m7wAY= =2Uol -----END PGP SIGNATURE----- --Sig_/wtlshy4PpvpcFj7JkD/UeJ.--