mailing list of musl libc
* [musl] changes for scanf in C23
@ 2023-05-29 10:32 Jₑₙₛ Gustedt
  2023-05-29 15:59 ` Rich Felker
  0 siblings, 1 reply; 4+ messages in thread
From: Jₑₙₛ Gustedt @ 2023-05-29 10:32 UTC (permalink / raw)
  To: musl

Hi,
we already discussed this but it doesn't seem that we have come to a
conclusion.

The problem is that in C23 the semantics of several string-to-integer
conversion functions change: a 'b' or 'B' that previously was the stop
condition for integer parsing may become part of the integer
string. This concerns all `scanf` and `strto` derivatives.
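
For illustration, a minimal C sketch of the affected case. The helper
`probe_base0` is hypothetical and not musl code; it calls `strtol` with
base `0` and reports how many characters were consumed. For the input
"0b101", a pre-C23 library is expected to return 0 and consume only the
leading "0", while a C23 library is expected to return 5 and consume all
five characters. Which of the two you observe depends on the libc in use.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not part of musl): parse `s` with base 0 and
 * report how many characters strtol consumed. */
static long probe_base0(const char *s, size_t *consumed)
{
    char *end;
    long v = strtol(s, &end, 0);   /* base 0: prefix selects the radix */
    *consumed = (size_t)(end - s); /* where parsing stopped */
    return v;
}
```

Decimal, octal and hex inputs such as "42" or "0x1f" should behave the
same under both versions; only inputs with a `0b`/`0B` prefix differ.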

This is probably not a problem for most applications that parse
strings to integers, but it could be in some situations, and in
particular it could open vulnerabilities. E.g., network addresses that
are read with base `0` (musl does this at some point to allow
decimal or hex strings) could be open to attacks once people start
using binary encodings for integers more often. Another scenario where
this could lead to harm is automatically produced output that is
automatically scanned, and where nobody previously took care of proper
word boundaries.

My current idea is to have two sets of these functions, one that has
the old semantics and one that has the new.

 - Newly compiled objects that don't do fancy stuff (such as
   `(scanf)(...)` or `#undef scanf`) would see hard-coded linker
   symbols such as `scanf-c17` or `scanf-c23` according to the
   version of the standard they are compiled against. When linking
   statically, this would just choose that one particular set of
   functions. The dynamic library would always have both versions, to
   accommodate objects that have been compiled against either version
   of the standard.

 - Old compiled objects and executables, as well as those where users
   choose to `#undef` or use their own headers/prototypes, would receive
   a default (something like: starting with version X, musl uses C23
   semantics), which could be overridden under the responsibility
   of the provider of the compiled musl library.
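
A rough sketch of what such a header-level redirection might look like,
purely for illustration. All names (`demo_strtol`, `demo_strtol_c17`,
`demo_strtol_c23`, `find_binary_prefix`) are hypothetical, underscores
replace the hyphens of the symbol names above because hyphens are not
valid in C identifiers, and the two variants are written by hand so the
sketch does not depend on which semantics the underlying `strtol` has.

```c
#include <ctype.h>
#include <stdlib.h>

/* If s (after optional whitespace and sign) starts with 0b/0B followed
 * by a binary digit, return a pointer to the 'b'; otherwise NULL. */
static const char *find_binary_prefix(const char *s, int *negative)
{
    while (isspace((unsigned char)*s)) s++;
    *negative = (*s == '-');
    if (*s == '+' || *s == '-') s++;
    if (s[0] == '0' && (s[1] == 'b' || s[1] == 'B') &&
        (s[2] == '0' || s[2] == '1'))
        return s + 1;
    return NULL;
}

/* Old semantics: 'b' is a stop condition, so only the "0" is consumed. */
long demo_strtol_c17(const char *s, char **end)
{
    int negative;
    const char *b = find_binary_prefix(s, &negative);
    if (!b) return strtol(s, end, 0);
    if (end) *end = (char *)b;
    return 0;
}

/* New semantics: prefix and binary digits are part of the number.
 * Overflow handling is omitted to keep the sketch short. */
long demo_strtol_c23(const char *s, char **end)
{
    int negative;
    const char *b = find_binary_prefix(s, &negative);
    if (!b) return strtol(s, end, 0);
    long v = 0;
    const char *p = b + 1;
    while (*p == '0' || *p == '1') v = v * 2 + (*p++ - '0');
    if (end) *end = (char *)p;
    return negative ? -v : v;
}

/* What a header might do: select the symbol by language version.
 * Writing (demo_strtol)(...) or `#undef demo_strtol` bypasses this
 * macro; such callers would get the library default, which this
 * sketch does not define. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#define demo_strtol(s, end) demo_strtol_c23((s), (end))
#else
#define demo_strtol(s, end) demo_strtol_c17((s), (end))
#endif
```

With this arrangement, the same source line `demo_strtol("0b101", &end)`
would bind to a different symbol depending on the `-std=` option used to
compile it.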

Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

* Re: [musl] changes for scanf in C23
From: Rich Felker @ 2023-05-29 15:59 UTC (permalink / raw)
  To: Jₑₙₛ Gustedt; +Cc: musl

On Mon, May 29, 2023 at 12:32:02PM +0200, Jₑₙₛ Gustedt wrote:
> Hi,
> we already discussed this but it doesn't seem that we have come to a
> conclusion.
> 
> The problem is that in C23 the semantics of several string-to-integer
> conversion functions change: a 'b' or 'B' that previously was the stop
> condition for integer parsing may become part of the integer
> string. This concerns all `scanf` and `strto` derivatives.
> 
> This is probably not a problem for most applications that parse
> strings to integers, but it could be in some situations, and in
> particular it could open vulnerabilities. E.g., network addresses that
> are read with base `0` (musl does this at some point to allow
> decimal or hex strings) could be open to attacks once people start
> using binary encodings for integers more often. Another scenario where
> this could lead to harm is automatically produced output that is
> automatically scanned, and where nobody previously took care of proper
> word boundaries.
> 
> My current idea is to have two sets of these functions, one that has
> the old semantics and one that has the new.

This was rejected already in the first proposal (thread here):

Message-ID: <20230503000045.GU4163@brightrain.aerifal.cx>
https://www.openwall.com/lists/musl/2023/05/03/1

    "There are not going to be different versions of scanf/strto*
    because there's just no way to do that in a conforming way..."

There are other reasons for this too that basically amount to not
repeating glibc mistakes.

At some point I proposed a way that we could do C-version-specific
behavior via branching on an extern defined by linking in c23+ mode,
if this is really necessary. This probably needs more thought to flesh
out a design that's robust and has the right properties and make sure
we don't do anything that locks us into future trouble.

However, as I've said before, C users have survived multiple repeated
incompatible changes of this form, including the same thing happening
with hex floats. Moreover, strto* already are permitted to accept
arbitrary additional implementation-defined sequences except in the C
locale, so there's only any change at all in the C locale. My leaning,
if the committee is going to make these kinds of incompatible changes,
is to say that applications just have to be prepared to deal with them
and do any additional validation they deem necessary to their usage
cases.
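
As one possible shape for such application-side validation, here is a
minimal sketch. The helper `parse_ulong_strict` is hypothetical, and the
policy it encodes (no leading whitespace or sign, whole string must be
consumed, `0b`/`0B` prefixes rejected) is an assumption chosen for the
example, not something musl does. It is meant to behave the same whether
the underlying `strtoul` has C17 or C23 semantics.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: accept only a complete decimal, octal or hex
 * token and reject "0b..."/"0B..." regardless of libc behavior. */
static bool parse_ulong_strict(const char *s, unsigned long *out)
{
    char *end;
    unsigned long v;

    /* Require a digit first: no whitespace, no sign, no empty string. */
    if (!isdigit((unsigned char)s[0]))
        return false;

    /* Refuse a binary prefix explicitly, before strtoul sees it. */
    if (s[0] == '0' && (s[1] == 'b' || s[1] == 'B'))
        return false;

    errno = 0;
    v = strtoul(s, &end, 0);

    /* Reject trailing garbage and out-of-range values. */
    if (end == s || *end != '\0' || errno == ERANGE)
        return false;

    *out = v;
    return true;
}
```

Checking that the whole token was consumed already rejects "0b101" on a
pre-C23 library; the explicit prefix check keeps the result the same on
a C23 library.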

Rich

* Re: [musl] changes for scanf in C23
From: Jₑₙₛ Gustedt @ 2023-05-29 19:10 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Rich,

on Mon, 29 May 2023 11:59:29 -0400 you (Rich Felker <dalias@libc.org>)
wrote:

> On Mon, May 29, 2023 at 12:32:02PM +0200, Jₑₙₛ Gustedt wrote:
> > Hi,
> > we already discussed this but it doesn't seem that we have come to a
> > conclusion.
> > 
> > The problem is that in C23 the semantics of several string-to-integer
> > conversion functions change: a 'b' or 'B' that previously was the
> > stop condition for integer parsing may become part of the integer
> > string. This concerns all `scanf` and `strto` derivatives.
> > 
> > This is probably not a problem for most applications that parse
> > strings to integers, but it could be in some situations, and in
> > particular it could open vulnerabilities. E.g., network addresses that
> > are read with base `0` (musl does this at some point to allow
> > decimal or hex strings) could be open to attacks once people
> > start using binary encodings for integers more often. Another
> > scenario where this could lead to harm is automatically produced
> > output that is automatically scanned, and where nobody previously
> > took care of proper word boundaries.
> > 
> > My current idea is to have two sets of these functions, one that has
> > the old semantics and one that has the new.  
> 
> This was rejected already in the first proposal (thread here):
> 
> Message-ID: <20230503000045.GU4163@brightrain.aerifal.cx>
> https://www.openwall.com/lists/musl/2023/05/03/1
> 
>     "There are not going to be different versions of scanf/strto*
>     because there's just no way to do that in a conforming way..."

Alright, saves me a lot of trouble. I'll forward all complaints by
users to you ;-)

Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

* Re: [musl] changes for scanf in C23
From: Fangrui Song @ 2023-08-26 20:58 UTC (permalink / raw)
  To: musl; +Cc: Rich Felker, Jₑₙₛ Gustedt

On Mon, May 29, 2023 at 12:11 PM Jₑₙₛ Gustedt <jens.gustedt@inria.fr> wrote:
>
> Rich,
>
> on Mon, 29 May 2023 11:59:29 -0400 you (Rich Felker <dalias@libc.org>)
> wrote:
>
> > On Mon, May 29, 2023 at 12:32:02PM +0200, Jₑₙₛ Gustedt wrote:
> > > Hi,
> > > we already discussed this but it doesn't seem that we have come to a
> > > conclusion.
> > >
> > > The problem is that in C23 the semantics of several string-to-integer
> > > conversion functions change: a 'b' or 'B' that previously was the
> > > stop condition for integer parsing may become part of the integer
> > > string. This concerns all `scanf` and `strto` derivatives.
> > >
> > > This is probably not a problem for most applications that parse
> > > strings to integers, but it could be in some situations, and in
> > > particular it could open vulnerabilities. E.g., network addresses that
> > > are read with base `0` (musl does this at some point to allow
> > > decimal or hex strings) could be open to attacks once people
> > > start using binary encodings for integers more often. Another
> > > scenario where this could lead to harm is automatically produced
> > > output that is automatically scanned, and where nobody previously
> > > took care of proper word boundaries.
> > >
> > > My current idea is to have two sets of these functions, one that has
> > > the old semantics and one that has the new.
> >
> > This was rejected already in the first proposal (thread here):
> >
> > Message-ID: <20230503000045.GU4163@brightrain.aerifal.cx>
> > https://www.openwall.com/lists/musl/2023/05/03/1
> >
> >     "There are not going to be different versions of scanf/strto*
> >     because there's just no way to do that in a conforming way..."
>
> Alright, saves me a lot of trouble. I'll forward all complaints by
> users to you ;-)
>
> Jₑₙₛ

Nice! Not introducing new symbols saved me time on
https://reviews.llvm.org/D158943 (a sanitizer patch to intercept the
__isoc23_strtol and __isoc23_scanf families of functions, which glibc
2.38 introduced for binary compatibility).
