From mboxrd@z Thu Jan 1 00:00:00 1970
References: <8246.1690761540@cesium.clock.org> <29602.1690887524@cesium.clock.org> <20230803005106.GA12652@mcvoy.com> <202308031657.373GvVvW008640@ultimate.com> <8e45b8b1-cf3c-e47c-bd15-d19a2ae5cc1d@gmail.org> <5dd50d2c-0049-31bf-8975-08ed899a9387@gmail.org>
From: Dan Cross
Date: Fri, 4 Aug 2023 17:16:56 -0400
To: Alejandro Colomar
CC: tuhs@tuhs.org
Subject: [TUHS] Re: printf (was: python)
List-Id: The Unix Heritage Society mailing list

On Fri, Aug 4, 2023 at 12:57 PM Alejandro Colomar wrote:
> On 2023-08-04 18:06, Dan Cross wrote:
> > On Thu, Aug 3, 2023 at 7:55 PM
Alejandro Colomar wrote:
> >> On 2023-08-03 23:29, Dan Cross wrote:
> >>> On Thu, Aug 3, 2023 at 2:05 PM Alejandro Colomar wrote:
> >>>> - It is type-safe, with the right tools.
> >>>
> >>> No it's not, and it really can't be. True, there are linters that can
> >>> try to match up types _if_ the format string is a constant and all the
> >>> arguments are known at e.g. compile time, but C permits one to
> >>> construct the format string at run time (or just select between a
> >>> bunch of variants); the language gives you no tools to enforce type
> >>> safety in a meaningful way once you do that.
> >>
> >> Isn't a variable format string a security vulnerability? Where do you
> >> need it?
> >
> > It _can_ be a security vulnerability, but it doesn't necessarily
> > _need_ to be. If one is careful in how one constructs it, such things
> > can be very safe indeed.
> >
> > As to where one needs it, there are examples like `vsyslog()`,
>
> I guessed you'd mention v*() formatting functions, as that's the only
> case where a variable format string is indeed necessary (or kind of).

I think you are conflating "necessary" with "possible."

> I'll simplify your example to vwarnx(3), from the BSDs, which does less
> job, but has a similar API regarding our discussion.
>
> I'm not sure if you meant vsyslog() uses or its implementation, but
> I'll cover both (but for vwarnx(3)).
>
> Uses:
>
> This function (and all v*() functions) will be used to implement a
> wrapper variadic function, like for example warnx(3). It's there, in
> the variadic function, where the string /must be/ a literal, and where

No, the format string does not need to be a literal at all: it can be
constructed at runtime. Is that a good idea? Perhaps not. Is it
possible? Yes. Can the compiler type-check it in that case? No, it
cannot (since it hasn't been constructed at compile time).
Consider this program:

: chandra; cat warn.c
#include <err.h>
#include <stdlib.h>
#include <string.h>

int
main(void)
{
        char buf[1024];

        strlcpy(buf, "%s ", sizeof(buf));
        strlcat(buf, "%s ", sizeof(buf));
        strlcat(buf, "%d", sizeof(buf));
        warnx(buf, "Hello", "World", 42);

        return EXIT_SUCCESS;
}
: chandra; cc -Wall -Werror -o warn warn.c
: chandra; ./warn
warn: Hello World 42
: chandra;

That's a perfectly legal C program, even if it is a silly one. "Don't
do that" isn't a statement about the language, it's a statement about
programmer practice, which is the point.

> the arguments are checked. There's never a good reason to use a
> non-literal there (AFAIK),

I believe that you believe that. You may even be right. However,
that's not how the language works.

> and there are compiler warnings and linters
> to enforce that. Since those args have been previously checked, you
> should just pass the va_list pristine to other formatting functions.

I'm afraid that this reasonable advice misses the point: there's
nothing in the language that says you _have_ to do it this way. Some
tools may _help_, but they cannot cover all (reasonable) situations.

Here again `syslog()` is an interesting example, as it supports the
`%m` formatting verb. _An_ implementation of this may work by
interpreting the format string and constructing a new one,
substituting `strerror(errno)` whenever it hits "%m" and then using
`snprintf` (or equivalent) to create the final string that is sent to
`syslogd`. You may argue that programmers should only pass constant
strings (left deliberately vague since there are reasonable cases
where named string constants may be passed as a format string argument
in lieu of a literal) that can be checked by clang and gcc, but again,
nothing in the language _requires_ that, and the implementation of
`vsyslog` that actually implements that logic has no way of knowing
that its caller has done this correctly.
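
To make the `%m` case concrete, here is a minimal sketch of one way such
an implementation could look. It is not how any particular libc does it.
The names `append_byte`, `expand_m`, `my_vlog` and `my_log` are
hypothetical, the 1024-byte scratch buffer is an arbitrary assumption,
and the result is written to a caller-supplied buffer rather than sent
to `syslogd`. Only a bare "%m" is recognized (no flags or widths).

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append one byte, always leaving room for the terminating NUL. */
int
append_byte(char *out, size_t outlen, size_t *n, char c)
{
        if (*n + 1 >= outlen)
                return -1;
        out[(*n)++] = c;
        return 0;
}

/*
 * Hypothetical helper: copy fmt to out, replacing each "%m" with the
 * text of strerror(saved_errno). "%%" is copied through as a unit so
 * that "%%m" is not mistaken for the verb, and any '%' in the error
 * text is doubled. Returns 0 on success, -1 if out is too small.
 */
int
expand_m(char *out, size_t outlen, const char *fmt, int saved_errno)
{
        size_t n = 0;

        assert(out != NULL && fmt != NULL && outlen > 0);
        while (*fmt != '\0') {
                if (fmt[0] == '%' && fmt[1] == 'm') {
                        for (const char *e = strerror(saved_errno); *e != '\0'; e++) {
                                if (*e == '%' && append_byte(out, outlen, &n, '%') != 0)
                                        return -1;
                                if (append_byte(out, outlen, &n, *e) != 0)
                                        return -1;
                        }
                        fmt += 2;
                } else if (fmt[0] == '%' && fmt[1] == '%') {
                        if (append_byte(out, outlen, &n, '%') != 0 ||
                            append_byte(out, outlen, &n, '%') != 0)
                                return -1;
                        fmt += 2;
                } else {
                        if (append_byte(out, outlen, &n, *fmt++) != 0)
                                return -1;
                }
        }
        out[n] = '\0';
        return 0;
}

/* Hypothetical vsyslog-like function that formats into dst. */
int
my_vlog(char *dst, size_t dstlen, const char *fmt, va_list ap)
{
        char newfmt[1024];      /* arbitrary size, an assumption */
        int saved_errno = errno;

        if (expand_m(newfmt, sizeof(newfmt), fmt, saved_errno) != 0)
                return -1;
        /* newfmt was built at run time; nothing can check it against ap. */
        return vsnprintf(dst, dstlen, newfmt, ap);
}

/* Hypothetical variadic wrapper, in the role of syslog(). */
int
my_log(char *dst, size_t dstlen, const char *fmt, ...)
{
        va_list ap;
        int r;

        va_start(ap, fmt);
        r = my_vlog(dst, dstlen, fmt, ap);
        va_end(ap);
        return r;
}
```

Even if every caller of `my_log` passes a literal, the string that
finally reaches `vsnprintf` is one the compiler never saw.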
Similarly, someone may choose to implement a templating language that
converts a custom format to a new format string, but assumes that the
arguments are in a `va_list` or similar. Bad idea? Probably. Legal in
C? Yes.

> Then, as long as libc doesn't have bugs, you're fine.

That's a tall order.

> In the implementation of a v*() function:
>
> Do /not/ touch the va_list. Just pass it to the next function. Of
> course, in the end, libc will have to iterate over it and do the job,
> but that's not the typical programmer's problem. Here's the libbsd
> implementation of vwarnx(3), which does exactly that: no messing with
> the va_list.
>
> $ grepc vwarnx
> ./include/bsd/err.h:63:
> void vwarnx(const char *format, va_list ap)
>         __printflike(1, 0);
>
>
> ./src/err.c:97:
> void
> vwarnx(const char *format, va_list ap)
> {
>         fprintf(stderr, "%s: ", getprogname());
>         if (format)
>                 vfprintf(stderr, format, ap);
>         fprintf(stderr, "\n");
> }
>
>
> Just put a [[gnu::format(printf)]] in the outermost wrapper, which
> should be using a string literal, and you'll be fine.

Setting aside the use of a number of extensions here, that's again
just (sadly) not how the language works.

> > but
> > that's almost besides the point, which is that given that you _can_ do
> > things like that, the language can't really save you by type-checking
> > the arguments to printf; and once varargs are in the mix? Forget about
> > it.
>
> Not really. You can do that _only_ if you really want.

Yes, that's the point: if we're talking about language-level
guarantees, the language can't help you here. It can try, and it can
hit a lot of really useful cases, but not all. By contrast, formatting
in Go and Rust is type-safe by construction.

> If you want to
> not be able, you can "drop privileges" by adding a few flags to your
> compiler, such as -Werror=format-security -Werror=format-nonliteral,
> and add a bunch of linters to your build system for more redundancy,
> and voila, your project is now safe.
Provided that you use a compiler that provides those options, or that
those linters are viable in your codebase. ;-)

        - Dan C.
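
For reference, a minimal sketch of the "select between a bunch of
variants" case mentioned earlier in the thread. The names `pick_format`
and `describe` are hypothetical. The code is legal C, and as far as I
know gcc and clang accept it silently unless something like
-Wformat-nonliteral is enabled; a compiler without such an option has
nothing to say about it at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: choose one of several format variants at run time. */
const char *
pick_format(int verbose)
{
        return verbose ? "%s: %d items" : "%s:%d";
}

/* Hypothetical: format name and count into dst using the chosen variant. */
int
describe(char *dst, size_t dstlen, int verbose, const char *name, int count)
{
        const char *fmt = pick_format(verbose);

        assert(dst != NULL && name != NULL);
        /*
         * fmt is not a string literal at this call site, so the
         * arguments cannot be checked against it at compile time.
         * Both variants happen to expect (char *, int); nothing in
         * the language enforces that.
         */
        return snprintf(dst, dstlen, fmt, name, count);
}
```

If a third variant were added that expected, say, "%d %s", the call
above would still compile, and the mismatch would only show up at run
time.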