From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/12506 Path: news.gmane.org!.POSTED!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: Re: stdio review Date: Tue, 13 Feb 2018 23:33:04 -0500 Message-ID: <20180214043304.GP1627@brightrain.aerifal.cx> References: <20180213165953.GN1627@brightrain.aerifal.cx> <20180213192131.GO1627@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Trace: blaine.gmane.org 1518582697 1236 195.159.176.226 (14 Feb 2018 04:31:37 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Wed, 14 Feb 2018 04:31:37 +0000 (UTC) User-Agent: Mutt/1.5.21 (2010-09-15) To: musl@lists.openwall.com Original-X-From: musl-return-12523-gllmg-musl=m.gmane.org@lists.openwall.com Wed Feb 14 05:31:33 2018 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.84_2) (envelope-from ) id 1eloiv-0007Pe-UX for gllmg-musl@m.gmane.org; Wed, 14 Feb 2018 05:31:14 +0100 Original-Received: (qmail 5332 invoked by uid 550); 14 Feb 2018 04:33:17 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Original-Received: (qmail 5307 invoked from network); 14 Feb 2018 04:33:16 -0000 Content-Disposition: inline In-Reply-To: Original-Sender: Rich Felker Xref: news.gmane.org gmane.linux.lib.musl.general:12506 Archived-At: On Tue, Feb 13, 2018 at 10:49:04PM -0500, Dale Weiler wrote: > >> fgetpos.c: > >> fgetpos: [bug] > >> using *(off_t*) to write _Int64 data into fpos_t structure when > >> memcpy should be used, this breaks strict aliasing. Maybe add > >> an off_t to the fpos_t union? > > > My leaning would be to add the off_t, but the type might not be > > exposed and thus we would need to find a matching type that is > > exposed. memcpy would be the nicest solution, but only if we had a way > > of allowing the compiler to use builtin memcpy; otherwise it's a > > gratuitous call. > > Seeing as the type _is_ exposed, adding the off_t to the union is likewise It's not exposed in plain C profile, only with POSIX or higher feature test macros. We could just use long long though since the actual type doesn't need to match; it only needs to be an integer type capable of holding the range of off_t. > > Compound literal table to reference whence as a lookup table > > as a single expression. > > > I thought this was cute. > > It's definitely cute, but it does depend on the seek argument being > one of the macro definitions in the [0, 2] range which I had to check, > obviously those have no reason to ever be anything but those values; > ABI and all but it's just additional mental load to ensure they weren't > hence why I brought it up. The check right above demonstrates the assumption of those definitions. I suppose we could use designated initializers to make it so the order is clear without changing the generated code... > >> fwrite.c: > >> fwrite: [question] > >> Should there be a check for size*nmemb overflow? > > > This is actually a complicated topic. Formally, I think the C standard > > reads such that it would be valid for size*nmemb to exceed the size of > > the data object to be written if you somehow know you'll hit a write > > error before that happens. However real world implementations don't > > work like that. In particular, the kernel will error out with EFAULT > > if the buffer length extends past the valid userspace address range, > > even if the writes would never happen; the only way to avoid this > > would be to break longer-than-page writes down into separate > > page-sized writes. So I think for practical purposes, we have to > > interpret the standard as requiring that size*nmemb actually reflect > > the size of the object passed in, and in that case, the multiplication > > necessarily does not overflow. If there's an interpretation from WG14 > > contrary to this, we'll have to revisit it. > > > See also https://sourceware.org/bugzilla/show_bug.cgi?id=19165 > > That is an interesting and somewhat odd edge case. Maybe for the time > being a comment within here w.r.t it maybe needing to be revisited > wouldn't hurt. In either case it doesn't appear to be harming anything. Yes, a short comment here (and in fread.c) might be nice. > >> gets.c: > >> gets: [optimize] > >> The length of the string is calculated twice to strip the > >> newline character off it. Why not rewrite it as: > >> if (ret) { size_t i = strlen(s)-1; if (s[i] == '\n') s[i] = 0; } > > > Seriously, this is gets. It's always unsafe, deprecated, removed from > > the current C standard. If it's gratuitously slow too, great. :-) > > Yes it's gets, but fixing it for O(n) instead of O(n*2) does make the musl > static set slightly smaller, also makes programs using it crash twice as > fast ;-) :-) > >> stdio_impl.h: [style] > >> FUNLOCK macro has a trailing else which prompted me to look at every > >> single instance of FUNLOCK to make sure all of them contained a > >> semicolon. This is just dangerous, why not use the more common > >> idiom of do { } while (0). > > > Indeed that should be fine. > > I think it's better understood by most folks as well, glad we're on the same > page w.r.t this one. At least then you cannot fail to forget the semicolon. Yes, exactly. > >> intscan.h: [style] > >> It isn't apparent for why needs to be included. Should > >> just forward declare struct FILE; here instead. > > > That would not work, because it's *not* struct FILE, it's FILE, which > > happens to be defined as "struct _IO_FILE", but that's an > > implementation detail. Including is the clean way to have > > that. > > I don't understand why you couldn't replicate that behavior. It's what > stdio.h already _does_ and seeing as the associated translation unit > already includes stdio.h it seems gratuitously excessive. It's just an > opaque pointer type being passed, how is a forward declaration > incorrect. Does C distinguish between "opaque T" and "opaque T" > with different underlying struct? If so I have many of code that needs > to be changed on my end. See 6.2.7 Compatible type and composite type ΒΆ1: "Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ both are completed anywhere within their respective translation units, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types; if one member of the pair is declared with an alignment specifier, the other is declared with an equivalent alignment specifier; and if one member of the pair is declared with a name, the other is declared with the same name. For two structures, corresponding members shall be declared in the same order. For two structures or unions, corresponding bit-fields shall have the same widths. For two enumerations, corresponding members shall have the same values." Without fancy linker-level checks for UB or LTO, you might say the UB from declarations with incompatible types is purely theoretical, but the bigger issue is that, when you did include both intscan.h with "struct FILE *" and stdio_impl.h in the implementation file intscan.c, the conflicting definitions would be visible and you'd get an error. Yes of course we could declare it explicitly using "struct _IO_FILE *" in intscan.h, but that would violate DRY and would break if the tag were ever changed (e.g. in a future ABI not using glibc compat cruft). Including a header that defines FILE really is the right way to do this. Rich