From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: string-backed FILEs mess
Date: Wed, 12 Sep 2018 14:03:10 -0400
Message-ID: <20180912180310.GA1878@brightrain.aerifal.cx>
References: <20180912140239.GV1878@brightrain.aerifal.cx> <20180912150941.GB13976@voyager> <20180912154306.GW1878@brightrain.aerifal.cx> <20180912174112.GC13976@voyager>
In-Reply-To: <20180912174112.GC13976@voyager>
To: musl@lists.openwall.com

On Wed, Sep 12, 2018 at 07:41:12PM +0200, Markus Wichmann wrote:
> On Wed, Sep 12, 2018 at 11:43:06AM -0400, Rich Felker wrote:
> > On Wed, Sep 12, 2018 at 05:09:41PM +0200, Markus Wichmann wrote:
> > > Well, first of all, I might set my foot wrong here very badly, but I
> > > generally don't care about C standard UB as long as the behavior is
> > > defined elsewhere.
> >
> > Like where? In order for it to be defined, the *compiler* has to
> > define it, since otherwise it can make transformations that assume the
> > behavior is undefined. So what you're asking for here is basically
> > amounting to only supporting certain compilers (with certain flags),
> > and notably *not supporting* UBSan, which is a really valuable tool
> > for catching bugs.
>
> Oh, I didn't think of that. But the compiler still has to follow the
> ABI, and the ABI says we have linear addresses.

The ABI defines an interface boundary. These transformations do not
take place at or across boundaries but in contexts where no boundary
is present and ABI does not apply. The possibility of them is the only
reason a tool like UBSan or _FORTIFY_SOURCE can work; otherwise what
these tools do would be invalid, arbitrarily breaking well-defined
code because they think it's bad style rather than justifiedly
changing the behavior of cases where the behavior is not defined.

> So the pointer to
> integer mapping still has to work, and (void*)-1 is defined in the SysV
> ABI. Wouldn't make much sense for DOS, but hey, that's not a supported
> platform. (Actually that's a bad example, because it would totally make
> sense as the far pointer to FFFF:FFFF, but you get my point.)

Creating a pointer like (void*)-1 is implementation- (platform ABI-)
defined, but that doesn't mean you can perform arbitrary operations on
it and have the results be meaningful or even defined. In particular
the -,<,<=,>,>= operators are only defined for a pair of pointers into
the same array. Since (void*)-1 is not a pointer into any array
object, comparing or subtracting it is not defined.
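
To make the same-array rule concrete, here is a minimal sketch of one
way to bound writes into a buffer without ever comparing against a
sentinel end pointer. It tracks a remaining byte count instead. The
names struct span and span_write are hypothetical and chosen only for
illustration; this is not the code in musl and not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type for illustration only; not taken from musl. */
struct span {
    char *pos;    /* next byte to write; stays inside the buffer or
                     exactly one past its end */
    size_t left;  /* bytes still available starting at pos */
};

/* Copy at most n bytes from src into the span. The bound is enforced by
 * comparing two size_t values, so no pointer outside the buffer is ever
 * formed, compared, or subtracted. Returns the number of bytes stored. */
static size_t span_write(struct span *s, const char *src, size_t n)
{
    assert(s && s->pos && src);
    size_t k = n < s->left ? n : s->left;  /* clamp to remaining space */
    memcpy(s->pos, src, k);
    s->pos += k;                           /* at most one past the end */
    s->left -= k;
    return k;
}
```

A caller would initialize it as struct span s = { buf, sizeof buf };
and call span_write(&s, data, len) repeatedly. Once left reaches zero,
further calls store nothing and return 0.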
> Besides, you're opening a very scary door there: The C standard's
> chapter 7 contains a whole lot of UB in the library, and a compiler
> writer could now say: Since it is undefined, obviously it is never going
> to happen (and if it does, it is your own fault), so I can write the
> optimizer to assume all arguments to functions are such that UB does not
> occur. The standard says fflush() is only defined for output streams, so
> we're going to assume any stream passed into fflush() is an output
> stream and... I don't know, assume all input functions are going to fail
> until the next fseek()? Actually, I'm drawing a blank as to what they
> could do with this, but the GCC folks would find a way to mess with my
> code.

That's exactly why we have to use -ffreestanding, which says we want
to use the compiler as a freestanding C implementation that does not
include the standard library functions and corresponding assumptions
about their behavior. Without that, for example, the code in calloc
that only zero-fills the buffer produced by malloc if it's not already
zero will be optimized out, since the inspection of uninitialized
memory from malloc is undefined. Likewise the implementation of memcpy
could be optimized to a call to itself (infinite recursion).

> As for UBSan: Can't these sanitizers get their fingers out of the system
> implementation?

If you use UBSan for an application, it's of course not going to do
anything to code in libc. However you can also use it when building
libc, in order to find dangerous bugs. See this recent GCC bug report
where, if not for a bug in UBSan, it would have caught a serious,
dangerous regression in musl:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87191

(Thankfully it was caught manually before release.)

Rich