From: Alexander Cherepanov
To: musl@lists.openwall.com
Date: Mon, 20 Jan 2020 00:34:31 +0300
Subject: Re: [musl] Minor style patch to exit.c

On 19/01/2020 19.39, Rich Felker wrote:
> On Sun, Jan 19, 2020 at 07:33:08PM +0300, Alexander Cherepanov wrote:
>> On 19/01/2020 17.46, Alexander Monakov wrote:
>>> On Sun, 19 Jan 2020, Alexander Cherepanov wrote:
>>>
>>>> Couldn't _start defined as an array? Then separate values could be accessed
>>>> simply as elements of this array. And casts to integers could be limited to
>>>> calculating the number of elements, the terminating value or something.
>>>
>>> Yeah, I think usually such linker-provided symbols are declared as
>>> extern arrays. I'm surprised that isn't the case in musl.
>>> I don't think
>>> declaring them as arrays helps with making casts pedantically suitable for
>>> calculating number of elements though - as you said, any bijection between
>>> intptr_t and pointers would be a valid implementation of a cast, you're not
>>
>> Well, we want use from C some outside info, there could be no
>> pedantic way to do this. Let's see, we know that the _end array
>> follows the _start array in memory. This means that &_start[i] ==
>> &_end[0] for some i. But different provenance of the pointers means
>> that we cannot do it just like that. Adding a cast should fix this.
>> Summarizing, it should look like this:
>>
>> for (size_t i = 0; (uintptr_t)&_start[i] != (uintptr_t)&_end[0]; i++)
>>
>> or
>>
>> for (type *p = _start; (uintptr_t)p != (uintptr_t)_end; p++)
>
> This works for forward walk, not backwards walk.

Oops, then asm barriers look more attractive.

>>> guaranteed that (intptr_t)&a[i] == (intptr_t)a + i * sizeof *a.
>>
>> While you are inside one object, I think this should be safe in
>> practice. For gcc, this is more or less guaranteed by [3]. BTW there
>> is an explicit restriction there:
>>
>> "When casting from pointer to integer and back again, the resulting
>> pointer must reference the same object as the original pointer,
>> otherwise the behavior is undefined. That is, one may not use
>> integer arithmetic to avoid the undefined behavior of pointer
>> arithmetic as proscribed in C99 and C11 6.5.6/8."
>>
>> [3] https://gcc.gnu.org/onlinedocs/gcc/Arrays-and-pointers-implementation.html
>
> GCC is badly wrong here, and it breaks XOR linked lists and other
> things.

Why is that? Integers (and pointers as it turned out) could have
several provenances as tracked by gcc, so XOR linked lists should be
fine.

> It's also worded imprecisely.

Sure.

> What does it mean if arithmetic
> is performed on the value between the cast and cast back.
> What if two
> pointers go into the arithmetic, but complex mathematical relations
> result in one of the original values coming out, and the compiler can
> only "see" the other pointer going in? Will it then wrongly assume
> that the result points to the same object as the pointer it "saw" go
> in?

I looked into exactly this about 3 weeks ago :-) I rediscovered an old
gcc bug and found that the problem happens even without any casts --
see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49330#c28 .

> This whole provenance thing is a trashfire.

Pluses of the provenance thing: `a[i] = 1;` could be moved over
`b[j] = 2;` when `a` and `b` are different arrays while `i` and `j`
are unknown.

Minuses of the provenance thing: slight inconvenience in cases like
the one with _start & _end.

The pluses seem to outweigh the minuses. Did I miss something
important?

What I recently found definitely wrong is the instability of the
equality `&x + 1 == &y`. This leads to outright nonsense.

-- 
Alexander Cherepanov
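
For reference, a minimal sketch of the forward walk quoted above, with the
loop condition comparing addresses as uintptr_t values rather than as
pointers. The names init_fn, run_forward, hook_a, hook_b and table are
hypothetical stand-ins invented for illustration: a real build would take
the bounds from linker-provided symbols, while here an ordinary array and
its one-past-the-end pointer play that role so the example runs on its own.
It only shows the shape of the loop; it does not address the backwards walk
mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*init_fn)(void);      /* hypothetical element type */

static int calls;                   /* records which hooks ran (demo only) */
static void hook_a(void) { calls += 1; }
static void hook_b(void) { calls += 10; }

/* Stand-in for the table that the linker would normally provide. */
static init_fn table[] = { hook_a, hook_b };

/* Call every entry in [start, end), walking forward. The termination
   test compares integer values of the addresses, as in the loop quoted
   in the thread. Returns the number of entries visited. */
static size_t run_forward(init_fn *start, init_fn *end)
{
    assert(start != NULL && end != NULL);
    size_t n = 0;
    for (init_fn *p = start; (uintptr_t)p != (uintptr_t)end; p++) {
        (*p)();
        n++;
    }
    return n;
}
```

Calling run_forward(table, table + 2) visits both entries in order. With
real linker symbols the two bounds would be separate declarations, which is
exactly the case where the integer comparison is meant to help.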