Date: Sun, 16 Jul 2023 19:59:57 +0200
From: Robert Clausecker
To: musl@lists.openwall.com
Subject: Re: [musl] strcmp() guarantees and assumptions
In-Reply-To: <20230716174945.qc6234b654k5eebx@gen2.localdomain>
References: <20230716174945.qc6234b654k5eebx@gen2.localdomain>

Hi NRK,

Thank you for your response.

On Sun, Jul 16, 2023 at 11:49:45PM +0600, NRK wrote:
> Hi Robert,
>
> > Or to phrase it differently, is the following a legal implementation of
> > strcmp()?
> >
> > int strcmp(char *a, char *b) {
> > 	size_t la = strlen(a), lb = strlen(b);
> >
> > 	if (la != lb)
> > 		return ((la > lb) - (lb > la));
>                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> I don't see how this can ever be a valid strcmp implementation. The
> return value of the comparison functions must be about the first
> mismatching byte, not about the string lengths.
>
> | The sign of a nonzero value returned by the comparison functions is
> | determined by the sign of the difference between the values of the
> | first pair of characters that differ in the objects being compared.
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Yes, sorry.  The code would have to be extended to call memcmp() on the
common prefix in case there is a mismatch in length.  E.g.

	if (la != lb)
		return (memcmp(a, b, la > lb ? lb + 1 : la + 1));

> ref: https://port70.net/~nsz/c/c11/n1570.html#7.24.4p1
>
> > Or is it generally agreed upon that libc implementations support
> > strcmp() calls on unterminated strings?
>
> memchr (since C11) has the following requirement:
>
> | The implementation shall behave as if it reads the characters
> | sequentially and stops as soon as a matching character is found.
>
> I don't believe any such requirement exists for strcmp, so unless
> someone proves otherwise, I'd say it's fair game for libc to assume that
> the strings are nul-terminated.

That's good to hear.  Any idea on the “what do existing libc
implementations permit” bit?

Yours,
Robert Clausecker

-- 
()  ascii ribbon campaign - for an 8-bit clean world
/\  - against html email  - against proprietary attachments
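
For reference, the extended variant described above can be written out in
full as a minimal sketch.  The name len_first_strcmp is hypothetical (chosen
so that it does not clash with libc's strcmp), and the const-qualified
parameters follow the standard prototype rather than the quoted snippet.
Since memcmp() compares bytes as unsigned char and the compared range
includes the shorter string's terminator, a single call should cover both
the equal-length and the unequal-length case.

```c
#include <stddef.h>
#include <string.h>

/*
 * Sketch of a strcmp() that measures both strings first.
 * len_first_strcmp is a hypothetical name, not part of any libc.
 */
int len_first_strcmp(const char *a, const char *b)
{
	/* Both calls scan to the terminating NUL before any comparison. */
	size_t la = strlen(a), lb = strlen(b);

	/* Compare the common prefix plus the shorter string's NUL byte. */
	size_t n = (la < lb ? la : lb) + 1;

	/* The sign comes from the first differing pair of bytes. */
	return memcmp(a, b, n);
}
```

Note that both strlen() calls run to the terminating NUL before any byte is
compared, so this variant depends on both arguments being NUL-terminated
even when they already differ in the first byte.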