Date: Sun, 16 Jul 2023 19:22:38 +0200
From: Robert Clausecker
To: musl@lists.openwall.com
Cc: mjg@freebsd.org
Subject: [musl] strcmp() guarantees and assumptions

Greetings,

I am currently developing SIMD-enhanced implementations of libc
functions for the FreeBSD libc. One of the next functions I want to
tackle is strcmp(). There, the following question arises:

Is strcmp() permitted to assume that its arguments are NUL-terminated
strings? Or, to phrase it differently, is the following a legal
implementation of strcmp()?

    int
    strcmp(const char *a, const char *b)
    {
        size_t la = strlen(a), lb = strlen(b);

        if (la != lb)
            return ((la > lb) - (lb > la));

        return memcmp(a, b, la);
    }

A situation I dimly recall where this assumption did not hold was in a
program that used strcmp() to compare two buffers known to have a
mismatch somewhere, but without guaranteed NUL termination. A naïve
byte-by-byte strcmp() implementation processed this just fine, because
it stops at the first mismatch, but the implementation above might
crash, because strlen() keeps scanning for a NUL past the end of the
buffers.
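To make the scenario above concrete, here is a minimal sketch of the two strategies. The names naive_strcmp_counted() and strlen_strcmp() are hypothetical and do not come from any libc. The counting parameter exists only to show how far the byte-wise loop reads. As a choice made for this sketch, the strlen()-based variant compares min(la, lb) + 1 bytes so that its sign follows byte-wise ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: a byte-at-a-time comparison that also reports
 * how many byte positions it examined. It stops at the first mismatch
 * or at a NUL, whichever comes first.
 */
int
naive_strcmp_counted(const char *a, const char *b, size_t *examined)
{
    const unsigned char *ua = (const unsigned char *)a;
    const unsigned char *ub = (const unsigned char *)b;
    size_t i = 0;

    assert(examined != NULL);

    while (ua[i] == ub[i] && ua[i] != '\0')
        i++;

    *examined = i + 1;    /* positions 0..i were read */
    return ((ua[i] > ub[i]) - (ua[i] < ub[i]));
}

/*
 * Hypothetical strlen()-based variant. Both strlen() calls scan for a
 * NUL before any comparison happens, so both arguments must be NUL
 * terminated. Comparing min(la, lb) + 1 bytes includes the terminator
 * of the shorter string.
 */
int
strlen_strcmp(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t n = (la < lb ? la : lb) + 1;
    int r = memcmp(a, b, n);

    return ((r > 0) - (r < 0));
}
```

For two unterminated 4-byte buffers holding "abcd" and "abxd", the byte-wise helper examines three positions and returns without touching the end of either buffer. Passing the same buffers to strlen_strcmp() would let strlen() read past them, which is exactly the case the question is about.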

I have previously asked the ISO/IEC 9899:2023 editor [1], who indicated
that he believes my interpretation to be correct, but asked me to look
for a second opinion.

Assuming that my interpretation of strcmp() is correct, is this an
assumption that common libc implementations make? Or is it generally
agreed that libc implementations support strcmp() calls on unterminated
buffers?

Thank you for your help.

Yours,
Robert Clausecker

[1]: https://twitter.com/__phantomderp/status/1680614038567354370

-- 
()  ascii ribbon campaign - for an 8-bit clean world
/\  - against html email  - against proprietary attachments