From: Pedro Falcato
Date: Sun, 16 Jul 2023 20:29:14 +0100
To: musl@lists.openwall.com
References: <20230716174945.qc6234b654k5eebx@gen2.localdomain>
Subject: Re: [musl] strcmp() guarantees and assumptions

On Sun, Jul 16, 2023 at 8:24 PM Pedro Falcato wrote:
>
> On Sun, Jul 16, 2023 at 7:00 PM Robert Clausecker wrote:
> >
> > Hi NRK,
> >
> > Thank you for your response.
> >
> > On Sun, Jul 16, 2023 at 11:49:45PM +0600, NRK wrote:
> > > Hi Robert,
> > >
> > > > Or to phrase it differently, is the following a legal implementation of
> > > > strcmp()?
> > > >
> > > > int strcmp(char *a, char *b) {
> > > >     size_t la = strlen(a), lb = strlen(b);
> > > >
> > > >     if (la != lb)
> > > >         return ((la > lb) - (lb > la));
> > >           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >
> > > I don't see how this can ever be a valid strcmp implementation. The
> > > return value of the comparison functions must be about the first
> > > mismatching byte, not about the string lengths.
> > >
> > > | The sign of a nonzero value returned by the comparison functions is
> > > | determined by the sign of the difference between the values of the
> > > | first pair of characters that differ in the objects being compared.
> > >   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > Yes, sorry. The code would have to be extended to call memcmp() on the
> > common prefix in case there is a mismatch in length. E.g.
> >
> >     if (la != lb)
> >         return (memcmp(a, b, la > lb ? lb + 1 : la + 1));
> >
> > > ref: https://port70.net/~nsz/c/c11/n1570.html#7.24.4p1
> > >
> > > > Or is it generally agreed upon that libc implementations support
> > > > strcmp() calls on unterminated strings?
> > >
> > > memchr (since C11) has the following requirement:
> > >
> > > | The implementation shall behave as if it reads the characters
> > > | sequentially and stops as soon as a matching character is found.
> > >
> > > I don't believe any such requirement exists for strcmp, so unless
> > > someone proves otherwise, I'd say it's fair game for libc to assume that
> > > the strings are nul-terminated.
> >
> > That's good to hear. Any idea on the “what do existing libc
> > implementations permit” bit?
>
> Looks like it's permissive.
> At the moment, musl does (non-SIMD, obviously) unsigned long loads *as
> long as they're aligned* (you don't want to page fault! and reads
> don't have side effects unless it's MMIO or something, and that's
> non-standard) and does standard(tm) bit tricks to find null bytes in
> that same word.

Oops, sorry, had a brainfart there and misread your strcmp as strlen.
In any case, it is AFAIK permissive, as you can tell from
implementations such as bionic's ssse3-strcmp-atom.S.

-- 
Pedro
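
For reference, the length-first approach discussed in the quoted text could
look roughly like the sketch below. This is only an illustration under the
assumption that both arguments are nul-terminated; the name strcmp_len_first
is made up here and does not come from musl, bionic, or any other libc.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical length-first strcmp sketch (name invented for illustration).
 * Assumes both a and b are nul-terminated: strlen() scans each string to
 * its terminator before any byte is compared.
 */
int strcmp_len_first(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);

    /* Compare the common prefix plus the shorter string's terminator,
     * so a length mismatch shows up as a differing byte. */
    size_t n = (la < lb ? la : lb) + 1;

    /* memcmp() compares bytes as unsigned char, as strcmp() must. */
    int r = memcmp(a, b, n);

    /* Only the sign of the result is specified; normalise to -1, 0, 1. */
    return (r > 0) - (r < 0);
}
```

The sign here comes from the first differing byte, never from the lengths
alone. The catch is the one the thread is about: both strings get read all
the way to their terminators even when they differ in the first byte, so
this only works if callers never pass unterminated buffers.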