mailing list of musl libc
From: Pedro Falcato <pedro.falcato@gmail.com>
To: musl@lists.openwall.com
Subject: Re: [musl] strcmp() guarantees and assumptions
Date: Sun, 16 Jul 2023 20:29:14 +0100
Message-ID: <CAKbZUD3O_NeiK3WiyQDsOD5u2KrPptMynx6CK9PXRkRh_NRmhQ@mail.gmail.com>
In-Reply-To: <CAKbZUD2+=xieZ8cDEabEnjAubc_C8hWSPVy8E8kn+QKRrXdMMA@mail.gmail.com>

On Sun, Jul 16, 2023 at 8:24 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Sun, Jul 16, 2023 at 7:00 PM Robert Clausecker <fuz@fuz.su> wrote:
> >
> > Hi NRK,
> >
> > Thank you for your response.
> >
> > On Sun, Jul 16, 2023 at 11:49:45PM +0600, NRK wrote:
> > > Hi Robert,
> > >
> > > > Or to phrase it differently, is the following a legal implementation of
> > > > strcmp()?
> > > >
> > > >     int strcmp(const char *a, const char *b) {
> > > >             size_t la = strlen(a), lb = strlen(b);
> > > >
> > > >             if (la != lb)
> > > >                     return ((la > lb) - (lb > la));
> > >                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >
> > > I don't see how this can ever be a valid strcmp implementation. The
> > > return value of the comparison functions must be about the first
> > > mismatching byte, not about the string lengths.
> > >
> > > | The sign of a nonzero value returned by the comparison functions is
> > > | determined by the sign of the difference between the values of the
> > > | first pair of characters that differ in the objects being compared.
> > >   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > Yes, sorry.  The code would have to be extended to call memcmp() on the
> > common prefix in case there is a mismatch in length.  E.g.
> >
> >     if (la != lb)
> >         return (memcmp(a, b, la > lb ? lb + 1 : la + 1));
> >
> > > ref: https://port70.net/~nsz/c/c11/n1570.html#7.24.4p1
> > >
> > > > Or is it generally agreed upon that libc implementations support
> > > > strcmp() calls on unterminated strings?
> > >
> > > memchr (since C11) has the following requirement:
> > >
> > > | The implementation shall behave as if it reads the characters
> > > | sequentially and stops as soon as a matching character is found.
> > >
> > > I don't believe any such requirement exists for strcmp, so unless
> > > someone proves otherwise, I'd say it's fair game for libc to assume that
> > > the strings are nul-terminated.
> >
> > That's good to hear.  Any idea on the “what do existing libc
> > implementations permit” bit?
>
> Looks like it's permissive.
> At the moment, musl does (non-SIMD, obviously) unsigned long loads *as
> long as they're aligned* (you don't want to page fault! and reads
> don't have side effects unless it's MMIO or something, and that's
> non-standard) and does standard(tm) bit tricks to find null bytes in
> that same word.
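
For readers unfamiliar with the bit trick mentioned in the quoted
paragraph, here is a minimal sketch of how a whole word can be tested
for a NUL byte at once. It is not copied from musl; the names
has_zero_byte and chunk_has_nul are hypothetical, and the word is
loaded with memcpy() so the sketch stays clear of alignment and
aliasing questions.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical names for illustration; not taken from any libc. */
#define ONES  ((unsigned long)-1 / UCHAR_MAX)   /* 0x01 in every byte */
#define HIGHS (ONES * (UCHAR_MAX / 2 + 1))      /* 0x80 in every byte */

/* Returns nonzero iff at least one byte of w is zero.
 * Subtracting 1 from each byte sets its high bit when the byte was 0
 * (or when it was >= 0x81); "& ~w" discards the latter case. */
static int has_zero_byte(unsigned long w)
{
    return ((w - ONES) & ~w & HIGHS) != 0;
}

/* Tests sizeof(unsigned long) bytes starting at p for a NUL byte.
 * Assumption: the caller guarantees that all of those bytes are readable. */
static int chunk_has_nul(const char *p)
{
    unsigned long w;

    assert(p != NULL);
    memcpy(&w, p, sizeof w);    /* portable load of one word */
    return has_zero_byte(w);
}
```

A word-at-a-time scan would call something like chunk_has_nul() on
successive aligned words and fall back to a byte loop once it returns
nonzero. An aligned word never straddles a page boundary, which is why
reading a few bytes past the terminator does not fault.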

Oops, sorry, had a brainfart there and misread your strcmp as strlen.
In any case, it is AFAIK permissive, as you can tell from
implementations such as bionic's ssse3-strcmp-atom.S.
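
To make the question from earlier in the thread concrete, here is a
rough sketch of the length-first approach with the memcmp() correction
applied. The name length_first_strcmp is hypothetical, and folding the
equal-length and unequal-length cases into a single memcmp() call is a
simplification of the quoted fragment, not Robert's actual code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name; a sketch of the idea discussed in the thread,
 * not the strcmp of musl or any other libc. */
int length_first_strcmp(const char *a, const char *b)
{
    size_t la, lb, n;

    assert(a != NULL && b != NULL);

    /* Both strings are scanned up to their terminating NUL, even if
     * they already differ in the first byte. */
    la = strlen(a);
    lb = strlen(b);

    /* Compare the common prefix plus the shorter string's NUL, so a
     * length mismatch shows up as a differing byte. memcmp() compares
     * bytes as unsigned char, which matches what strcmp requires. */
    n = (la < lb ? la : lb) + 1;
    return memcmp(a, b, n);
}
```

The sign of the result comes from the first differing byte, as the
standard text quoted above requires. The catch is the two strlen()
calls: they read past the first mismatch, so this is only usable if
libc may assume both arguments are nul-terminated.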

-- 
Pedro

Thread overview: 8+ messages
2023-07-16 17:22 Robert Clausecker
2023-07-16 17:49 ` NRK
2023-07-16 17:59   ` Robert Clausecker
2023-07-16 19:24     ` Pedro Falcato
2023-07-16 19:29       ` Pedro Falcato [this message]
2023-07-16 19:33     ` Markus Wichmann
2023-07-16 21:13       ` Robert Clausecker
2023-07-17 16:22         ` Adhemerval Zanella Netto
