I just saw the new commit of strndup that uses strlen and memmove. memmove is unnecessary. strlen is dangerous. Here is a reference implementation from musl. http://git.musl-libc.org/cgit/musl/tree/src/string/strndup.c #include <stdlib.h> #include <string.h> char *strndup(const char *s, size_t n) { size_t l = strnlen(s, n); char *d = malloc(l+1); if (!d) return NULL; memcpy(d, s, l); d[l] = 0; return d; }
[-- Attachment #1: Type: text/plain, Size: 56 bytes --] In 9front, memcpy == memmove. Why is strlen dangerous? [-- Attachment #2: Type: message/rfc822, Size: 5829 bytes --] From: Xiao-Yong Jin <meta.jxy@gmail.com> To: 9front@9front.org Subject: [9front] strndup should use strnlen and memcpy Date: Fri, 18 Dec 2020 02:22:25 -0600 Message-ID: <CEDEBD0B-FC39-43D4-960F-05E348970FE8@gmail.com> I just saw the new commit of strndup that uses strlen and memmove. memmove is unnecessary. strlen is dangerous. Here is a reference implementation from musl. http://git.musl-libc.org/cgit/musl/tree/src/string/strndup.c #include <stdlib.h> #include <string.h> char *strndup(const char *s, size_t n) { size_t l = strnlen(s, n); char *d = malloc(l+1); if (!d) return NULL; memcpy(d, s, l); d[l] = 0; return d; }
On Fri, Dec 18, 2020 at 09:38:00AM +0100, Sigrid Solveig HaflÃnudóttir wrote: > In 9front, memcpy == memmove. > > Why is strlen dangerous? Because the terminating '\0' may not be here. But it's simpler to inline: size_t l; for (l = 0; l < n && s[l]; l++) ; preceeded by an assertion about the not 'nil'ity of s. FWIW, T. Laronde > Date: Fri, 18 Dec 2020 02:22:25 -0600 > From: Xiao-Yong Jin <meta.jxy@gmail.com> > Subject: [9front] strndup should use strnlen and memcpy > To: 9front@9front.org > X-Mailer: Apple Mail (2.3654.40.0.2.32) > > I just saw the new commit of strndup that uses strlen and memmove. > memmove is unnecessary. > strlen is dangerous. > > Here is a reference implementation from musl. > > http://git.musl-libc.org/cgit/musl/tree/src/string/strndup.c > > #include <stdlib.h> > #include <string.h> > > char *strndup(const char *s, size_t n) > { > size_t l = strnlen(s, n); > char *d = malloc(l+1); > if (!d) return NULL; > memcpy(d, s, l); > d[l] = 0; > return d; > } -- Thierry Laronde <tlaronde +AT+ polynum +dot+ com> http://www.kergis.com/ http://kertex.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
> Because the terminating '\0' may not be here.
Why wouldn't it be there? Isn't strndup supposed to work on strings?
Is there any (trustworthy standard-like) source that mentions
non-terminated strings as something that should work with strndup?
2020-12-18 12:59 GMT+03:00, Sigrid Solveig Haflínudóttir <ftrvxmtrx@gmail.com>: >> Because the terminating '\0' may not be here. > > Why wouldn't it be there? Isn't strndup supposed to work on strings? > Is there any (trustworthy standard-like) source that mentions > non-terminated strings as something that should work with strndup? > > Open Group does not specify if arg1 should be null terminated or not. https://pubs.opengroup.org/onlinepubs/9699919799/functions/strdup.html
> Open Group does not specify if arg1 should be null terminated or not.
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/strdup.html
"Applications should not assume that strndup() will allocate ( size + 1)
bytes when strlen(s) is smaller than size."
Pretty sure this means it's supposed to be null-terminated, or
otherwise it wouldn't mention the "dangerous" strlen right there.
2020-12-18 13:31 GMT+03:00, Sigrid Solveig Haflínudóttir <ftrvxmtrx@gmail.com>:
>> Open Group does not specify if arg1 should be null terminated or not.
>> https://pubs.opengroup.org/onlinepubs/9699919799/functions/strdup.html
>
> "Applications should not assume that strndup() will allocate ( size + 1)
> bytes when strlen(s) is smaller than size."
>
> Pretty sure this means it's supposed to be null-terminated, or
> otherwise it wouldn't mention the "dangerous" strlen right there.
>
>
I don't know, I also think that str is \0 terminated but that seems a
bit unclear.
I am very surprised either way :o
it should definitely be strnlen... strn* are specifically for cases
where you want to specify a length, specifically where the strings are
not necessarily null-terminated
this is probably all my fault, I apologize for horsing around. it does
seem like I saw the cover of a DVD of that movie wargames once but I'm
sure there's more to it...?
On Fri, Dec 18, 2020 at 2:50 AM Kemal <kemalinanc8@gmail.com> wrote:
>
> 2020-12-18 13:31 GMT+03:00, Sigrid Solveig Haflínudóttir <ftrvxmtrx@gmail.com>:
> >> Open Group does not specify if arg1 should be null terminated or not.
> >> https://pubs.opengroup.org/onlinepubs/9699919799/functions/strdup.html
> >
> > "Applications should not assume that strndup() will allocate ( size + 1)
> > bytes when strlen(s) is smaller than size."
> >
> > Pretty sure this means it's supposed to be null-terminated, or
> > otherwise it wouldn't mention the "dangerous" strlen right there.
> >
> >
> I don't know, I also think that str is \0 terminated but that seems a
> bit unclear.
it's kind of surprising to me, I wasn't able to find much more on google about it than this https://cseweb.ucsd.edu/classes/fa18/cse127-a/CSE127fa18.3-Savage.pdf On Fri, Dec 18, 2020 at 3:02 AM Eli Cohen <echoline@gmail.com> wrote: > > I am very surprised either way :o > > it should definitely be strnlen... strn* are specifically for cases > where you want to specify a length, specifically where the strings are > not necessarily null-terminated > > this is probably all my fault, I apologize for horsing around. it does > seem like I saw the cover of a DVD of that movie wargames once but I'm > sure there's more to it...? > > On Fri, Dec 18, 2020 at 2:50 AM Kemal <kemalinanc8@gmail.com> wrote: > > > > 2020-12-18 13:31 GMT+03:00, Sigrid Solveig Haflínudóttir <ftrvxmtrx@gmail.com>: > > >> Open Group does not specify if arg1 should be null terminated or not. > > >> https://pubs.opengroup.org/onlinepubs/9699919799/functions/strdup.html > > > > > > "Applications should not assume that strndup() will allocate ( size + 1) > > > bytes when strlen(s) is smaller than size." > > > > > > Pretty sure this means it's supposed to be null-terminated, or > > > otherwise it wouldn't mention the "dangerous" strlen right there. > > > > > > > > I don't know, I also think that str is \0 terminated but that seems a > > bit unclear.
On Fri, Dec 18, 2020 at 11:31:22AM +0100, Sigrid Solveig HaflC-nudC3ttir wrote: > > Open Group does not specify if arg1 should be null terminated or not. > > https://pubs.opengroup.org/onlinepubs/9699919799/functions/strdup.html > > "Applications should not assume that strndup() will allocate ( size + 1) > bytes when strlen(s) is smaller than size." > > Pretty sure this means it's supposed to be null-terminated, or > otherwise it wouldn't mention the "dangerous" strlen right there. As I read it---but I'm not an english native speaker---this only means that one shall not infer that the allocation will be len + 1 (since if the string is shorter, it will be less). This does not say anything about a terminating '\0' and this is precisely the purpose to copy some amount of chars in a null terminated array of chars (the dup will be null terminated; it is not a requirement of the chars array duplicated). the name could have been mem2nstr (fixed maximum size length str as a result, not as an argument). -- Thierry Laronde <tlaronde +AT+ polynum +dot+ com> http://www.kergis.com/ http://kertex.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Well, I can't see where it says anything about strings having to be null-terminated when calling strdup either, so who knows? :)
I know on the order of nothing, but I vote in favor of strnlen
...?
On Fri, Dec 18, 2020 at 3:39 AM Sigrid Solveig Haflínudóttir
<ftrvxmtrx@gmail.com> wrote:
>
> Well, I can't see where it says anything about strings having to be
> null-terminated when calling strdup either, so who knows? :)
>
At this point I have no idea what we should do now.
I know what we should do but have no idea on a personal level
On Fri, Dec 18, 2020 at 3:53 AM Kemal <kemalinanc8@gmail.com> wrote:
>
> At this point I have no idea what we should do now.
I want strnlen to be used just to make sure but I think that arg1
should be \0 terminated normally.
2020-12-18 14:55 GMT+03:00, Eli Cohen <echoline@gmail.com>:
> I know what we should do but have no idea on a personal level
>
> On Fri, Dec 18, 2020 at 3:53 AM Kemal <kemalinanc8@gmail.com> wrote:
>>
>> At this point I have no idea what we should do now.
>
I'd also use strnlen. My current understanding is: strnlen is used to
impose a limit on a bigger str, without caring about its end if it is
bigger than size chars. strncmp compares strings up to a limit,
without caring about their ends if they are bigger than size chars.
The "n" limitation disregards the terminator, so here we shouldn't
make exception. Also, using strnlen makes strndup unaware of the
problem, relying on the correct behavior of strnlen. The only drawback
would be efficiency, but let's be honest: that branch will be
predicted pretty easily by modern CPU.
Just my 2 cents, take all of this with a grain of salt.
On Fri, Dec 18, 2020 at 1:00 PM Eli Cohen <echoline@gmail.com> wrote:
>
> I know what we should do but have no idea on a personal level
>
> On Fri, Dec 18, 2020 at 3:53 AM Kemal <kemalinanc8@gmail.com> wrote:
> >
> > At this point I have no idea what we should do now.
Hi Sigrid, The C2x working draft[1] incorporates strdup and strndup. Of strndup, it says, "[t]he strndup function creates a string initialized with no more than size initial characters of the *array pointed to by s*..." (emphasis mine). Strings in the standard library are always called strings. While any object occupying fewer than `size+1` bytes must be a string, this is a side-effect of how the API is specified to behave. POSIX is ambiguous, but implies that it expects a string either way: "[t]he strndup() function shall be equivalent to the strdup() function...", without subsequently placing exceptions on the input constraints. Many (most?) implementations don't bother to check (or explicitly allow non-strings), and WG14 likely recognized this when standardizing the functionality in libc. Anyway, if taking the "find length before allocation" approach to implementation, the WG14 wording requires use of strnlen semantics. One can implement without a strnlen by always allocating size+1 bytes, which is why the standard can put strnlen_s in an annex. --dho [1]: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2454.pdf On Fri, Dec 18, 2020 at 2:01 AM Sigrid Solveig Haflínudóttir <ftrvxmtrx@gmail.com> wrote: > > > Because the terminating '\0' may not be here. > > Why wouldn't it be there? Isn't strndup supposed to work on strings? > Is there any (trustworthy standard-like) source that mentions > non-terminated strings as something that should work with strndup? >
On Fri, Dec 18, 2020 at 2:31 AM Sigrid Solveig Haflínudóttir
<ftrvxmtrx@gmail.com> wrote:
>
> > Open Group does not specify if arg1 should be null terminated or not.
> > https://pubs.opengroup.org/onlinepubs/9699919799/functions/strdup.html
>
> "Applications should not assume that strndup() will allocate ( size + 1)
> bytes when strlen(s) is smaller than size."
(As this pertains to my last message, it's also possible for the
interface to malloc(size + 1), copy N bytes, and realloc(N+1). This
allows that statement to be true while still not requiring the input
to be a string when it exceeds size.)
Quoth Xiao-Yong Jin <meta.jxy@gmail.com>:
> I just saw the new commit of strndup that uses strlen and memmove.
> memmove is unnecessary.
> strlen is dangerous.
Good catch; Fix committed.