From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/10950 Path: news.gmane.org!.POSTED!not-for-mail From: Bernardo Pascoal Figueiredo Newsgroups: gmane.linux.lib.musl.general Subject: Re: Re: Implementation of GLOB_TILDE Date: Fri, 20 Jan 2017 09:56:16 +0000 Message-ID: <165a4213044e5de5fe904948f6d1aec5@mail.tecnico.ulisboa.pt> References: <3e4dd55ae9eb7ec89aac886ad962786b@mail.tecnico.ulisboa.pt> <1320d9deaa1f9e6808a3aa795cb5adcb@mail.tecnico.ulisboa.pt> <20170117193947.GG1533@brightrain.aerifal.cx> <1eb121885ed7b1577b33f5b1fdab6a43@mail.tecnico.ulisboa.pt> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=_95000ca91c02c7c971ae91e9c2484ca0" X-Trace: blaine.gmane.org 1484906211 2639 195.159.176.226 (20 Jan 2017 09:56:51 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Fri, 20 Jan 2017 09:56:51 +0000 (UTC) User-Agent: Roundcube Webmail/1.1.7 To: musl@lists.openwall.com Original-X-From: musl-return-10965-gllmg-musl=m.gmane.org@lists.openwall.com Fri Jan 20 10:56:46 2017 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.84_2) (envelope-from ) id 1cUVvn-000786-CB for gllmg-musl@m.gmane.org; Fri, 20 Jan 2017 10:56:27 +0100 Original-Received: (qmail 25913 invoked by uid 550); 20 Jan 2017 09:56:30 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Original-Received: (qmail 25894 invoked from network); 20 Jan 2017 09:56:29 -0000 X-Virus-Scanned: by amavisd-new-2.10.1 (20141025) (Debian) at ist.utl.pt DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tecnico.ulisboa.pt; s=mail; t=1484906176; bh=Yk+J0Feaya8U87alkkSWiLRP09sZcs7lxF/9E7Iuuz0=; h=Date:From:To:Subject:In-Reply-To:References; b=uEUKea3Gh1fBxW3FUHFm003K+F/YkM6eAF7ZhroBZBFnSD9SBYGXKIvGQlX3vdrQa Zpkn2dHuCxoVAqMKuvZdLgz0EyG/ge9hRK9BWnZn70GQeJDHO3PIGvzE1tIap2MkW1 KqOuNqjZnRS9IIPfJwd9o/HqmOg+FXB66isYmN84= In-Reply-To: <1eb121885ed7b1577b33f5b1fdab6a43@mail.tecnico.ulisboa.pt> X-Sender: bernardopascoalfigueiredo@tecnico.ulisboa.pt Xref: news.gmane.org gmane.linux.lib.musl.general:10950 Archived-At: --=_95000ca91c02c7c971ae91e9c2484ca0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII; format=flowed Sending a new patch without strlcpy and strlcat. Thanks for the help on IRC. On 2017-01-18 19:51, Bernardo Pascoal Figueiredo wrote: > On 2017-01-17 19:39, Rich Felker wrote: >> On Tue, Jan 17, 2017 at 03:49:03PM +0000, Bernardo Pascoal Figueiredo >> wrote: >>> >I have a few questions though: >>> > * How do I contribute to musl? Should I just send patches to this >>> >mailing list >> >> This is the preferred way, yes. >> >>> > * I defined GLOB_TILDE as 0x100, but I think this won't work on >>> >architectureshttps://webmail.tecnico.ulisboa.pt/rc/?_task=mail&_action=compose&_id=5497761105881de774e18d# >>> > that have sizeof(int) == 2, as the flags argument in glob is an int. >> >> That's not an issue. (1) POSIX requires >=32bit int; musl is even more >> restrictive. (2) 0xff is 8 bits, not 16. Even ISO C requires 0x100 to >> fit in int. You do need to match whatever value glibc uses, though; I >> haven't checked that. > > They use 1 << 12 which is 0x1000. I fixed this in the patch below. > Does this really matter? Is musl supposed to be binary compatible with > glibc? > OpenBSD's libc uses 0x800 for example. > >> >>> > * I think it's best to define GLOB_TILDE in glob.h inside a '#if >>> > defined(_GNU_SOURCE) || defined(_BSD_SOURCE)' what do you think? >>> > >>> > * I had to copy strlcat and strlcpy to glob.c so I could use >>> >them. I had to do >>> > this because musl isn't compile as _GNU_SOURCE or _BSD_SOURCE >>> >so string.h >>> > doesn't expose these functions. How should I fix this? >> >> Just don't use them. The same thing can be achieved much better with >> portable functions strnlen and memcpy. >> >> Note that even if you added #define for _GNU_SOURCE to this file, you >> couldn't reference these functions because then a function in the >> standard namespace would depend on symbols in nonstandard namespace. >> > > What I was thinking was to have private musl implementations of strlcat > and > strlcpy and have the public strlcat and strlcpy call these private > ones. > That way it'd be possible to use strlcat and strlcpy within musl. > I'd prefer that to strnlen and memcpy because that's what strlcat and > strlcpy > already do, so I'd be duplicating it's functionality. This is way more > error > prone. > >>> diff --git a/src/regex/glob.c b/src/regex/glob.c >>> index 5b6ff124..f40da380 100644 >>> --- a/src/regex/glob.c >>> +++ b/src/regex/glob.c >>> @@ -8,6 +8,9 @@ >>> #include >>> #include >>> #include "libc.h" >>> +#include >> >> Just use int for boolean values; bool is not idiomatic in musl. > > Fixed > >> >>> +/*"~" or "~/(...)" case*/ >>> +static bool expand_tilde_cur_user(const char *pat_after_tilde, char >>> *new_pat, size_t new_pat_size) >>> +{ >>> + char *home; >>> + struct passwd pw_store, *pw_result; >>> + char pw_buf[1024]; >>> + >>> + /*FIXME: add check for issetugid as in libc of openbsd?*/ >>> + home = getenv("HOME"); >>> + if(home == NULL) { >>> + getpwuid_r(getuid(), &pw_store, pw_buf, sizeof(pw_buf), >>> &pw_result); >>> + if(pw_result == NULL) { >>> + return false; >>> + } >>> + home = pw_store.pw_dir; >>> + } >>> + >>> + return glob_strlcpy(new_pat, home, new_pat_size) < new_pat_size >>> + && glob_strlcat(new_pat, pat_after_tilde, new_pat_size) < >>> new_pat_size; >>> +} >>> + >>> +/* "~user/(...) case*/ >>> +static bool expand_tilde_named_user(const char *pat_after_tilde, >>> char *new_pat, size_t new_pat_size) >>> +{ >>> + struct passwd pw_store, *pw_result; >>> + char pw_buf[1024], username[1024]; >>> + const char *slash_pos = strchr(pat_after_tilde, '/'); >>> + if(slash_pos == NULL) { >>> + return false; >>> + } >>> + >>> + ptrdiff_t pat_username_size = slash_pos - pat_after_tilde; >>> + if(pat_username_size <= 0 || pat_username_size >= >>> sizeof(username)) { >>> + return false; >>> + } >>> + strncpy(username, pat_after_tilde, pat_username_size); >>> + username[pat_username_size] = '\0'; >>> + >>> + getpwnam_r(username, &pw_store, pw_buf, sizeof(pw_buf), >>> &pw_result); >>> + if (pw_result == NULL) >>> + return false; >>> + >>> + return glob_strlcpy(new_pat, pw_store.pw_dir, new_pat_size) < >>> new_pat_size >>> + && glob_strlcat(new_pat, slash_pos, new_pat_size) < >>> new_pat_size; >>> +} >> >> It should be possible to reduce this code considerably, and to avoid >> some of the large stack buffers. Certainly username[] does not need to >> be 1k; I believe there's a macro for the max supported in limits.h or >> such and it's something like 16 or 32 bytes. > > I searched and found there is a define for LOGIN_NAME_MAX which is 256. > According to the useradd man page the maximum number of characters of > an > username is 32. > There is also in limits.h _POSIX_NAME_MAX which is 14 and > _POSIX_LOGIN_NAME_MAX > which is 9. > The openbsd pwd.h has a define for _PW_NAME_LEN which is 31 (without > '\0'). > > Which alternative is the best? > > (I updated the username arrays to be LOGIN_NAME_MAX) > > Relative to the pw_buf I don't think musl's code has any variable that > defines > the maximum supported passwd line. > I used 1024 because that's what I saw in the openbsd code. They have a > global > char [] of size _PW_BUF_LEN which is 1024. > > Reduce the code in which way? Try to refactor some code of > expand_tilde_cur_user and expand_tilde_named_user, of make the code > "less > verbose"? > >> >>> int glob(const char *restrict pat, int flags, int (*errfunc)(const >>> char *path, int err), glob_t *restrict g) >>> { >>> - const char *p=pat, *d; >>> + const char *p, *d; >>> + char new_pat[PATH_MAX + 1]; >>> struct match head = { .next = NULL }, *tail = &head; >>> size_t cnt, i; >>> size_t offs = (flags & GLOB_DOOFFS) ? g->gl_offs : 0; >>> int error = 0; >>> + >>> + /*even if expanding fails(e.g. expansion make pat too big) >>> + * we should try to match the ~ or ~user literally*/ >>> + bool should_expand_tilde = (flags & GLOB_TILDE) && (pat[0] == >>> '~'); >>> + if(should_expand_tilde && expand_tilde(pat, new_pat, >>> sizeof(new_pat))) { >>> + p = new_pat; >>> + } else { >>> + p = pat; >>> + } >> >> Don't introduce gratuitous variables for expressions used just once. >> The conditions going into that bool can just be part of the if. > If I put it inside the if the line would be too big and those two > checks go > together. > I think it's alot clearer. > >> >> I haven't checked what happens when the input pat is too long to fit >> in new_pat. This needs to be an error condition, not silent >> truncation, but error conditions can't be handled until further down; >> see commit 769f53598e781ffc89191520f3f8a93cb58db91f for why. Also, >> new_pat should probably be a VLA whose length is 1 unless tilde >> expansion is needed, and PATH_MAX otherwise, so that glob doesn't blow >> an extra page on the stack when tilde expansion is not in use. > > I changed it to char new_pat[tilde_flag ? PATH_MAX : 1] where > tilde_flag = > flags & GLOB_TILDE. > (new_pat no longer uses an extra page.) > > Alternatively I can put most of the code of glob in another function > and only > create a new pat buffer when we should try to expand the tilde. > This would be good to reduce the scope of new_pat. > > When the expanded pattern is too big to fit new_pat, the expansion > fails and > the unexpanded pattern is used. > According to glibc GLOB_TILDE should never make glob return an error. > If > anything related to the expansion of tilde fails it's supposed to > continue glob > with the pattern unexpanded. > At least that how I interpret the manual: > " > If the username is invalid, or the home directory cannot be determined, > then no > substitution is performed. > " > What I'm trying to say is that GLOB_TILDE never produces an error, so > the code > that zeros the glob_t structure always runs. > > If the code that checks for leading '/' is put just before the call to > match_dir, it'd be ok to put the code that calls expand_tilde after the > glob_t > zeroing and strlen checking and before the leading '/' code. > The "downside" of this is that the caller of glob has a ton of leading > '/' it > could now fail the strlen check, but I think that's the fault of who is > calling > glob with "/////////////////////(...)" > A patch for that is in attachment > "move_leading_slash_check_down.patch". > > >> >> Rich > > I also send the patch with the things I fixed as attachment. It's > still not okay. --=_95000ca91c02c7c971ae91e9c2484ca0 Content-Transfer-Encoding: base64 Content-Type: text/x-diff; name=glob_tilde.patch Content-Disposition: attachment; filename=glob_tilde.patch; size=3593 ZGlmZiAtLWdpdCBhL2luY2x1ZGUvZ2xvYi5oIGIvaW5jbHVkZS9nbG9iLmgKaW5kZXggNzZmNmMx YzYuLjQ3N2RkZjJiIDEwMDY0NAotLS0gYS9pbmNsdWRlL2dsb2IuaAorKysgYi9pbmNsdWRlL2ds b2IuaApAQCAtMzAsNiArMzAsNyBAQCB2b2lkIGdsb2JmcmVlKGdsb2JfdCAqKTsKICNkZWZpbmUg R0xPQl9BUFBFTkQgICAweDIwCiAjZGVmaW5lIEdMT0JfTk9FU0NBUEUgMHg0MAogI2RlZmluZQlH TE9CX1BFUklPRCAgIDB4ODAKKyNkZWZpbmUgR0xPQl9USUxERSAgICAweDEwMDAKIAogI2RlZmlu ZSBHTE9CX05PU1BBQ0UgMQogI2RlZmluZSBHTE9CX0FCT1JURUQgMgpkaWZmIC0tZ2l0IGEvc3Jj L3JlZ2V4L2dsb2IuYyBiL3NyYy9yZWdleC9nbG9iLmMKaW5kZXggNWI2ZmYxMjQuLjE2YTBiMWY3 IDEwMDY0NAotLS0gYS9zcmMvcmVnZXgvZ2xvYi5jCisrKyBiL3NyYy9yZWdleC9nbG9iLmMKQEAg LTgsNiArOCw4IEBACiAjaW5jbHVkZSA8ZXJybm8uaD4KICNpbmNsdWRlIDxzdGRkZWYuaD4KICNp bmNsdWRlICJsaWJjLmgiCisjaW5jbHVkZSA8cHdkLmg+CisjaW5jbHVkZSA8dW5pc3RkLmg+CiAK IHN0cnVjdCBtYXRjaAogewpAQCAtMTU0LDEzICsxNTYsOTkgQEAgc3RhdGljIGludCBzb3J0KGNv bnN0IHZvaWQgKmEsIGNvbnN0IHZvaWQgKmIpCiAJcmV0dXJuIHN0cmNtcCgqKGNvbnN0IGNoYXIg KiopYSwgKihjb25zdCBjaGFyICoqKWIpOwogfQogCitzdGF0aWMgaW50IGNvbmNhdChjaGFyICpi dWYsIHNpemVfdCBsZW4sIGNvbnN0IGNoYXIgKnMxLCBjb25zdCBjaGFyICpzMikKK3sKKwlzaXpl X3QgbjEgPSBzdHJubGVuKHMxLCBsZW4pOworCWxlbiAtPSBuMTsKKwlzaXplX3QgbjIgPSBzdHJu bGVuKHMyLCBsZW4pOworCWxlbiAtPSBuMjsKKwlpZiAoIWxlbikKKwkJcmV0dXJuIDA7CisJbWVt Y3B5KGJ1ZiwgczEsIG4xKTsKKwltZW1jcHkoYnVmK24xLCBzMiwgbjIrMSk7CisJcmV0dXJuIDE7 Cit9CisKKy8qIn4iIG9yICJ+LyguLi4pIiBjYXNlKi8KK3N0YXRpYyBpbnQgZXhwYW5kX3RpbGRl X2N1cl91c2VyKGNvbnN0IGNoYXIgKnBhdF9hZnRlcl90aWxkZSwgY2hhciAqbmV3X3BhdCwgc2l6 ZV90IG5ld19wYXRfc2l6ZSkKK3sKKwljaGFyICpob21lOworCXN0cnVjdCBwYXNzd2QgcHdfc3Rv cmUsICpwd19yZXN1bHQ7CisJY2hhciBwd19idWZbTE9HSU5fTkFNRV9NQVhdOworCisJLypGSVhN RTogYWRkIGNoZWNrIGZvciBpc3NldHVnaWQgYXMgaW4gbGliYyBvZiBvcGVuYnNkPyovCisJaG9t ZSA9IGdldGVudigiSE9NRSIpOworCWlmKGhvbWUgPT0gTlVMTCkgeworCQlnZXRwd3VpZF9yKGdl dHVpZCgpLCAmcHdfc3RvcmUsIHB3X2J1Ziwgc2l6ZW9mKHB3X2J1ZiksICZwd19yZXN1bHQpOwor CQlpZihwd19yZXN1bHQgPT0gTlVMTCkgeworCQkJcmV0dXJuIDA7CisJCX0KKwkJaG9tZSA9IHB3 X3N0b3JlLnB3X2RpcjsKKwl9CisKKwlyZXR1cm4gY29uY2F0KG5ld19wYXQsIG5ld19wYXRfc2l6 ZSwgaG9tZSwgcGF0X2FmdGVyX3RpbGRlKTsKK30KKworLyogIn51c2VyLyguLi4pIGNhc2UqLwor c3RhdGljIGludCBleHBhbmRfdGlsZGVfbmFtZWRfdXNlcihjb25zdCBjaGFyICpwYXRfYWZ0ZXJf dGlsZGUsIGNoYXIgKm5ld19wYXQsIHNpemVfdCBuZXdfcGF0X3NpemUpCit7CisJc3RydWN0IHBh c3N3ZCBwd19zdG9yZSwgKnB3X3Jlc3VsdDsKKwljaGFyIHB3X2J1ZlsxMDI0XSwgdXNlcm5hbWVb TE9HSU5fTkFNRV9NQVhdOworCWNvbnN0IGNoYXIgKnNsYXNoX3BvcyA9IHN0cmNocihwYXRfYWZ0 ZXJfdGlsZGUsICcvJyk7CisJaWYoc2xhc2hfcG9zID09IE5VTEwpIHsKKwkJcmV0dXJuIDA7CisJ fQorCisJcHRyZGlmZl90IHBhdF91c2VybmFtZV9zaXplID0gc2xhc2hfcG9zIC0gcGF0X2FmdGVy X3RpbGRlOworCWlmKHBhdF91c2VybmFtZV9zaXplIDw9IDAgfHwgcGF0X3VzZXJuYW1lX3NpemUg Pj0gc2l6ZW9mKHVzZXJuYW1lKSkgeworCQlyZXR1cm4gMDsKKwl9CisJc3RybmNweSh1c2VybmFt ZSwgcGF0X2FmdGVyX3RpbGRlLCBwYXRfdXNlcm5hbWVfc2l6ZSk7CisJdXNlcm5hbWVbcGF0X3Vz ZXJuYW1lX3NpemVdID0gJ1wwJzsKKworCWdldHB3bmFtX3IodXNlcm5hbWUsICZwd19zdG9yZSwg cHdfYnVmLCBzaXplb2YocHdfYnVmKSwgJnB3X3Jlc3VsdCk7CisJaWYgKHB3X3Jlc3VsdCA9PSBO VUxMKQorCQlyZXR1cm4gMDsKKworCXJldHVybiBjb25jYXQobmV3X3BhdCwgbmV3X3BhdF9zaXpl LCBwd19zdG9yZS5wd19kaXIsIHNsYXNoX3Bvcyk7Cit9CisKKy8qIGV4cGFuZHM6CisgKiAgfiBp bnRvIC9ob21lL3VzZXIvCisgKiAgfi9hc2QgaW50byAvaG9tZS91c2VyL2FzZAorICogIH51c2Vy MS9hc2QgaW50byAvaG9tZS91c2VyMS9hc2QKKyAqIHRoZSB2YWx1ZXMgZm9yIHRoZSBob21lIGRp cmVjdG9yeSBhcmUgdGFrZW4gZnJvbSBwYXNzd2QKKyAqCisgKiByZXR1cm5pbmcgdHJ1ZSBtZWFu cyBzdWNjZXNzZnVsIGV4cGFuc2lvbiBhbmQgdGhhdCBleHBhbmRlZF9wYXQgaXMgdmFsaWQKKyAq Lworc3RhdGljIGludCBleHBhbmRfdGlsZGUoY29uc3QgY2hhciAqcGF0LCBjaGFyICpuZXdfcGF0 LCBpbnQgbmV3X3BhdF9zaXplKQoreworCWNvbnN0IGNoYXIgKnBhdF9hZnRlcl90aWxkZSA9IHBh dCArIDE7CisJaWYoKnBhdF9hZnRlcl90aWxkZSA9PSAnXDAnIHx8ICpwYXRfYWZ0ZXJfdGlsZGUg PT0gJy8nKSB7CisJCXJldHVybiBleHBhbmRfdGlsZGVfY3VyX3VzZXIocGF0X2FmdGVyX3RpbGRl LCBuZXdfcGF0LCBuZXdfcGF0X3NpemUpOworCX0gZWxzZSB7CisJCXJldHVybiBleHBhbmRfdGls ZGVfbmFtZWRfdXNlcihwYXRfYWZ0ZXJfdGlsZGUsIG5ld19wYXQsIG5ld19wYXRfc2l6ZSk7CisJ fQorfQorCiBpbnQgZ2xvYihjb25zdCBjaGFyICpyZXN0cmljdCBwYXQsIGludCBmbGFncywgaW50 ICgqZXJyZnVuYykoY29uc3QgY2hhciAqcGF0aCwgaW50IGVyciksIGdsb2JfdCAqcmVzdHJpY3Qg ZykKIHsKLQljb25zdCBjaGFyICpwPXBhdCwgKmQ7CisJY29uc3QgY2hhciAqcCwgKmQ7CisJY29u c3QgaW50IHRpbGRlX2ZsYWcgPSBmbGFncyAmIEdMT0JfVElMREU7CisJY2hhciBuZXdfcGF0W3Rp bGRlX2ZsYWcgPyBQQVRIX01BWCA6IDFdOwogCXN0cnVjdCBtYXRjaCBoZWFkID0geyAubmV4dCA9 IE5VTEwgfSwgKnRhaWwgPSAmaGVhZDsKIAlzaXplX3QgY250LCBpOwogCXNpemVfdCBvZmZzID0g KGZsYWdzICYgR0xPQl9ET09GRlMpID8gZy0+Z2xfb2ZmcyA6IDA7CiAJaW50IGVycm9yID0gMDsK KworCS8qZXZlbiBpZiBleHBhbmRpbmcgZmFpbHMoZS5nLiBleHBhbnNpb24gbWFrZSBwYXQgdG9v IGJpZykKKwkgKiB3ZSBzaG91bGQgdHJ5IHRvIG1hdGNoIHRoZSB+IG9yIH51c2VyIGxpdGVyYWxs eSovCisJaW50IHNob3VsZF9leHBhbmRfdGlsZGUgPSB0aWxkZV9mbGFnICYmIChwYXRbMF0gPT0g J34nKTsKKwlpZihzaG91bGRfZXhwYW5kX3RpbGRlICYmIGV4cGFuZF90aWxkZShwYXQsIG5ld19w YXQsIHNpemVvZihuZXdfcGF0KSkpIHsKKwkJcCA9IG5ld19wYXQ7CisJfSBlbHNlIHsKKwkJcCA9 IHBhdDsKKwl9CiAJCiAJaWYgKCpwID09ICcvJykgewogCQlmb3IgKDsgKnAgPT0gJy8nOyBwKysp Owo= --=_95000ca91c02c7c971ae91e9c2484ca0--