* $watch, log and Cyrillic usernames @ 2023-10-05 21:29 Максим 2023-10-05 21:42 ` Bart Schaefer 2023-10-07 2:05 ` Oliver Kiddle 0 siblings, 2 replies; 9+ messages in thread From: Максим @ 2023-10-05 21:29 UTC (permalink / raw) To: zsh-workers [-- Attachment #1: Type: text/plain, Size: 813 bytes --] Hello again. I found another bug with cyrillic usernames in zsh (again on Cygwin, but can be reproduced on Linux) Tested on latest master: 1. Create user "Студент" (for example, via /etc/passwd editing) 2. Create /var/run/utmp file if it is missing (may be necessary on Cygwin) 3. Login into "Студент" (via login or ssh) 4. % zsh-5.9.0.1-dev -f % whoami Студент % watch=(all); log # (everything is ok) root has logged on /proc/10045/fd/2 from . Студент has logged on pts/29 from 127.0.0.1. % watch=(notme); log # ("Студент" record is not filtered) root has logged on /proc/10045/fd/2 from . Студент has logged on pts/29 from 127.0.0.1. % watch=(Студент); log # ("Студент" record is missing) Sent via Gmail [-- Attachment #2: Type: text/html, Size: 1623 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: $watch, log and Cyrillic usernames 2023-10-05 21:29 $watch, log and Cyrillic usernames Максим @ 2023-10-05 21:42 ` Bart Schaefer 2023-10-07 2:05 ` Oliver Kiddle 1 sibling, 0 replies; 9+ messages in thread From: Bart Schaefer @ 2023-10-05 21:42 UTC (permalink / raw) To: Максим; +Cc: zsh-workers I'm going to leave this one to Oliver as he most recently fiddled with the "watch" code (e.g., moved to a module) but I have to ask: Is there really such a thing as a time-sharing Cygwin system? If not, of what use is "watch"? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: $watch, log and Cyrillic usernames 2023-10-05 21:29 $watch, log and Cyrillic usernames Максим 2023-10-05 21:42 ` Bart Schaefer @ 2023-10-07 2:05 ` Oliver Kiddle 2023-10-07 16:49 ` Максим 2023-10-08 21:47 ` metafy() (was Re: $watch, log and Cyrillic usernames) Oliver Kiddle 1 sibling, 2 replies; 9+ messages in thread From: Oliver Kiddle @ 2023-10-07 2:05 UTC (permalink / raw) To: Максим; +Cc: zsh-workers Максим wrote: > Hello again. I found another bug with cyrillic usernames in zsh (again on > Cygwin, but can be reproduced on Linux) Reproducing does involve bypassing utilities like useradd which complain about invalid usernames. But I can imagine such rules will increasingly be relaxed and there's no reason for zsh to make assumptions. > % watch=(Студент); log # ("Студент" record is missing) The value from $watch is metafied and that's what patcompile() and pattry() need so the fix below uses metafy() on the username from utmp. However, in looking closer at the code I observed the existing use of sizeof(u->ut_name) which is 32 on my system. So I tried creating 32 and 33 character usernames (which, incidentally, useradd was happy with) and as I suspected u->ut_name is not null-terminated for these. So the patch uses strnlen() with the sizeof() for n to get the length to pass to metafy(). We have no existing uses of strnlen() but I don't foresee portability issues given that it is attributed to the 2008 POSIX standard and is supported in Solaris 10 which is from a few years prior to that. If needed, it'd be easy to provide an alternative implementation. To match the 33 character username, it does need to be truncated in $watch. last -w does manage to print the full username, would be good to know how. For the hostname, our code was using strlen() rather than sizeof(). I can't see why this would be needed. I would have tried putting UTF-8 in my hosts file to test that that but I'm only getting IP addresses in utmp. I guess we could do reverse lookups but it hardly seems worth it for the amount of use watch/log likely get these days. The example also uses an uppercase letter. Usernames on Unix are case-sensitive but it wouldn't surprise me if they aren't on Cygwin. If so, should we add #ifdefs for that? Oliver diff --git a/Src/Modules/watch.c b/Src/Modules/watch.c index 0de8cbf9a..2ad962fb6 100644 --- a/Src/Modules/watch.c +++ b/Src/Modules/watch.c @@ -423,20 +423,22 @@ watchlog2(int inout, WATCH_STRUCT_UTMP *u, char *fmt, int prnt, int fini) /* See if the watch entry matches */ static int -watchlog_match(char *teststr, char *actual, int len) +watchlog_match(char *teststr, char *actual, size_t buflen) { int ret = 0; Patprog pprog; char *str = dupstring(teststr); + int len = strnlen(actual, buflen); + char *user = metafy(actual, len, META_USEHEAP); tokenize(str); if ((pprog = patcompile(str, PAT_STATIC, 0))) { queue_signals(); - if (pattry(pprog, actual)) + if (pattry(pprog, user)) ret = 1; unqueue_signals(); - } else if (!strncmp(actual, teststr, len)) + } else if (!strcmp(user, teststr)) ret = 1; return ret; } @@ -488,7 +490,7 @@ watchlog(int inout, WATCH_STRUCT_UTMP *u, char **w, char *fmt) for (vv = ++v; *vv && *vv != '%'; vv++); sav = *vv; *vv = '\0'; - if (!watchlog_match(v, u->ut_host, strlen(v))) + if (!watchlog_match(v, u->ut_host, sizeof(u->ut_host))) bad = 1; *vv = sav; v = vv; ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: $watch, log and Cyrillic usernames 2023-10-07 2:05 ` Oliver Kiddle @ 2023-10-07 16:49 ` Максим 2023-10-08 0:24 ` Oliver Kiddle 2023-10-08 21:47 ` metafy() (was Re: $watch, log and Cyrillic usernames) Oliver Kiddle 1 sibling, 1 reply; 9+ messages in thread From: Максим @ 2023-10-07 16:49 UTC (permalink / raw) To: Oliver Kiddle; +Cc: zsh-workers [-- Attachment #1: Type: text/plain, Size: 4633 bytes --] Found out that there is no need in bypassing "adduser", as this command works just fine: % adduser --allow-all-names --home /root --shell /usr/bin/zsh Плохой So it is actually possible to create user with unicode username in a correct way (at least on Linux). Now about the patch: % watch=(all); log root has logged on /proc/10045/fd/2 from . Студент has logged on pts/29 from 127.0.0.1. % watch=(notme); log # Broken root has logged on /proc/10045/fd/2 from . Студент has logged on pts/29 from 127.0.0.1. % watch=(Студент); log Студент has logged on pts/29 from 127.0.0.1. Also tested "too long" username: % echo ${#USERNAME} 33 % watch=(all); log root has logged on /proc/10045/fd/2 from . oooooooooooooooooooooooooooooooo has logged on pts/29 from 127.0.0.1. % watch=(notme); log root has logged on /proc/10045/fd/2 from . % watch=(${(l:33::o:):-}); log % watch=(${(l:32::o:):-}); log oooooooooooooooooooooooooooooooo has logged on pts/29 from . And no, Cygwin usernames are case-sensitive. Sent via Gmail сб, 7 окт. 2023 г., 05:05 Oliver Kiddle <opk@zsh.org>: > Максим wrote: > > Hello again. I found another bug with cyrillic usernames in zsh (again on > > Cygwin, but can be reproduced on Linux) > > Reproducing does involve bypassing utilities like useradd which complain > about invalid usernames. But I can imagine such rules will increasingly > be relaxed and there's no reason for zsh to make assumptions. > > > % watch=(Студент); log # ("Студент" record is missing) > > The value from $watch is metafied and that's what patcompile() and > pattry() need so the fix below uses metafy() on the username from > utmp. > > However, in looking closer at the code I observed the existing use of > sizeof(u->ut_name) which is 32 on my system. So I tried creating 32 and > 33 character usernames (which, incidentally, useradd was happy with) and > as I suspected u->ut_name is not null-terminated for these. So the patch > uses strnlen() with the sizeof() for n to get the length to pass > to metafy(). We have no existing uses of strnlen() but I don't foresee > portability issues given that it is attributed to the 2008 POSIX > standard and is supported in Solaris 10 which is from a few years prior > to that. If needed, it'd be easy to provide an alternative > implementation. > > To match the 33 character username, it does need to be truncated in > $watch. last -w does manage to print the full username, would be good > to know how. For the hostname, our code was using strlen() rather than > sizeof(). I can't see why this would be needed. I would have tried > putting UTF-8 in my hosts file to test that that but I'm only getting IP > addresses in utmp. I guess we could do reverse lookups but it hardly > seems worth it for the amount of use watch/log likely get these days. > > The example also uses an uppercase letter. Usernames on Unix are > case-sensitive but it wouldn't surprise me if they aren't on Cygwin. > If so, should we add #ifdefs for that? > > Oliver > > diff --git a/Src/Modules/watch.c b/Src/Modules/watch.c > index 0de8cbf9a..2ad962fb6 100644 > --- a/Src/Modules/watch.c > +++ b/Src/Modules/watch.c > @@ -423,20 +423,22 @@ watchlog2(int inout, WATCH_STRUCT_UTMP *u, char > *fmt, int prnt, int fini) > /* See if the watch entry matches */ > > static int > -watchlog_match(char *teststr, char *actual, int len) > +watchlog_match(char *teststr, char *actual, size_t buflen) > { > int ret = 0; > Patprog pprog; > char *str = dupstring(teststr); > + int len = strnlen(actual, buflen); > + char *user = metafy(actual, len, META_USEHEAP); > > tokenize(str); > > if ((pprog = patcompile(str, PAT_STATIC, 0))) { > queue_signals(); > - if (pattry(pprog, actual)) > + if (pattry(pprog, user)) > ret = 1; > unqueue_signals(); > - } else if (!strncmp(actual, teststr, len)) > + } else if (!strcmp(user, teststr)) > ret = 1; > return ret; > } > @@ -488,7 +490,7 @@ watchlog(int inout, WATCH_STRUCT_UTMP *u, char **w, > char *fmt) > for (vv = ++v; *vv && *vv != '%'; vv++); > sav = *vv; > *vv = '\0'; > - if (!watchlog_match(v, u->ut_host, strlen(v))) > + if (!watchlog_match(v, u->ut_host, sizeof(u->ut_host))) > bad = 1; > *vv = sav; > v = vv; > [-- Attachment #2: Type: text/html, Size: 6666 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: $watch, log and Cyrillic usernames 2023-10-07 16:49 ` Максим @ 2023-10-08 0:24 ` Oliver Kiddle 2023-10-08 6:20 ` Максим 0 siblings, 1 reply; 9+ messages in thread From: Oliver Kiddle @ 2023-10-08 0:24 UTC (permalink / raw) To: Максим; +Cc: zsh-workers Максим wrote: > Found out that there is no need in bypassing "adduser", as this command works > just fine: These utilities can be quite distribution specific. There is no adduser command on my system but the old manual approach is always possible. I also looked into how last is able to print longer usernames and it's using sqlite and a library for access. The 32 char limit is specific to the getutxent() interface and could be different on other systems. > % watch=(notme); log # Broken > root has logged on /proc/10045/fd/2 from . > Студент has logged on pts/29 from 127.0.0.1. Thanks for testing. "notme" is handled separately and the patch below should be applied on top of the previous one. The w++ line is a minor change from the old behaviour - moving on from a notme entry. An alternative would be to move the return from just above to that line. The code is arguably now wasting time and memory metafying the same strings multiple times. We could convert the whole utmp structure up-front. Not sure that's worth the bother, though (?). Oliver diff --git a/Src/Modules/watch.c b/Src/Modules/watch.c index 2ad962fb6..d30f5ff98 100644 --- a/Src/Modules/watch.c +++ b/Src/Modules/watch.c @@ -458,10 +458,14 @@ watchlog(int inout, WATCH_STRUCT_UTMP *u, char **w, char *fmt) (void)watchlog2(inout, u, fmt, 1, 0); return; } - if (*w && !strcmp(*w, "notme") && - strncmp(u->ut_name, get_username(), sizeof(u->ut_name))) { - (void)watchlog2(inout, u, fmt, 1, 0); - return; + if (*w && !strcmp(*w, "notme")) { + char *username = metafy(u->ut_name, + strnlen(u->ut_name, sizeof(u->ut_name)), META_USEHEAP); + if (strcmp(username, get_username())) { + (void)watchlog2(inout, u, fmt, 1, 0); + return; + } + w++; } for (; *w; w++) { bad = 0; ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: $watch, log and Cyrillic usernames 2023-10-08 0:24 ` Oliver Kiddle @ 2023-10-08 6:20 ` Максим 0 siblings, 0 replies; 9+ messages in thread From: Максим @ 2023-10-08 6:20 UTC (permalink / raw) To: Oliver Kiddle; +Cc: zsh-workers [-- Attachment #1: Type: text/plain, Size: 2455 bytes --] Tested: % watch=(all); log root has logged on /proc/10045/fd/2 from . Студент has logged on pts/29 from 127.0.0.1. % watch=(notme); log root has logged on /proc/10045/fd/2 from . % watch=(Студент); log Студент has logged on pts/29 from 127.0.0.1. Everything works now, thanks Sent via Gmail вс, 8 окт. 2023 г., 03:24 Oliver Kiddle <opk@zsh.org>: > Максим wrote: > > Found out that there is no need in bypassing "adduser", as this command > works > > just fine: > > These utilities can be quite distribution specific. There is no adduser > command on my system but the old manual approach is always possible. I > also looked into how last is able to print longer usernames and it's > using sqlite and a library for access. The 32 char limit is specific to > the getutxent() interface and could be different on other systems. > > > % watch=(notme); log # Broken > > root has logged on /proc/10045/fd/2 from . > > Студент has logged on pts/29 from 127.0.0.1. > > Thanks for testing. "notme" is handled separately and the patch below > should be applied on top of the previous one. > > The w++ line is a minor change from the old behaviour - moving on from > a notme entry. An alternative would be to move the return from just > above to that line. > > The code is arguably now wasting time and memory metafying the same > strings multiple times. We could convert the whole utmp structure > up-front. Not sure that's worth the bother, though (?). > > Oliver > > diff --git a/Src/Modules/watch.c b/Src/Modules/watch.c > index 2ad962fb6..d30f5ff98 100644 > --- a/Src/Modules/watch.c > +++ b/Src/Modules/watch.c > @@ -458,10 +458,14 @@ watchlog(int inout, WATCH_STRUCT_UTMP *u, char **w, > char *fmt) > (void)watchlog2(inout, u, fmt, 1, 0); > return; > } > - if (*w && !strcmp(*w, "notme") && > - strncmp(u->ut_name, get_username(), sizeof(u->ut_name))) { > - (void)watchlog2(inout, u, fmt, 1, 0); > - return; > + if (*w && !strcmp(*w, "notme")) { > + char *username = metafy(u->ut_name, > + strnlen(u->ut_name, sizeof(u->ut_name)), META_USEHEAP); > + if (strcmp(username, get_username())) { > + (void)watchlog2(inout, u, fmt, 1, 0); > + return; > + } > + w++; > } > for (; *w; w++) { > bad = 0; > [-- Attachment #2: Type: text/html, Size: 3471 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
* metafy() (was Re: $watch, log and Cyrillic usernames) 2023-10-07 2:05 ` Oliver Kiddle 2023-10-07 16:49 ` Максим @ 2023-10-08 21:47 ` Oliver Kiddle 2023-10-09 2:01 ` Bart Schaefer 1 sibling, 1 reply; 9+ messages in thread From: Oliver Kiddle @ 2023-10-08 21:47 UTC (permalink / raw) To: zsh-workers I wrote: > However, in looking closer at the code I observed the existing use of > sizeof(u->ut_name) which is 32 on my system. So I tried creating 32 and > 33 character usernames (which, incidentally, useradd was happy with) and It occurred to be to double check what metafy() does about null termination with META_USEHEAP. The comment by META_USEHEAP is "get memory from the heap. This leaves buf unchanged". The main way it differs from META_HEAPDUP is that if no characters that need metafying are found, it will return back the original passed buf. However, it does add a terminating null at the len + 1 position so while the buf pointer is unchanged, what it points to does get changed. There aren't especially many calls to metafy with META_USEHEAP and in most cases, the call uses the result of getkeystring(). I noticed one case where we do need to add 1 byte to an allocation to accomodate this null. I'm not sure what the best approach is for the watch module. Subtracting 1 from n in each call to strnlen() avoids writing a null past the end of the buffer but is not ideal for 32 character usernames. Using META_HEAPDUP instead means a lot of heap allocations in the normal case where there are only short ASCII-only usernames. Any ideas? Oliver diff --git a/Src/subst.c b/Src/subst.c index cdbfc138a..60d850feb 100644 --- a/Src/subst.c +++ b/Src/subst.c @@ -1506,7 +1506,7 @@ substevalchar(char *ptr) } #ifdef MULTIBYTE_SUPPORT else if (isset(MULTIBYTE) && ires > 127) { - ptr = zhalloc(MB_CUR_MAX); + ptr = zhalloc(MB_CUR_MAX+1); len = ucs4tomb((unsigned int)ires & 0xffffffff, ptr); } if (len <= 0) ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: metafy() (was Re: $watch, log and Cyrillic usernames) 2023-10-08 21:47 ` metafy() (was Re: $watch, log and Cyrillic usernames) Oliver Kiddle @ 2023-10-09 2:01 ` Bart Schaefer 2023-10-10 21:46 ` Oliver Kiddle 0 siblings, 1 reply; 9+ messages in thread From: Bart Schaefer @ 2023-10-09 2:01 UTC (permalink / raw) To: zsh-workers On Sun, Oct 8, 2023 at 2:47 PM Oliver Kiddle <opk@zsh.org> wrote: > > I'm not sure what the best approach is for the watch module. Subtracting 1 > from n in each call to strnlen() avoids writing a null past the end of > the buffer but is not ideal for 32 character usernames. Using > META_HEAPDUP instead means a lot of heap allocations in the normal case > where there are only short ASCII-only usernames. Any ideas? Handle the special case explicitly? If I follow your explanation, something like this? int len = strnlen(u->ut_name, sizeof(u->ut_name)); char *username = metafy(u->ut_name, len, (len == sizeof(u->ut_name) ? META_HEAPDUP /* allow for nul terminator */ : META_USEHEAP)); ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: metafy() (was Re: $watch, log and Cyrillic usernames) 2023-10-09 2:01 ` Bart Schaefer @ 2023-10-10 21:46 ` Oliver Kiddle 0 siblings, 0 replies; 9+ messages in thread From: Oliver Kiddle @ 2023-10-10 21:46 UTC (permalink / raw) To: Bart Schaefer; +Cc: zsh-workers Bart Schaefer wrote: > Handle the special case explicitly? If I follow your explanation, > something like this? Thanks. That works. Full patch follows. Oliver diff --git a/Src/Modules/watch.c b/Src/Modules/watch.c index 0de8cbf9a..97d4fa608 100644 --- a/Src/Modules/watch.c +++ b/Src/Modules/watch.c @@ -423,20 +423,23 @@ watchlog2(int inout, WATCH_STRUCT_UTMP *u, char *fmt, int prnt, int fini) /* See if the watch entry matches */ static int -watchlog_match(char *teststr, char *actual, int len) +watchlog_match(char *teststr, char *actual, size_t buflen) { int ret = 0; Patprog pprog; char *str = dupstring(teststr); + int len = strnlen(actual, buflen); + char *user = metafy(actual, len, + len == buflen ? META_HEAPDUP : META_USEHEAP); tokenize(str); if ((pprog = patcompile(str, PAT_STATIC, 0))) { queue_signals(); - if (pattry(pprog, actual)) + if (pattry(pprog, user)) ret = 1; unqueue_signals(); - } else if (!strncmp(actual, teststr, len)) + } else if (!strcmp(user, teststr)) ret = 1; return ret; } @@ -456,10 +459,17 @@ watchlog(int inout, WATCH_STRUCT_UTMP *u, char **w, char *fmt) (void)watchlog2(inout, u, fmt, 1, 0); return; } - if (*w && !strcmp(*w, "notme") && - strncmp(u->ut_name, get_username(), sizeof(u->ut_name))) { - (void)watchlog2(inout, u, fmt, 1, 0); - return; + if (*w && !strcmp(*w, "notme")) { + int len = strnlen(u->ut_name, sizeof(u->ut_name)); + char *username = metafy(u->ut_name, len, + (len == sizeof(u->ut_name) ? + META_HEAPDUP /* allow for nul terminator */ : + META_USEHEAP)); + if (strcmp(username, get_username())) { + (void)watchlog2(inout, u, fmt, 1, 0); + return; + } + w++; } for (; *w; w++) { bad = 0; @@ -488,7 +498,7 @@ watchlog(int inout, WATCH_STRUCT_UTMP *u, char **w, char *fmt) for (vv = ++v; *vv && *vv != '%'; vv++); sav = *vv; *vv = '\0'; - if (!watchlog_match(v, u->ut_host, strlen(v))) + if (!watchlog_match(v, u->ut_host, sizeof(u->ut_host))) bad = 1; *vv = sav; v = vv; ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2023-10-10 21:46 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-10-05 21:29 $watch, log and Cyrillic usernames Максим 2023-10-05 21:42 ` Bart Schaefer 2023-10-07 2:05 ` Oliver Kiddle 2023-10-07 16:49 ` Максим 2023-10-08 0:24 ` Oliver Kiddle 2023-10-08 6:20 ` Максим 2023-10-08 21:47 ` metafy() (was Re: $watch, log and Cyrillic usernames) Oliver Kiddle 2023-10-09 2:01 ` Bart Schaefer 2023-10-10 21:46 ` Oliver Kiddle
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/zsh/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).