zsh-workers
 help / color / mirror / code / Atom feed
* Using the history expansion modifier 'a' results in excessive lstat calls
@ 2022-05-20 10:08 Stefan Radziuk
  2022-05-20 10:15 ` Peter Stephenson
  2022-05-20 10:34 ` Mikael Magnusson
  0 siblings, 2 replies; 8+ messages in thread
From: Stefan Radziuk @ 2022-05-20 10:08 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 2156 bytes --]

zsh uses zgetdir to evaluate ${FOO:a}. When in a directory in a different filesystem to its parent,  zgetdir will lstat many entries in the parent. This leads to slowness, especially when some of the lstatted entries are on remote filesystems.



This behaviour can be avoided by making zsh use getcwd instead (build zsh with USE_GETCWD). Interestingly, getcwd will also be used in the regular zsh build if zgetdir fails (see the implementation of zgetcwd).



I have looked through some threads in the mailing list to find out why zgetdir is being used over getcwd in the first place. It seems it was implemented this way in the 1990s to work around a bug in some implementations of getcwd, which should not be a concern on modern systems.



I was wondering if zgetdir is still the right way to do this? Are there use cases where it is preferred over getcwd? Maybe zgetcwd could simply use getcwd primarily (i.e. not as fallback), or perhaps USE_GETCWD could be enabled by default (currently it is only enabled on QNX builds).

________________________________
This communication is intended only for the addressee(s), may contain confidential, privileged or proprietary information, and may be protected by US and other laws. Your acceptance of this communication constitutes your agreement to keep confidential all the confidential information contained in this communication, as well as any information derived by you from the confidential information contained in this communication. We do not waive any confidentiality by misdelivery.

If you receive this communication in error, any use, dissemination, printing or copying is strictly prohibited; please destroy all electronic and paper copies and notify the sender immediately. Nothing in this email is intended to constitute (1) investment or trading advice or recommendations or any advertisement or (2) a solicitation of an investment in any jurisdiction in which such a solicitation would be unlawful.

Please note that PDT Partners UK, LLP, including its affiliates, reserves the right to intercept, archive, monitor and review all communications to and from its network.

[-- Attachment #2: Type: text/html, Size: 5019 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Using the history expansion modifier 'a' results in excessive lstat calls
  2022-05-20 10:08 Using the history expansion modifier 'a' results in excessive lstat calls Stefan Radziuk
@ 2022-05-20 10:15 ` Peter Stephenson
  2022-05-25  9:53   ` Peter Stephenson
  2022-05-20 10:34 ` Mikael Magnusson
  1 sibling, 1 reply; 8+ messages in thread
From: Peter Stephenson @ 2022-05-20 10:15 UTC (permalink / raw)
  To: Stefan Radziuk, zsh-workers

On 20 May 2022 at 11:08 Stefan Radziuk <sradziuk@pdtpartners.com> wrote:
> I have looked through some threads in the mailing list to find out why
> zgetdir > is being used over getcwd in the first place. It seems it was
> implemented this way in the 1990s to work around a bug in some implementations
> of getcwd, which should not be a concern on modern systems. 
> 
> I was wondering if zgetdir is still the right way to do this? Are there
> use cases where it is preferred over getcwd? Maybe zgetcwd could simply
> use getcwd primarily (i.e. not as fallback), or perhaps USE_GETCWD could
> be enabled by default (currently it is only enabled on QNX builds).

The best bet would probably be to try that by default once we're sure the
release is out of the way --- as you say there's a good chance that's the
right way to go now and I don't think we're going to find out the problems
any other way.

pws


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Using the history expansion modifier 'a' results in excessive lstat calls
  2022-05-20 10:08 Using the history expansion modifier 'a' results in excessive lstat calls Stefan Radziuk
  2022-05-20 10:15 ` Peter Stephenson
@ 2022-05-20 10:34 ` Mikael Magnusson
  2022-05-23 11:32   ` (EXT) " Stefan Radziuk
  1 sibling, 1 reply; 8+ messages in thread
From: Mikael Magnusson @ 2022-05-20 10:34 UTC (permalink / raw)
  To: Stefan Radziuk; +Cc: zsh-workers

On 5/20/22, Stefan Radziuk <sradziuk@pdtpartners.com> wrote:
> zsh uses zgetdir to evaluate ${FOO:a}. When in a directory in a different
> filesystem to its parent,  zgetdir will lstat many entries in the parent.
> This leads to slowness, especially when some of the lstatted entries are on
> remote filesystems.
>
>
>
> This behaviour can be avoided by making zsh use getcwd instead (build zsh
> with USE_GETCWD). Interestingly, getcwd will also be used in the regular zsh
> build if zgetdir fails (see the implementation of zgetcwd).
>
>
>
> I have looked through some threads in the mailing list to find out why
> zgetdir is being used over getcwd in the first place. It seems it was
> implemented this way in the 1990s to work around a bug in some
> implementations of getcwd, which should not be a concern on modern systems.
>
>
>
> I was wondering if zgetdir is still the right way to do this? Are there use
> cases where it is preferred over getcwd? Maybe zgetcwd could simply use
> getcwd primarily (i.e. not as fallback), or perhaps USE_GETCWD could be
> enabled by default (currently it is only enabled on QNX builds).

Do you get the same results with :P ? :a and :A are not guaranteed to
result in a path leading to the same file (eg, symlink/../ will be
simply deleted without following the symlink etc)

-- 
Mikael Magnusson


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: (EXT) Re: Using the history expansion modifier 'a' results in excessive lstat calls
  2022-05-20 10:34 ` Mikael Magnusson
@ 2022-05-23 11:32   ` Stefan Radziuk
  0 siblings, 0 replies; 8+ messages in thread
From: Stefan Radziuk @ 2022-05-23 11:32 UTC (permalink / raw)
  Cc: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 3069 bytes --]

> Do you get the same results with :P ? :a and :A are not guaranteed to
> result in a path leading to the same file (eg, symlink/../ will be
> simply deleted without following the symlink etc)

Yes, running with strace confirms that :a,A,P all exhibit the same lstatting behaviour (they all call zgetcwd).


From: Mikael Magnusson <mikachu@gmail.com>
Date: Friday, May 20, 2022 at 11:35 AM
To: Stefan Radziuk <sradziuk@pdtpartners.com>
Cc: zsh-workers@zsh.org <zsh-workers@zsh.org>
Subject: (EXT) Re: Using the history expansion modifier 'a' results in excessive lstat calls
On 5/20/22, Stefan Radziuk <sradziuk@pdtpartners.com> wrote:
> zsh uses zgetdir to evaluate ${FOO:a}. When in a directory in a different
> filesystem to its parent,  zgetdir will lstat many entries in the parent.
> This leads to slowness, especially when some of the lstatted entries are on
> remote filesystems.
>
>
>
> This behaviour can be avoided by making zsh use getcwd instead (build zsh
> with USE_GETCWD). Interestingly, getcwd will also be used in the regular zsh
> build if zgetdir fails (see the implementation of zgetcwd).
>
>
>
> I have looked through some threads in the mailing list to find out why
> zgetdir is being used over getcwd in the first place. It seems it was
> implemented this way in the 1990s to work around a bug in some
> implementations of getcwd, which should not be a concern on modern systems.
>
>
>
> I was wondering if zgetdir is still the right way to do this? Are there use
> cases where it is preferred over getcwd? Maybe zgetcwd could simply use
> getcwd primarily (i.e. not as fallback), or perhaps USE_GETCWD could be
> enabled by default (currently it is only enabled on QNX builds).

Do you get the same results with :P ? :a and :A are not guaranteed to
result in a path leading to the same file (eg, symlink/../ will be
simply deleted without following the symlink etc)

--
Mikael Magnusson
________________________________
This communication is intended only for the addressee(s), may contain confidential, privileged or proprietary information, and may be protected by US and other laws. Your acceptance of this communication constitutes your agreement to keep confidential all the confidential information contained in this communication, as well as any information derived by you from the confidential information contained in this communication. We do not waive any confidentiality by misdelivery.

If you receive this communication in error, any use, dissemination, printing or copying is strictly prohibited; please destroy all electronic and paper copies and notify the sender immediately. Nothing in this email is intended to constitute (1) investment or trading advice or recommendations or any advertisement or (2) a solicitation of an investment in any jurisdiction in which such a solicitation would be unlawful.

Please note that PDT Partners UK, LLP, including its affiliates, reserves the right to intercept, archive, monitor and review all communications to and from its network.

[-- Attachment #2: Type: text/html, Size: 8199 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Using the history expansion modifier 'a' results in excessive lstat calls
  2022-05-20 10:15 ` Peter Stephenson
@ 2022-05-25  9:53   ` Peter Stephenson
  2022-05-25 10:22     ` (EXT) " Stefan Radziuk
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Stephenson @ 2022-05-25  9:53 UTC (permalink / raw)
  To: Stefan Radziuk, zsh-workers

> On 20 May 2022 at 11:15 Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> On 20 May 2022 at 11:08 Stefan Radziuk <sradziuk@pdtpartners.com> wrote:
> > I have looked through some threads in the mailing list to find out why
> > zgetdir > is being used over getcwd in the first place. It seems it was
> > implemented this way in the 1990s to work around a bug in some implementations
> > of getcwd, which should not be a concern on modern systems. 
> > 
> > I was wondering if zgetdir is still the right way to do this? Are there
> > use cases where it is preferred over getcwd? Maybe zgetcwd could simply
> > use getcwd primarily (i.e. not as fallback), or perhaps USE_GETCWD could
> > be enabled by default (currently it is only enabled on QNX builds).
> 
> The best bet would probably be to try that by default once we're sure the
> release is out of the way --- as you say there's a good chance that's the
> right way to go now and I don't think we're going to find out the problems
> any other way.

Will it be worth seeing what effects the following has?

pws

diff --git a/configure.ac b/configure.ac
index c72148d06..77e381f50 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2021,11 +2021,15 @@ if test x$zsh_cv_sys_superroot = xyes; then
 fi
 
 dnl CHECK FOR SYSTEMS REQUIRING GETCWD
+dnl This is now turned on by default, as we expect modern getcwd
+dnl implementations to work correctly.  Any exceptions should be added
+dnl to the first case.  Currently there are none, hence it is forced
+dnl not to match.
 AC_CACHE_CHECK(whether we should use the native getcwd,
 zsh_cv_use_getcwd,
 [case "${host_cpu}-${host_vendor}-${host_os}" in
-    *QNX*) zsh_cv_use_getcwd=yes ;;
-    *) zsh_cv_use_getcwd=no ;;
+    *NOMATCH*) zsh_cv_use_getcwd=no ;;
+    *) zsh_cv_use_getcwd=yes ;;
  esac])
 AH_TEMPLATE([USE_GETCWD],
 [Define to 1 if you need to use the native getcwd.])


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: (EXT) Re: Using the history expansion modifier 'a' results in excessive lstat calls
  2022-05-25  9:53   ` Peter Stephenson
@ 2022-05-25 10:22     ` Stefan Radziuk
  2022-05-25 10:33       ` Peter Stephenson
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Radziuk @ 2022-05-25 10:22 UTC (permalink / raw)
  To: Peter Stephenson, zsh-workers

[-- Attachment #1: Type: text/plain, Size: 3445 bytes --]

> Will it be worth seeing what effects the following has?

This resolves the issue for me -- verified with strace. Thanks.

From: Peter Stephenson <p.w.stephenson@ntlworld.com>
Date: Wednesday, May 25, 2022 at 10:53 AM
To: Stefan Radziuk <sradziuk@pdtpartners.com>, zsh-workers@zsh.org <zsh-workers@zsh.org>
Subject: (EXT) Re: Using the history expansion modifier 'a' results in excessive lstat calls
> On 20 May 2022 at 11:15 Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> On 20 May 2022 at 11:08 Stefan Radziuk <sradziuk@pdtpartners.com> wrote:
> > I have looked through some threads in the mailing list to find out why
> > zgetdir > is being used over getcwd in the first place. It seems it was
> > implemented this way in the 1990s to work around a bug in some implementations
> > of getcwd, which should not be a concern on modern systems.
> >
> > I was wondering if zgetdir is still the right way to do this? Are there
> > use cases where it is preferred over getcwd? Maybe zgetcwd could simply
> > use getcwd primarily (i.e. not as fallback), or perhaps USE_GETCWD could
> > be enabled by default (currently it is only enabled on QNX builds).
>
> The best bet would probably be to try that by default once we're sure the
> release is out of the way --- as you say there's a good chance that's the
> right way to go now and I don't think we're going to find out the problems
> any other way.

Will it be worth seeing what effects the following has?

pws

diff --git a/configure.ac b/configure.ac
index c72148d06..77e381f50 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2021,11 +2021,15 @@ if test x$zsh_cv_sys_superroot = xyes; then
 fi

 dnl CHECK FOR SYSTEMS REQUIRING GETCWD
+dnl This is now turned on by default, as we expect modern getcwd
+dnl implementations to work correctly.  Any exceptions should be added
+dnl to the first case.  Currently there are none, hence it is forced
+dnl not to match.
 AC_CACHE_CHECK(whether we should use the native getcwd,
 zsh_cv_use_getcwd,
 [case "${host_cpu}-${host_vendor}-${host_os}" in
-    *QNX*) zsh_cv_use_getcwd=yes ;;
-    *) zsh_cv_use_getcwd=no ;;
+    *NOMATCH*) zsh_cv_use_getcwd=no ;;
+    *) zsh_cv_use_getcwd=yes ;;
  esac])
 AH_TEMPLATE([USE_GETCWD],
 [Define to 1 if you need to use the native getcwd.])
________________________________
This communication is intended only for the addressee(s), may contain confidential, privileged or proprietary information, and may be protected by US and other laws. Your acceptance of this communication constitutes your agreement to keep confidential all the confidential information contained in this communication, as well as any information derived by you from the confidential information contained in this communication. We do not waive any confidentiality by misdelivery.

If you receive this communication in error, any use, dissemination, printing or copying is strictly prohibited; please destroy all electronic and paper copies and notify the sender immediately. Nothing in this email is intended to constitute (1) investment or trading advice or recommendations or any advertisement or (2) a solicitation of an investment in any jurisdiction in which such a solicitation would be unlawful.

Please note that PDT Partners UK, LLP, including its affiliates, reserves the right to intercept, archive, monitor and review all communications to and from its network.

[-- Attachment #2: Type: text/html, Size: 10358 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: (EXT) Re: Using the history expansion modifier 'a' results in excessive lstat calls
  2022-05-25 10:22     ` (EXT) " Stefan Radziuk
@ 2022-05-25 10:33       ` Peter Stephenson
  2022-05-26  8:35         ` Peter Stephenson
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Stephenson @ 2022-05-25 10:33 UTC (permalink / raw)
  To: zsh-workers

> On 25 May 2022 at 11:22 Stefan Radziuk <sradziuk@pdtpartners.com> wrote:
> > Will it be worth seeing what effects the following has?
> 
> This resolves the issue for me -- verified with strace. Thanks.
>
>  dnl CHECK FOR SYSTEMS REQUIRING GETCWD
> +dnl This is now turned on by default, as we expect modern getcwd
> +dnl implementations to work correctly.  Any exceptions should be added
> +dnl to the first case.  Currently there are none, hence it is forced
> +dnl not to match.

Unless anyone has strong views, I'll commit this --- hoping at the most
we'll be look at edge effects on older or rarer systems which we're not
going to track down any other way.

pws


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: (EXT) Re: Using the history expansion modifier 'a' results in excessive lstat calls
  2022-05-25 10:33       ` Peter Stephenson
@ 2022-05-26  8:35         ` Peter Stephenson
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Stephenson @ 2022-05-26  8:35 UTC (permalink / raw)
  To: zsh-workers

> On 25 May 2022 at 11:33 Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> > On 25 May 2022 at 11:22 Stefan Radziuk <sradziuk@pdtpartners.com> wrote:
> > > Will it be worth seeing what effects the following has?
> > 
> > This resolves the issue for me -- verified with strace. Thanks.
> >
> >  dnl CHECK FOR SYSTEMS REQUIRING GETCWD
> > +dnl This is now turned on by default, as we expect modern getcwd
> > +dnl implementations to work correctly.  Any exceptions should be added
> > +dnl to the first case.  Currently there are none, hence it is forced
> > +dnl not to match.
> 
> Unless anyone has strong views, I'll commit this --- hoping at the most
> we'll be look at edge effects on older or rarer systems which we're not
> going to track down any other way.

Not hearing comments, so this is committed.  Workers with such systems
are advised to keep watch for any oddities (very likely edge cases) in
getting the current directory.

pws


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-05-26  8:35 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-20 10:08 Using the history expansion modifier 'a' results in excessive lstat calls Stefan Radziuk
2022-05-20 10:15 ` Peter Stephenson
2022-05-25  9:53   ` Peter Stephenson
2022-05-25 10:22     ` (EXT) " Stefan Radziuk
2022-05-25 10:33       ` Peter Stephenson
2022-05-26  8:35         ` Peter Stephenson
2022-05-20 10:34 ` Mikael Magnusson
2022-05-23 11:32   ` (EXT) " Stefan Radziuk

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).