[-- Attachment #1: Type: text/plain, Size: 2156 bytes --] zsh uses zgetdir to evaluate ${FOO:a}. When in a directory in a different filesystem to its parent, zgetdir will lstat many entries in the parent. This leads to slowness, especially when some of the lstatted entries are on remote filesystems. This behaviour can be avoided by making zsh use getcwd instead (build zsh with USE_GETCWD). Interestingly, getcwd will also be used in the regular zsh build if zgetdir fails (see the implementation of zgetcwd). I have looked through some threads in the mailing list to find out why zgetdir is being used over getcwd in the first place. It seems it was implemented this way in the 1990s to work around a bug in some implementations of getcwd, which should not be a concern on modern systems. I was wondering if zgetdir is still the right way to do this? Are there use cases where it is preferred over getcwd? Maybe zgetcwd could simply use getcwd primarily (i.e. not as fallback), or perhaps USE_GETCWD could be enabled by default (currently it is only enabled on QNX builds). ________________________________ This communication is intended only for the addressee(s), may contain confidential, privileged or proprietary information, and may be protected by US and other laws. Your acceptance of this communication constitutes your agreement to keep confidential all the confidential information contained in this communication, as well as any information derived by you from the confidential information contained in this communication. We do not waive any confidentiality by misdelivery. If you receive this communication in error, any use, dissemination, printing or copying is strictly prohibited; please destroy all electronic and paper copies and notify the sender immediately. Nothing in this email is intended to constitute (1) investment or trading advice or recommendations or any advertisement or (2) a solicitation of an investment in any jurisdiction in which such a solicitation would be unlawful. Please note that PDT Partners UK, LLP, including its affiliates, reserves the right to intercept, archive, monitor and review all communications to and from its network. [-- Attachment #2: Type: text/html, Size: 5019 bytes --]
On 20 May 2022 at 11:08 Stefan Radziuk <sradziuk@pdtpartners.com> wrote:
> I have looked through some threads in the mailing list to find out why
> zgetdir > is being used over getcwd in the first place. It seems it was
> implemented this way in the 1990s to work around a bug in some implementations
> of getcwd, which should not be a concern on modern systems.
>
> I was wondering if zgetdir is still the right way to do this? Are there
> use cases where it is preferred over getcwd? Maybe zgetcwd could simply
> use getcwd primarily (i.e. not as fallback), or perhaps USE_GETCWD could
> be enabled by default (currently it is only enabled on QNX builds).
The best bet would probably be to try that by default once we're sure the
release is out of the way --- as you say there's a good chance that's the
right way to go now and I don't think we're going to find out the problems
any other way.
pws
On 5/20/22, Stefan Radziuk <sradziuk@pdtpartners.com> wrote:
> zsh uses zgetdir to evaluate ${FOO:a}. When in a directory in a different
> filesystem to its parent, zgetdir will lstat many entries in the parent.
> This leads to slowness, especially when some of the lstatted entries are on
> remote filesystems.
>
>
>
> This behaviour can be avoided by making zsh use getcwd instead (build zsh
> with USE_GETCWD). Interestingly, getcwd will also be used in the regular zsh
> build if zgetdir fails (see the implementation of zgetcwd).
>
>
>
> I have looked through some threads in the mailing list to find out why
> zgetdir is being used over getcwd in the first place. It seems it was
> implemented this way in the 1990s to work around a bug in some
> implementations of getcwd, which should not be a concern on modern systems.
>
>
>
> I was wondering if zgetdir is still the right way to do this? Are there use
> cases where it is preferred over getcwd? Maybe zgetcwd could simply use
> getcwd primarily (i.e. not as fallback), or perhaps USE_GETCWD could be
> enabled by default (currently it is only enabled on QNX builds).
Do you get the same results with :P ? :a and :A are not guaranteed to
result in a path leading to the same file (eg, symlink/../ will be
simply deleted without following the symlink etc)
--
Mikael Magnusson
[-- Attachment #1: Type: text/plain, Size: 3069 bytes --] > Do you get the same results with :P ? :a and :A are not guaranteed to > result in a path leading to the same file (eg, symlink/../ will be > simply deleted without following the symlink etc) Yes, running with strace confirms that :a,A,P all exhibit the same lstatting behaviour (they all call zgetcwd). From: Mikael Magnusson <mikachu@gmail.com> Date: Friday, May 20, 2022 at 11:35 AM To: Stefan Radziuk <sradziuk@pdtpartners.com> Cc: zsh-workers@zsh.org <zsh-workers@zsh.org> Subject: (EXT) Re: Using the history expansion modifier 'a' results in excessive lstat calls On 5/20/22, Stefan Radziuk <sradziuk@pdtpartners.com> wrote: > zsh uses zgetdir to evaluate ${FOO:a}. When in a directory in a different > filesystem to its parent, zgetdir will lstat many entries in the parent. > This leads to slowness, especially when some of the lstatted entries are on > remote filesystems. > > > > This behaviour can be avoided by making zsh use getcwd instead (build zsh > with USE_GETCWD). Interestingly, getcwd will also be used in the regular zsh > build if zgetdir fails (see the implementation of zgetcwd). > > > > I have looked through some threads in the mailing list to find out why > zgetdir is being used over getcwd in the first place. It seems it was > implemented this way in the 1990s to work around a bug in some > implementations of getcwd, which should not be a concern on modern systems. > > > > I was wondering if zgetdir is still the right way to do this? Are there use > cases where it is preferred over getcwd? Maybe zgetcwd could simply use > getcwd primarily (i.e. not as fallback), or perhaps USE_GETCWD could be > enabled by default (currently it is only enabled on QNX builds). Do you get the same results with :P ? :a and :A are not guaranteed to result in a path leading to the same file (eg, symlink/../ will be simply deleted without following the symlink etc) -- Mikael Magnusson ________________________________ This communication is intended only for the addressee(s), may contain confidential, privileged or proprietary information, and may be protected by US and other laws. Your acceptance of this communication constitutes your agreement to keep confidential all the confidential information contained in this communication, as well as any information derived by you from the confidential information contained in this communication. We do not waive any confidentiality by misdelivery. If you receive this communication in error, any use, dissemination, printing or copying is strictly prohibited; please destroy all electronic and paper copies and notify the sender immediately. Nothing in this email is intended to constitute (1) investment or trading advice or recommendations or any advertisement or (2) a solicitation of an investment in any jurisdiction in which such a solicitation would be unlawful. Please note that PDT Partners UK, LLP, including its affiliates, reserves the right to intercept, archive, monitor and review all communications to and from its network. [-- Attachment #2: Type: text/html, Size: 8199 bytes --]
> On 20 May 2022 at 11:15 Peter Stephenson <p.w.stephenson@ntlworld.com> wrote: > On 20 May 2022 at 11:08 Stefan Radziuk <sradziuk@pdtpartners.com> wrote: > > I have looked through some threads in the mailing list to find out why > > zgetdir > is being used over getcwd in the first place. It seems it was > > implemented this way in the 1990s to work around a bug in some implementations > > of getcwd, which should not be a concern on modern systems. > > > > I was wondering if zgetdir is still the right way to do this? Are there > > use cases where it is preferred over getcwd? Maybe zgetcwd could simply > > use getcwd primarily (i.e. not as fallback), or perhaps USE_GETCWD could > > be enabled by default (currently it is only enabled on QNX builds). > > The best bet would probably be to try that by default once we're sure the > release is out of the way --- as you say there's a good chance that's the > right way to go now and I don't think we're going to find out the problems > any other way. Will it be worth seeing what effects the following has? pws diff --git a/configure.ac b/configure.ac index c72148d06..77e381f50 100644 --- a/configure.ac +++ b/configure.ac @@ -2021,11 +2021,15 @@ if test x$zsh_cv_sys_superroot = xyes; then fi dnl CHECK FOR SYSTEMS REQUIRING GETCWD +dnl This is now turned on by default, as we expect modern getcwd +dnl implementations to work correctly. Any exceptions should be added +dnl to the first case. Currently there are none, hence it is forced +dnl not to match. AC_CACHE_CHECK(whether we should use the native getcwd, zsh_cv_use_getcwd, [case "${host_cpu}-${host_vendor}-${host_os}" in - *QNX*) zsh_cv_use_getcwd=yes ;; - *) zsh_cv_use_getcwd=no ;; + *NOMATCH*) zsh_cv_use_getcwd=no ;; + *) zsh_cv_use_getcwd=yes ;; esac]) AH_TEMPLATE([USE_GETCWD], [Define to 1 if you need to use the native getcwd.])
[-- Attachment #1: Type: text/plain, Size: 3445 bytes --] > Will it be worth seeing what effects the following has? This resolves the issue for me -- verified with strace. Thanks. From: Peter Stephenson <p.w.stephenson@ntlworld.com> Date: Wednesday, May 25, 2022 at 10:53 AM To: Stefan Radziuk <sradziuk@pdtpartners.com>, zsh-workers@zsh.org <zsh-workers@zsh.org> Subject: (EXT) Re: Using the history expansion modifier 'a' results in excessive lstat calls > On 20 May 2022 at 11:15 Peter Stephenson <p.w.stephenson@ntlworld.com> wrote: > On 20 May 2022 at 11:08 Stefan Radziuk <sradziuk@pdtpartners.com> wrote: > > I have looked through some threads in the mailing list to find out why > > zgetdir > is being used over getcwd in the first place. It seems it was > > implemented this way in the 1990s to work around a bug in some implementations > > of getcwd, which should not be a concern on modern systems. > > > > I was wondering if zgetdir is still the right way to do this? Are there > > use cases where it is preferred over getcwd? Maybe zgetcwd could simply > > use getcwd primarily (i.e. not as fallback), or perhaps USE_GETCWD could > > be enabled by default (currently it is only enabled on QNX builds). > > The best bet would probably be to try that by default once we're sure the > release is out of the way --- as you say there's a good chance that's the > right way to go now and I don't think we're going to find out the problems > any other way. Will it be worth seeing what effects the following has? pws diff --git a/configure.ac b/configure.ac index c72148d06..77e381f50 100644 --- a/configure.ac +++ b/configure.ac @@ -2021,11 +2021,15 @@ if test x$zsh_cv_sys_superroot = xyes; then fi dnl CHECK FOR SYSTEMS REQUIRING GETCWD +dnl This is now turned on by default, as we expect modern getcwd +dnl implementations to work correctly. Any exceptions should be added +dnl to the first case. Currently there are none, hence it is forced +dnl not to match. AC_CACHE_CHECK(whether we should use the native getcwd, zsh_cv_use_getcwd, [case "${host_cpu}-${host_vendor}-${host_os}" in - *QNX*) zsh_cv_use_getcwd=yes ;; - *) zsh_cv_use_getcwd=no ;; + *NOMATCH*) zsh_cv_use_getcwd=no ;; + *) zsh_cv_use_getcwd=yes ;; esac]) AH_TEMPLATE([USE_GETCWD], [Define to 1 if you need to use the native getcwd.]) ________________________________ This communication is intended only for the addressee(s), may contain confidential, privileged or proprietary information, and may be protected by US and other laws. Your acceptance of this communication constitutes your agreement to keep confidential all the confidential information contained in this communication, as well as any information derived by you from the confidential information contained in this communication. We do not waive any confidentiality by misdelivery. If you receive this communication in error, any use, dissemination, printing or copying is strictly prohibited; please destroy all electronic and paper copies and notify the sender immediately. Nothing in this email is intended to constitute (1) investment or trading advice or recommendations or any advertisement or (2) a solicitation of an investment in any jurisdiction in which such a solicitation would be unlawful. Please note that PDT Partners UK, LLP, including its affiliates, reserves the right to intercept, archive, monitor and review all communications to and from its network. [-- Attachment #2: Type: text/html, Size: 10358 bytes --]
> On 25 May 2022 at 11:22 Stefan Radziuk <sradziuk@pdtpartners.com> wrote:
> > Will it be worth seeing what effects the following has?
>
> This resolves the issue for me -- verified with strace. Thanks.
>
> dnl CHECK FOR SYSTEMS REQUIRING GETCWD
> +dnl This is now turned on by default, as we expect modern getcwd
> +dnl implementations to work correctly. Any exceptions should be added
> +dnl to the first case. Currently there are none, hence it is forced
> +dnl not to match.
Unless anyone has strong views, I'll commit this --- hoping at the most
we'll be look at edge effects on older or rarer systems which we're not
going to track down any other way.
pws
> On 25 May 2022 at 11:33 Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> > On 25 May 2022 at 11:22 Stefan Radziuk <sradziuk@pdtpartners.com> wrote:
> > > Will it be worth seeing what effects the following has?
> >
> > This resolves the issue for me -- verified with strace. Thanks.
> >
> > dnl CHECK FOR SYSTEMS REQUIRING GETCWD
> > +dnl This is now turned on by default, as we expect modern getcwd
> > +dnl implementations to work correctly. Any exceptions should be added
> > +dnl to the first case. Currently there are none, hence it is forced
> > +dnl not to match.
>
> Unless anyone has strong views, I'll commit this --- hoping at the most
> we'll be look at edge effects on older or rarer systems which we're not
> going to track down any other way.
Not hearing comments, so this is committed. Workers with such systems
are advised to keep watch for any oddities (very likely edge cases) in
getting the current directory.
pws