> n=($^fpath(e^'n=($REPLY/*(N.)); reply=("$#n $REPLY")'^))
> print -l ${${(On)n}[1,3]}

And this continues to demonstrate that Zsh is the only language where the more I learn, the less readable my code becomes. I really do appreciate you demonstrating the most-Zsh way to achieve the desired result.

There really ought to be an explainshell.com equivalent for Zsh expressions / expansions / modifiers / etc. The information is already nicely codified in the Zsh completion functions (e.g. for `${(`); it would be nice to feed in expressions like the above and get a sane explanation. I actually intend to use this goal for a baby's-first-Rust project, so we'll see how far along I get. The MVP of the project is to take a glob / path expansion expression (e.g. foo/**/bar(^/.)) and convert it into a BSD find(1) expression.

>> for d in $fpath; do n=$(ls $d/* | wc -l); echo "$n $d"; done | sort -nr | head -3
>
> Good heavens, so many processes and pipes.

Pipes are nicely composable, and maintainable by others not intimately familiar with chapter 14 (Expansion) of the Zsh manual. The Unix philosophy still applies -- do one thing and do it well. Shells are good at connecting inputs and outputs and modifying them. Sure, $#fpath + 4 processes is heavy versus an array construct and a builtin -- but I don't think anybody is writing shell scripts for performance.

This is more apparent when using e.g. the Rust tool fd via `fd --type f .` instead of recursive globbing with **/*(.): 8.2 seconds vs. 328 seconds (sample size is 1,641,649 files). The only trade-off is that fd does not guarantee any ordering. Skipping fd's internal sorting flags and piping directly to sort(1) gives a runtime of 30 seconds -- still ten times faster than the glob (a factor close to the number of CPU cores I have).
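For concreteness, the unordered-walk-plus-sort(1) shape I mean looks like the sketch below. It uses find(1) as a portable stand-in for fd, and a throwaway scratch tree -- the paths are invented for the demo:

```shell
# Demo of "let the walker emit files in arbitrary order, then restore a
# deterministic order with sort(1)". find(1) stands in for fd here so the
# snippet runs anywhere; with fd it would be:  fd --type f . | sort
tmp=$(mktemp -d)                       # hypothetical scratch tree
mkdir -p "$tmp/x/y"
touch "$tmp/b" "$tmp/x/a" "$tmp/x/y/c"
listing=$(cd "$tmp" && find . -type f | sort)
printf '%s\n' "$listing"
# -> ./b
# -> ./x/a
# -> ./x/y/c
```

Whatever order the directory walk happens to produce, the external sort makes the final listing deterministic.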
$ hyperfine --runs 2 "zsh -il -c 'echo **/*(.)'" "zsh -il -c 'fd --type f .'"
Benchmark 1: zsh -il -c 'echo **/*(.)'
  Time (mean ± σ):     328.973 s ±  2.153 s    [User: 198.746 s, System: 86.629 s]
  Range (min … max):   327.451 s … 330.496 s    2 runs

Benchmark 2: zsh -il -c 'fd --type f .'
  Time (mean ± σ):      8.281 s ±  0.703 s    [User: 17.441 s, System: 47.829 s]
  Range (min … max):    7.784 s …  8.778 s    2 runs

Shells ultimately exist to spawn processes and create pipes. I'd wager that (A) below is more maintainable than (B).

A. | sort | head -3
B. ${${(On)n}[1,3]}

If there's a sufficient performance benefit to in-shell-process computation, I would love to see some standard-library expansion of Zsh that reimplements common GNU/BSD utilities as functions.

*Zach Riggle*

On Mon, Nov 29, 2021 at 10:12 PM Bart Schaefer wrote:

> On Mon, Nov 29, 2021 at 6:30 PM Zach Riggle wrote:
> >
> > I would expect that the md5sum of a file is reasonably fast, and could
> > be stored in the .zwc for sanity checking, instead of just the "newer
> > than" check.
>
> To what are you comparing that checksum? It could tell you if the
> .zwc file were corrupted, but not whether the file differs from all
> the component files that were compiled into it. Even if you could
> somehow tell they were different, that doesn't answer the question of
> whether the .zwc contains newer versions of any of those functions.
> The .zwc does contain a check that it matches the parser version of
> the shell that's trying to read it.
>
> > I expect that I have more $fpath entries than usual, but the total
> > number of autoloadable functions is much more.
>
> That's exactly the point: You're unlikely to ever execute most of
> those functions, so storing an autoload entry for them is much more
> space-efficient (and startup-time faster) than actually parsing and
> storing the function definitions themselves.
> > > $ for d in $fpath; do n=$(ls $d/* | wc -l); echo "$n $d"; done | sort -nr | head -3
> >
> > Good heavens, so many processes and pipes.
>
> n=($^fpath(e^'n=($REPLY/*(N.)); reply=("$#n $REPLY")'^))
> print -l ${${(On)n}[1,3]}
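P.S. To spell out variant (A) for anyone landing on this thread later, here is the pipeline flavour of the same "top three directories by file count", written portably as a sketch. The directory names are invented for the demo, and find(1) replaces the ls | wc pair so that subdirectories aren't miscounted:

```shell
# Variant (A): plain pipeline -- count files per directory, sort numerically
# descending, keep the top three. The demo directories are hypothetical.
tmp=$(mktemp -d)
mkdir -p "$tmp/big" "$tmp/small"
touch "$tmp/big/a" "$tmp/big/b" "$tmp/big/c" "$tmp/small/z"
top3=$(for d in "$tmp/big" "$tmp/small"; do
  printf '%s %s\n' "$(find "$d" -maxdepth 1 -type f | wc -l | tr -d ' ')" "$d"
done | sort -nr | head -3)
printf '%s\n' "$top3"
# prints "3 $tmp/big" followed by "1 $tmp/small"
```

Every stage here is an ordinary utility that any shell user can inspect in isolation, which is the maintainability argument in a nutshell.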