The shell is fundamentally an interpreter; stuff written in shell can't possibly be as fast as the code that is interpreting it. In general, native tools are going to be more efficient. 

I mean, if you had to write code in your awk program to do the parsing and splitting that it does automatically, that would be slow, too. But you don't, because it's already written for you inside the awk binary, and in compiled C rather than interpreted awk. 
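
To make that concrete, here is a minimal sketch that sums the second whitespace-separated field of every input line two ways: once with a pure-shell read loop, once with awk. The function names are made up for illustration and are not from the thread, and the loop assumes every line has at least two fields with an integer in the second. Both should print the same total; I'd expect the awk version to pull ahead on large inputs, but measure on your own data rather than taking my word for it.

```shell
# Hypothetical helpers for illustration only; neither name comes from the thread.

# Pure shell: the interpreter itself splits each line and does the arithmetic.
# The body runs in a subshell (parentheses) so its variables don't leak.
# Assumes every line has at least two fields and field 2 is an integer.
sum_in_shell() (
    total=0
    while read -r first value rest; do
        total=$((total + value))
    done
    printf '%s\n' "$total"
)

# awk: field splitting and the loop over lines happen inside the awk binary.
sum_in_awk() {
    awk '{ total += $2 } END { print total + 0 }'
}
```

Either one reads standard input, so you can compare them with something like `time sum_in_shell < data.txt` and `time sum_in_awk < data.txt`, where `data.txt` stands in for whatever file you have handy.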

In college I wrote a complete email-based helpdesk/workflow system for my team of sysadmins, and I prided myself on doing so entirely in pure ksh, like the RAND MH reimplementation in the back of Korn's book. I definitely went too far in the direction of avoiding external tools, and have since corrected course. 

As I see it, the shell is best utilized as commander and coordinator rather than actually doing the hands-on nitty-gritty work; that's better delegated to more efficient (and usually more specialized) tools. The shell can absolutely do it, but it won't be the best application of the available resources.
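
A small sketch of what I mean by "coordinator": the shell code below contains no parsing or counting logic of its own, it only wires standard tools together. The function name `top_lines` and its default of three results are my own choices for the example, nothing more.

```shell
# Hypothetical helper: print the N most frequent lines on stdin, most common first.
# The shell only sets up the pipeline; sort, uniq, and head do the actual work.
top_lines() {
    sort | uniq -c | sort -rn | head -n "${1:-3}"
}
```

For instance, `top_lines 5 < access.log` (with `access.log` standing in for any line-oriented file) would show the five most repeated lines, each prefixed by its count.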

_
Mark J. Reed <markjreed@gmail.com>


On Wed, May 22, 2024 at 20:30 Ray Andrews <rayandrews@eastlink.ca> wrote:


> On 2024-05-22 17:04, Lawrence Velázquez wrote:
>> Yes, it's quite common -- borderline universal -- to start with sed
>> and pick up awk later.
>
> Good to know.  Sometimes I think I get everything wrong.
>
>> You take this way too far, in my opinion.  Compared to appropriate
>> external tools, zsh-heavy solutions often perform poorly and are
>> more difficult to understand and maintain.  (Compare your original
>> code to the awk solutions Mark and Roman offered.)  Most of your
>> code would improve if you (judiciously) used more external utilities
>> and less zsh.
>
> It seems 'obvious' that internal code would be faster, but I know from Roman's various tests over the years that it ain't necessarily so.  Besides, at the concept level, shells are intended as glue between system commands, and all their internal abilities are add-ons.  I suppose that when, as you say, one replaces a multi-line internal construction with a single line that calls an external program, the mere fact of having many lines to interpret carries a penalty that you'd not notice in a compiled program.