From mboxrd@z Thu Jan 1 00:00:00 1970
From: smiley@zenzebra.mv.com (smiley at zenzebra.mv.com)
Date: Sun, 3 Apr 2011 22:30:56 +0000
Subject: [9fans] Making read(1) an rc(1) builtin?
Message-ID: <86fwpz55nj.fsf@cmarib.ramside>
Topicbox-Message-UUID: c7287ff2-ead6-11e9-9d60-3106f5b1d025

I'm in the process of writing some filters in rc(1). One thing that has
come to concern me about rc(1) is that read(1) is not a "builtin"
command. For example, with a loop like:

    while(message=`{read})
        switch($message) {
        case foo
            dofoo
        case bar
            dobar
        case *
            dodefault
        }

Each line that's read by the script causes it to fork a new process,
/bin/read, whose sole purpose is to read a single line and die. That
means at least one fork for each line read and, if your input has many
lines, that means spawning many processes. I wonder if it wouldn't make
sense to move read(1) into rc(1) and make it a "builtin" command. A
wrapper script could then be created, at /bin/read, to call
"rc -c 'eval read $*'" with the appropriate arguments (or sed $n^q,
etc.), for any program that requires an actual /bin/read to exist.

A similar line of thought holds for /bin/test. The string and numeric
tests (-n, -z, =, !=, <, >, -lt, -eq, -ne, etc.) can be very frequently
used, and can lead to spawning unnecessarily many processes. For the
file test parameters (-e, -f, -d, -r, -x, -A, -L, -T, etc.), however,
this argument isn't as strong. Since the file tests have to stat(2) a
path, they already require a call to the underlying file system, and an
additional fork wouldn't be that much more expensive. I could see the
string and numeric tests being moved into rc(1) as a "test" builtin,
with the file tests residing at "/bin/ftest" (note the "f"). The "test"
builtin could scan its arguments and call "ftest" if needed. A wrapper
script at /bin/test could provide compatibility for existing programs
which expect an executable named /bin/test to exist.
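
The "scan its arguments and call ftest if needed" step might look
roughly like the C sketch below (C because that is what rc(1) is written
in). Everything here is an assumption for illustration: the function
name needs_ftest, the flag table (copied from the file tests listed
above), and /bin/ftest itself, which does not exist. It is not taken
from rc's or test's source.

```c
#include <string.h>

/* File-test flags named in the post; a hypothetical table, not rc's. */
static const char *filetests[] = {
	"-e", "-f", "-d", "-r", "-x", "-A", "-L", "-T"
};

/*
 * Return 1 if any argument is a file-test flag, meaning a "test"
 * builtin would have to run the (hypothetical) /bin/ftest.
 * Return 0 if only string/numeric tests appear, so the expression
 * could be evaluated in-process without a fork.
 * argv holds the arguments only, without the command name.
 */
int
needs_ftest(int argc, char *argv[])
{
	int i;
	size_t j;

	for(i = 0; i < argc; i++)
		for(j = 0; j < sizeof filetests / sizeof filetests[0]; j++)
			if(strcmp(argv[i], filetests[j]) == 0)
				return 1;
	return 0;
}
```

A scan this naive also fires on an operand that merely looks like a
flag (for example, comparing a string against "-f"). That errs on the
side of forking, which is only harmless if ftest accepts the whole
expression; that too is an assumption.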

I understand the Unix/Plan 9 philosophy of connecting tools that do one
job and do it well. But I don't think /bin/read and /bin/test are places
where that philosophy is practical (i.e., efficient). After all, reading
input lines really is the prerogative of any program that processes
line-oriented data (like rc(1) does). In addition, /bin/read represents
a simple and fairly stable interface that's not likely to change
appreciably in the future.

Comparison of numeric and string values is also a fairly stable
operation that's not likely to change, and is not likely to be needed
outside of rc(1). Most programming languages (C, awk, etc.) have their
own mechanisms for integer and string comparison. I suspect moving these
operations into rc(1) (with appropriate replacement scripts to ensure
compatibility) could appreciably increase the performance of shell
scripts, with very little cost in modularity or compatibility.

Any thoughts on this?

I'm also a bit stumped by the fact that rc(1) doesn't have anything
analogous to bash(1)'s string parsing operations: ${foo#bar},
${foo##bar}, ${foo%bar}, ${foo%%bar}, or ${foo/bar/baz}. Is there any
way to extract substrings (or single characters) from a string in rc(1)
without having to fork a dd, awk, or sed? I've tried setting ifs='' and
using foo=($"bar), but rc(1) always splits bar on spaces. Perhaps, if
rc(1) used the first character of $ifs to split $"bar, $bar could be
split into individual characters when ifs=''. Then, the characters of
$bar could be addressed without resort to dd and friends.

(As a side note, if anyone goes into rc(1)'s source to implement any of
this, please add a "--" option (or similar) to the "echo" builtin while
you're there. Having to wrap echo in:

    # make 'myecho $foo' work even when $foo starts with '-n'
    fn myecho {
        if(~ $1 --) {
            shift
            if(~ $1 -n) {
                shift
                echo -n -n $*
                echo
            }
            if not
                echo $*
        }
        if not
            echo $*
    }

can be rather inconvenient.)
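
For illustration, the requested "--" handling could be as small as the
C sketch below. The name echo_fmt, its signature, and the choice to
format into a caller-supplied buffer are all assumptions made to keep
the example self-contained; this is not echo's actual source, and a
real change would also have to consider scripts that today rely on
"echo --" printing the two dashes.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format echo's arguments (argv excludes the command name) into buf.
 * A leading "-n" suppresses the trailing newline; a following "--"
 * ends option parsing, so everything after it is printed literally.
 * Returns 0 on success, -1 if buf (of size n) is too small.
 */
int
echo_fmt(char *buf, size_t n, int argc, char *argv[])
{
	int i = 0, newline = 1, first = 1, w;
	size_t len = 0;

	if(n == 0)
		return -1;
	buf[0] = '\0';
	if(i < argc && strcmp(argv[i], "-n") == 0){
		newline = 0;
		i++;
	}
	if(i < argc && strcmp(argv[i], "--") == 0)
		i++;
	for(; i < argc; i++){
		/* separate arguments with a single space */
		w = snprintf(buf + len, n - len, "%s%s", first ? "" : " ", argv[i]);
		if(w < 0 || (size_t)w >= n - len)
			return -1;
		len += (size_t)w;
		first = 0;
	}
	if(newline){
		if(len + 1 >= n)
			return -1;
		buf[len++] = '\n';
		buf[len] = '\0';
	}
	return 0;
}
```

With this, the arguments "--", "-n", "foo" produce the text "-n foo"
followed by a newline, which is what the myecho wrapper above works
around.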

-- 
+---------------------------------------------------------------+
|E-Mail: smiley at zenzebra.mv.com          PGP key ID: BC549F8B|
|Fingerprint: 9329 DB4A 30F5 6EDA D2BA  3489 DAB7 555A BC54 9F8B|
+---------------------------------------------------------------+