* How much of it is zsh?
From: zzapper @ 2010-03-24 10:43 UTC
To: zsh-users

Hi

This is kind of a generic/dumb question. I use zsh on Cygwin.

Cygwin provides egrep. Some of the things grep can do are superseded by, for
instance, zsh's **/*.php recursion, but presumably I could still use egrep's
-R. Am I right in thinking egrep knows nothing about the fact that its shell
is zsh?

Where are the boundaries between a shell and the tools?

--
zzapper
http://zzapper.co.uk/ Technical Tips
* Re: How much of it is zsh?
From: Piotr Kalinowski @ 2010-03-24 11:04 UTC
To: zzapper; +Cc: zsh-users

On 24 March 2010 11:43, zzapper <david@tvis.co.uk> wrote:
> So cygwin provides egrep now some of things grep can do are superseded by for
> instance zsh's **/*.php recursion but presumably I could still use egrep's
> -R. Am I right in thinking egrep knows nothing about the fact that its shell
> is zsh.?!
>
> Where are the boundaries between a shell and the tools

Just surround the respective arguments with apostrophes (''). That will
prevent the shell from doing any expansion on them.

Regards,
Piotr Kalinowski

--
Intelligence is like a river: the deeper it is, the less noise it makes
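
[Editorial illustration, not part of the original message.] The effect of quoting described above can be checked with a small sketch. The helper names `show_args` and `demo_quoting` are hypothetical and were chosen for this example only; the code sticks to POSIX sh features, so it should behave the same way in zsh.

```shell
# show_args (hypothetical helper): print the number of arguments received,
# then each argument on its own line.
show_args() {
  printf '%s\n' "$#" "$@"
}

# demo_quoting (hypothetical helper): compare an unquoted and a quoted pattern
# inside a scratch directory that contains three .php files.
demo_quoting() {
  tmp=$(mktemp -d) || return 1
  (
    cd "$tmp" || exit 1
    : > a.php; : > b.php; : > c.php
    show_args *.php      # unquoted: the shell expands the pattern first
    show_args '*.php'    # quoted: the command receives the pattern itself
  )
  rm -rf "$tmp"
}
```

Calling `demo_quoting` should report three arguments for the unquoted pattern and a single literal `*.php` for the quoted one, which is the behaviour the reply relies on.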
* Re: How much of it is zsh?
From: Nadav Har'El @ 2010-03-24 12:03 UTC
To: zzapper; +Cc: zsh-users

On Wed, Mar 24, 2010, zzapper wrote about "How much of it is zsh?":
> Hi
> This is kind of a generic/dumb question I use zsh on cygwin.
>
> So cygwin provides egrep now some of things grep can do are superseded by for
> instance zsh's **/*.php recursion but presumably I could still use egrep's
> -R. Am I right in thinking egrep knows nothing about the fact that its shell
> is zsh.?!
>
> Where are the boundaries between a shell and the tools

Unlike MS-DOS, where each command had to do globbing (expansion of "*" etc.)
for its own command-line arguments, shells on Unix (and therefore also on
Cygwin) traditionally do this before calling the command. I.e., if the user
types

    egrep something *.php

the shell (in our case, zsh) first does the globbing. E.g., if you have the
files a.php, b.php and c.php, the command is changed by the shell to

    egrep something a.php b.php c.php

and only then is egrep run. egrep doesn't know anything about the reason it
got these 3 filenames, or that they were generated by globbing.

You're right that zsh added the very useful *recursive* globbing syntax that
didn't exist in previous shells. In this case, **/*.php recursively matches
files called *.php. But nothing in the way this works changes from what I
described above: zsh first expands **/*.php into a list of file names, and
then gives this list of file names to egrep.

You're right that the two commands

    egrep -R something dir
    egrep something dir/**/*

basically end up doing the same thing, but I don't see why you should
consider this a problem.

By the way, if you're curious, there's actually a subtle difference between
the way these two work. Like I said, the shell's globbing is always done in
advance. So if dir has a million files under it, this will expand into a
command with a million arguments, which on some systems can be a problem (too
much memory used, or command too long). On the other hand, egrep -R finds the
files recursively one by one, and never needs to hold the whole list of files
in memory.

I hope this answers your question.

Nadav.

--
Nadav Har'El                        | Wednesday, Mar 24 2010, 9 Nisan 5770
nyh@math.technion.ac.il             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |I put a dollar in one of those change
http://nadav.harel.org.il           |machines. Nothing changed.
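
[Editorial illustration, not part of the original message.] The equivalence between letting grep recurse and letting the shell build the file list can be sketched as follows. The function name `demo_recursive_grep` and the scratch tree are assumptions made for this example. Because portable sh has no `**`, the sketch spells out two directory levels by hand, and it uses `grep -E` in place of `egrep`; `-R` is available in GNU and BSD grep but is not guaranteed everywhere.

```shell
# demo_recursive_grep (hypothetical helper): search a two-level scratch tree
# twice, once with grep doing the recursion and once with shell globbing.
demo_recursive_grep() {
  tmp=$(mktemp -d) || return 1
  (
    cd "$tmp" || exit 1
    mkdir -p dir/sub
    echo 'something here' > dir/a.php
    echo 'something else' > dir/sub/b.php
    echo 'no match'       > dir/sub/c.php

    # grep walks the tree itself; the shell passes a single path argument.
    grep -E -R -l something dir | sort

    # The shell expands both patterns first; grep receives three file names.
    grep -E -l something dir/*.php dir/*/*.php | sort
  )
  rm -rf "$tmp"
}
```

Both invocations should list the same two files. In the first case grep discovers them on its own; in the second it only sees names the shell already produced.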
* Re: How much of it is zsh?
From: Stephane Chazelas @ 2010-03-24 19:49 UTC
To: Nadav Har'El; +Cc: zzapper, zsh-users

2010-03-24 14:03:59 +0200, Nadav Har'El:
[...]
> You're right that the two commands
> egrep -R something dir
> egrep something dir/**/*
>
> basically end up doing the same thing, but I don't see why you should
> consider this a problem. By the way, if you're curious, there's actually
> a subtle difference between the way these two work. Like I said, the shell's
> globbing is always done in advance. So if dir has a million files under it,
> this will expand into a command with a million arguments - which on some
> system can be a problem (too much memory used, or command too long).
> On the other hand, egrep -R finds the files recursively one by one, and
> never needs to hold the whole list of files in memory.
[...]

There are a few other differences:

- grep -R (at least the GNU variant, as probably found on Cygwin) will
  follow symbolic links when descending directories (use dir/***/* to
  achieve the same with zsh).

- **/* will omit dot files and will include directories; use **/*(D) to
  include dot files.

- **/* also sorts the list of files, which adds some more overhead but
  produces a more reproducible outcome. Use **/*(oN) to prevent sorting.

So

    egrep -R something dir

would be more like:

    egrep something dir/***/*(DoN)

    grep -E something dir/**/*(.DoN)

would probably be more what you'd want though.

--
Stephane
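
[Editorial illustration, not part of the original message.] The effect of the glob qualifiers mentioned above can be observed by counting matches in a small scratch tree. This is a sketch under stated assumptions: the wrapper name `demo_glob_qualifiers` and the tree layout are invented for the example, and the zsh code is run through `zsh -f -c` so that startup files do not change globbing options. It does nothing useful where zsh is not installed.

```shell
# demo_glob_qualifiers (hypothetical helper): count what three zsh patterns
# match in a tree with two visible files, two dot files and one subdirectory.
demo_glob_qualifiers() {
  command -v zsh >/dev/null 2>&1 || { echo 'zsh not found; skipping' >&2; return 0; }
  tmp=$(mktemp -d) || return 1
  (
    cd "$tmp" || exit 1
    mkdir -p dir/sub
    : > dir/a.php; : > dir/.hidden.php
    : > dir/sub/b.php; : > dir/sub/.c.php
    # plain: dot files skipped, directories included
    # dots:  (D) adds the dot files
    # files: (.) keeps regular files only, (oN) skips sorting
    zsh -f -c '
      plain=( dir/**/* )
      dots=( dir/**/*(D) )
      files=( dir/**/*(.DoN) )
      print -r -- $#plain $#dots $#files
    '
  )
  rm -rf "$tmp"
}
```

With this tree the three counts should come out as 3, 5 and 4: the plain pattern sees `a.php`, `sub` and `sub/b.php`; `(D)` adds the two dot files; `(.)` then drops the directory.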
* Re: How much of it is zsh?
From: zzapper @ 2010-03-24 20:39 UTC
To: zsh-users

Nadav Har'El wrote in
news:20100324120359.GA29984@fermat.math.technion.ac.il:

...
> shell's globbing is always done in advance. So if dir has a million
> files under it, this will expand into a command with a million arguments
> - which on some system can be a problem (too much memory used, or
> command too long). On the other hand, egrep -R finds the files
> recursively one by one, and never needs to hold the whole list of files
> in memory.
>
> I hope this answers your question.
>
> Nadav.
>

Yes Nadav, that answers it perfectly; it was just that I've been using shells
for years w/o ever conceptualising what their role was!

--
zzapper
http://zzapper.co.uk/ Technical Tips
* array element subsetting
From: S. Cowles @ 2010-03-26 9:24 UTC
To: zsh-users

I am trying to figure out the correct syntax for constructing two one-liner
subsetting operations on arrays. I have two objectives: 1) select the nth
character from each array element, and 2) select the nth element within each
array element.

The array these methods operate upon is something simple such as:

    a=(
        "satu two trio"
        "sah funf seis"
        "boundarycase"
        "revert to pattern"
    )

For the first case, the solution I came up with is:

    print -l ${a//#%(#b)(?)*/${match[1]}}

for the first character of each element, or

    print -l ${a//#%(#b)?(#c2)(?(#c1))*/${match[1]}}

for the 3rd character of each element (generalizable to [n,m] elements).

For the second case, doing word splitting on each array element, I came up
with two variations to print out the second word in each element.

    print -l ${a//#%(#b)*[[:IFS:]]##(*)[[:IFS:]]##*/${match[1]}}

    print -l ${a//#%(#b)[[:WORD:]]##[^[:WORD:]]##([[:WORD:]]##)[^[:WORD:]]##*/${match[1]}}

(Though not important for my uses, these both fail with the boundary case
where the array element contains only one word.)

Isn't there a better/cleaner way to accomplish this, especially for the
second objective?

Thanks.
* Re: array element subsetting
From: Bart Schaefer @ 2010-03-26 14:41 UTC
To: S. Cowles, zsh-users

On Mar 26, 2:24am, S. Cowles wrote:
}
} I am trying to figure out the correct syntax for constructing two
} one-liner subsetting operations on arrays. I have two objectives: 1)
} select nth character from each array element, and 2) select nth element
} within each array element.
}
} The array these methods operate upon is something simple such as:
} a=(
} "satu two trio"
} "sah funf seis"
} "boundarycase"
} "revert to pattern"
} )

(1) can be done with the (M) parameter flag and simple head/tail:

    print ${(M)a#?}

To generalize to the Nth element, ${(M)${(M)a#?(#c$N)}%?} (requires
extendedglob, of course).

(2) is more difficult to do without looping, because zsh doesn't support
multidimensional arrays, so you have to force an eval step via the (e) flag:

    print ${(e):-'${${=:-'${^a}'}[2]}'}

However, this yields the second character of arrays that contain only one
word, because ${=...} reduces singular arrays to scalars. A simple workaround
is to insert an empty dummy element at the tail:

    print ${(e):-'${${=:-'${^a}' ""}[2]}'}

In the event there are special characters in the strings in $a, an extra
level of quoting can be added and then removed:

    print ${(e):-'${${=${(Q):-'${(q)^a}' ""}}[2]}'}

However, this removes again the empty element inserted by the double quotes,
i.e., it returns nothing for the short array rather than an empty second
element (use "print -l" in those examples to see the difference more
clearly).

Remove the (e) if you want to see what's going on with the ${(q)^a} business.
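
[Editorial illustration, not part of the original message.] For comparison, here is a sketch that combines the `${(M)a#?}` form from the reply with the explicit loop that the reply mentions only as the alternative to its one-liners. The loop itself does not appear in the thread and is an assumption about what such a loop might look like; the wrapper name `demo_array_subsetting` is also invented. The zsh code runs through `zsh -f -c` so it can be started from any POSIX shell.

```shell
# demo_array_subsetting (hypothetical helper): print the first character of
# every element, then the second word of every element using a plain loop.
demo_array_subsetting() {
  command -v zsh >/dev/null 2>&1 || { echo 'zsh not found; skipping' >&2; return 0; }
  zsh -f -c '
    a=( "satu two trio" "sah funf seis" "boundarycase" "revert to pattern" )
    print -r -- ${(M)a#?}
    for element in $a; do
      words=( ${=element} )
      print -r -- "second word: ${words[2]}"
    done
  '
}
```

Because `words` is assigned with parentheses it stays an array even for the one-word element, so the boundary case should print an empty second word rather than a character of the string.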
* Re: array element subsetting
From: S. Cowles @ 2010-03-26 19:32 UTC
To: zsh-users

On Fri, 26 Mar 2010, Bart Schaefer wrote:
> Date: Fri, 26 Mar 2010 07:41:48 -0700
> From: Bart Schaefer <schaefer@brasslantern.com>
> On Mar 26, 2:24am, S. Cowles wrote:
> } I am trying to figure out the correct syntax for constructing two
> } one-liner subsetting operations on arrays. I have two objectives: 1)
> } select nth character from each array element, and 2) select nth element
> } within each array element.
> }
> } The array these methods operate upon is something simple such as:
> } a=(
> } ...
> } )
> (1) can be done with the (M) parameter flag and simple head/tail:
> print ${(M)a#?}

Simpler and more straightforward than backreferencing. Thank you, Bart.

> (2) is more difficult to do without looping, because zsh doesn't
> support multidimensional arrays, so you have to force an eval step
> via the (e) flag:
> print ${(e):-'${${=:-'${^a}' ""}[2]}'}

I hadn't previously used the parameter expansion (e) or array creation
${=...} methods. The inline array element addition is new to me; I missed it
in Peter's Manual, book, and the zshall man page.

Would it be worth considering adding a new subsection on Array Subsetting to
the ARRAY PARAMETERS section of the man pages, just after the Subscript
Parsing section and just prior to POSITIONAL PARAMETERS?
* Re: array element subsetting
From: Bart Schaefer @ 2010-03-27 4:14 UTC
To: zsh-users

On Mar 26, 12:32pm, S. Cowles wrote:
} Subject: Re: array element subsetting
}
} On Fri, 26 Mar 2010, Bart Schaefer wrote:
}
} > print ${(e):-'${${=:-'${^a}' ""}[2]}'}
}
} I hadn't previously used the parameter expansion (e) or array creation
} ${=...} methods. The inline array element addition is new to me; I
} missed it in Peter's Manual, book, and the zshall man page.

It isn't really "inline array element addition" -- it's just adding a space
and a pair of empty quotes to the end of a string. What turns it into an
array element is the combination of ${(e)...} which expands the ${=:-...}
expression and thereby removes the quotes, and ${=...} which splits on the
space.

The important bit is the ${^a} wedged in the middle, which turns the array of
strings into an array of parameter expressions wrapped around those strings.
This is not a very space-efficient way to emulate a multi-dimensional
indexing, even if it's compact to write.

} Would it be worth considering adding a new subsection on Array
} Subsetting to the ARRAY PARAMETERS section of the man pages [...]?

I'm not sure it's a common enough thing to want to do in shell code to be
enshrined in the manual, but I'll defer that decision to PWS. I'd suggest it
go in the FAQ except I don't recall it ever having been asked before, so the
"frequent" part hardly applies ...

Incidentally, one might wonder why

    print ${(e):-'${${:-'${^a}'}[(w)N]}'}

doesn't work. The manual says:

    w    If the parameter subscripted is a scalar then this flag makes
         subscripting work on words instead of characters. The default
         word separator is whitespace.

The answer is that it does work, as long as the value of N is less than the
number of words in any string. However, the result of
${three_word_string[(w)4]} is the last word in the string, not an empty
element as results with ${${=three_word_string}[4]}.
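
[Editorial illustration, not part of the original message.] The difference between the `(w)` subscript flag and splitting into an array first can be tried with a short sketch. The wrapper name `demo_word_subscript` and the sample string are assumptions for this example. The third printed line exercises the out-of-range `(w)` case described in the message; its output is deliberately not stated here, since it may depend on the zsh version.

```shell
# demo_word_subscript (hypothetical helper): word-wise subscripting on a
# scalar versus indexing an array produced by explicit splitting.
demo_word_subscript() {
  command -v zsh >/dev/null 2>&1 || { echo 'zsh not found; skipping' >&2; return 0; }
  zsh -f -c '
    s="one two three"
    print -r -- ${s[(w)2]}
    words=( ${=s} )
    print -r -- "[${words[4]}]"
    print -r -- "[${s[(w)4]}]"
  '
}
```

The first line should be the second word and the second line an empty pair of brackets, because the split array has no fourth element. The third line can be compared by hand against the behaviour the message reports.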
* Re: How much of it is zsh?
From: Joke de Buhr @ 2010-03-24 13:43 UTC
To: zsh-users

The generated argument list can get very long if you use **/* globbing.
Sometimes the argument list gets longer than the system allows. If that
happens, the -R option can be useful.

List all files under /:

    ls /**/*    # not working: argument list too long
    ls -R /     # working: ls itself does the recursive search

On Wednesday, 24. March 2010 11:43:24 zzapper wrote:
> Hi
> This is kind of a generic/dumb question I use zsh on cygwin.
>
> So cygwin provides egrep now some of things grep can do are superseded by
> for instance zsh's **/*.php recursion but presumably I could still use
> egrep's -R. Am I right in thinking egrep knows nothing about the fact
> that its shell is zsh.?!
>
> Where are the boundaries between a shell and the tools
>
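
[Editorial illustration, not part of the original message.] The limit behind "argument list too long" can usually be inspected with `getconf ARG_MAX`, and the recursive listing can be tried on a small directory instead of /. The helper names `print_arg_limit` and `list_tree` are hypothetical. The sketch assumes a system where `getconf ARG_MAX` reports a number, which is typical on Linux and macOS but not guaranteed everywhere.

```shell
# print_arg_limit (hypothetical helper): show the system limit, in bytes, on
# the combined size of the arguments and environment passed to a new program.
print_arg_limit() {
  getconf ARG_MAX
}

# list_tree (hypothetical helper): let ls do the recursion itself, so the
# shell passes one directory name instead of one argument per file.
list_tree() {
  ls -R -- "$1"
}
```

For instance, `list_tree /usr/share` keeps the command line at a single argument no matter how many files the tree contains, whereas a glob over the same tree grows with the number of matches.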
Thread overview: 10+ messages

2010-03-24 10:43 How much of it is zsh? zzapper
2010-03-24 11:04 ` Piotr Kalinowski
2010-03-24 12:03 ` Nadav Har'El
2010-03-24 19:49   ` Stephane Chazelas
2010-03-24 20:39   ` zzapper
2010-03-26  9:24     ` array element subsetting S. Cowles
2010-03-26 14:41       ` Bart Schaefer
2010-03-26 19:32         ` S. Cowles
2010-03-27  4:14           ` Bart Schaefer
2010-03-24 13:43 ` How much of it is zsh? Joke de Buhr