zsh-users
* How much of it is zsh?
@ 2010-03-24 10:43 zzapper
  2010-03-24 11:04 ` Piotr Kalinowski
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: zzapper @ 2010-03-24 10:43 UTC
  To: zsh-users

Hi
This is kind of a generic/dumb question. I use zsh on cygwin.

Cygwin provides egrep. Some of the things grep can do are superseded by,
for instance, zsh's **/*.php recursion, but presumably I could still use
egrep's -R. Am I right in thinking egrep knows nothing about the fact that
its shell is zsh?

Where are the boundaries between a shell and the tools?

-- 
zzapper
http://zzapper.co.uk/ Technical Tips



* Re: How much of it is zsh?
  2010-03-24 10:43 How much of it is zsh? zzapper
@ 2010-03-24 11:04 ` Piotr Kalinowski
  2010-03-24 12:03 ` Nadav Har'El
  2010-03-24 13:43 ` How much of it is zsh? Joke de Buhr
  2 siblings, 0 replies; 10+ messages in thread
From: Piotr Kalinowski @ 2010-03-24 11:04 UTC
  To: zzapper; +Cc: zsh-users

On 24 March 2010 11:43, zzapper <david@tvis.co.uk> wrote:
> So cygwin provides egrep now some of things grep can do are superceded by for
> instance zsh's **/*.php recursion  but presumably I could still use egrep's -
> R. Am I right in thinking egrep knows nothing about the fact that its shell
> is  zsh.?!
>
> Where are the boundaries between a shell and the tools

Just surround the respective arguments with single quotes (''). That will
prevent the shell from doing any expansion on them.
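A minimal sketch of why the quotes matter, assuming a tool that does its own pattern matching. It uses find (not mentioned in this thread) as that tool; the temporary directory and the function names are made up for illustration.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/top.php" "$demo/sub/deep.php" "$demo/notes.txt"

# Quoted: find receives the literal string *.php and matches names itself,
# in every directory it visits.
list_php_quoted() {
    (cd "$1" && find . -name '*.php' | sort)
}

# Unquoted (deliberately wrong): the shell expands *.php against the current
# directory first, so find is really run as: find . -name top.php
list_php_unquoted() {
    (cd "$1" && find . -name *.php | sort)
}

list_php_quoted "$demo"     # lists both .php files
list_php_unquoted "$demo"   # lists only ./top.php
```

The unquoted variant silently misses sub/deep.php because find never sees the wildcard at all.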

Regards,
Piotr Kalinowski
-- 
Intelligence is like a river: the deeper it is, the less noise it makes



* Re: How much of it is zsh?
  2010-03-24 10:43 How much of it is zsh? zzapper
  2010-03-24 11:04 ` Piotr Kalinowski
@ 2010-03-24 12:03 ` Nadav Har'El
  2010-03-24 19:49   ` Stephane Chazelas
  2010-03-24 20:39   ` zzapper
  2010-03-24 13:43 ` How much of it is zsh? Joke de Buhr
  2 siblings, 2 replies; 10+ messages in thread
From: Nadav Har'El @ 2010-03-24 12:03 UTC
  To: zzapper; +Cc: zsh-users

On Wed, Mar 24, 2010, zzapper wrote about "How much of it is zsh?":
> Hi
> This is kind of a generic/dumb question I use zsh on cygwin.
> 
> So cygwin provides egrep now some of things grep can do are superceded by for 
> instance zsh's **/*.php recursion  but presumably I could still use egrep's -
> R. Am I right in thinking egrep knows nothing about the fact that its shell 
> is  zsh.?!
> 
> Where are the boundaries between a shell and the tools

Unlike MS-DOS, where each command had to do its own globbing (expansion of
"*" etc.) on its command-line arguments, shells on Unix (and therefore also
cygwin) traditionally do this before calling the command. I.e., if the user types

	egrep something *.php

The shell (in our case, zsh) first does globbing. E.g., if you have the
files a.php, b.php and c.php, the command is changed by the shell to

	egrep something a.php b.php c.php

and only then is egrep run. egrep doesn't know anything about the reason
it got these 3 filenames, or that they were generated by globbing.
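The expansion described above can be observed directly. This is only a sketch: show_args is a hypothetical stand-in for egrep that reports the argument list it was handed, and the temporary directory is created just for the demonstration.

```shell
# show_args is a hypothetical stand-in for an external tool such as egrep:
# it only reports the argument list it was actually handed.
show_args() {
    printf '%d args:' "$#"
    for arg in "$@"; do
        printf ' [%s]' "$arg"
    done
    printf '\n'
}

work=$(mktemp -d)
touch "$work/a.php" "$work/b.php" "$work/c.php"

# The subshell keeps the cd from leaking into the rest of the session.
(cd "$work" && show_args something *.php)
# prints: 4 args: [something] [a.php] [b.php] [c.php]
```

The function never sees the "*"; by the time it runs, the pattern has already been replaced by the three file names.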

You're right that zsh added the very useful *recursive* globbing syntax
that didn't exist in previous shells. In this case, **/*.php recursively
matches files named *.php. But nothing in the way this works changes
from what I described above - i.e., zsh first expands **/*.php into a list
of file names, and then gives this list of filenames to egrep.

You're right that the two commands
	egrep -R something dir
	egrep something dir/**/*

basically end up doing the same thing, but I don't see why you should
consider this a problem. By the way, if you're curious, there's actually
a subtle difference between the way these two work. Like I said, the shell's
globbing is always done in advance. So if dir has a million files under it,
this will expand into a command with a million arguments, which on some
systems can be a problem (too much memory used, or command too long).
On the other hand, egrep -R finds the files recursively one by one, and
never needs to hold the whole list of files in memory.
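One way to get the same "never build one huge command line" behaviour with a tool that has no -R option is to let find do the walking. This is a sketch of a different technique from the two commands above, not something proposed in this thread; search_tree and the sample files are made up for illustration.

```shell
tree=$(mktemp -d)
mkdir -p "$tree/x/y"
printf 'needle\n' > "$tree/x/y/hit.php"
printf 'hay\n'    > "$tree/miss.php"

# search_tree PATTERN DIR: list the files under DIR that contain PATTERN.
# find walks the tree itself, and "-exec ... {} +" hands grep the file names
# in batches that fit the system limit, so no oversized command line is built.
search_tree() {
    find "$2" -type f -exec grep -E -l -- "$1" {} + | sort
}

search_tree needle "$tree"
```

The sort is only there to make the output order predictable; find itself returns files in directory order.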

I hope this answers your question.

Nadav.

-- 
Nadav Har'El                        |     Wednesday, Mar 24 2010, 9 Nisan 5770
nyh@math.technion.ac.il             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |I put a dollar in one of those change
http://nadav.harel.org.il           |machines. Nothing changed.



* Re: How much of it is zsh?
  2010-03-24 10:43 How much of it is zsh? zzapper
  2010-03-24 11:04 ` Piotr Kalinowski
  2010-03-24 12:03 ` Nadav Har'El
@ 2010-03-24 13:43 ` Joke de Buhr
  2 siblings, 0 replies; 10+ messages in thread
From: Joke de Buhr @ 2010-03-24 13:43 UTC
  To: zsh-users


The generated argument list can get very long if you use **/* globbing.
Sometimes the argument list gets longer than the system allows. If that
happens, the -R option can be useful.

List all files under /:

ls /**/*       # fails: argument list too long
ls -R /        # works: ls itself does the recursive search
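To see where that limit sits on a given machine, getconf can be asked for it. A small sketch; the variable name arg_max is just for illustration.

```shell
# ARG_MAX is the limit (in bytes) on the arguments plus environment
# that can be passed to a newly executed program.
arg_max=$(getconf ARG_MAX)
printf 'ARG_MAX on this system: %s bytes\n' "$arg_max"
```

The limit applies when an external program such as ls is executed. Shell builtins and a for loop over the glob do not go through that step, so they are not affected by it.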

On Wednesday, 24. March 2010 11:43:24 zzapper wrote:
> Hi
> This is kind of a generic/dumb question I use zsh on cygwin.
> 
> So cygwin provides egrep now some of things grep can do are superceded by
>  for instance zsh's **/*.php recursion  but presumably I could still use
>  egrep's - R. Am I right in thinking egrep knows nothing about the fact
>  that its shell is  zsh.?!
> 
> Where are the boundaries between a shell and the tools
> 



* Re: How much of it is zsh?
  2010-03-24 12:03 ` Nadav Har'El
@ 2010-03-24 19:49   ` Stephane Chazelas
  2010-03-24 20:39   ` zzapper
  1 sibling, 0 replies; 10+ messages in thread
From: Stephane Chazelas @ 2010-03-24 19:49 UTC
  To: Nadav Har'El; +Cc: zzapper, zsh-users

2010-03-24 14:03:59 +0200, Nadav Har'El:
[...]
> You're right that the two commands
> 	egrep -R something dir
> 	egrep something dir/**/*
> 
> basically end up doing the same thing, but I don't see why you should
> consider this a problem. By the way, if you're curious, there's actually
> a subtle difference between the way these two work. Like I said, the shell's
> globbing is always done in advance. So if dir has a million files under it,
> this will expand into a command with a million arguments - which on some
> system can be a problem (too much memory used, or command too long).
> On the other hand, egrep -R finds the files recursively one by one, and
> never needs to hold the whole list of files in memory.
[...]

There are a few other differences:
 - grep -R (at least the GNU variant as probably found on
 cygwin) will follow symbolic links when descending directories
 (use dir/***/* to achieve the same with zsh).
 - **/* will omit dot files and dot dirs; use **/*(D) to avoid
 that.
 - **/* also sorts the list of files which adds some more overhead but
 produces a more reproducible outcome. Use **/*(oN) to prevent
 sorting.


So

	egrep -R something dir

would be more like:

	egrep something dir/***/*(DoN)

	grep -E something dir/**/*(.DoN)

would probably be more what you'd want though.
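The dot-file difference is easy to check with a plain "*" glob, which skips hidden names in the same way by default. A sketch only; find stands in here for a tool that walks the directory itself, and the file and function names are made up.

```shell
d=$(mktemp -d)
touch "$d/visible.php" "$d/.hidden.php"

# count_glob DIR: how many names the shell's * expands to in DIR.
count_glob() {
    set -- "$1"/*
    echo "$#"
}

# count_find DIR: how many regular files a directory-walking tool sees.
count_find() {
    find "$1" -type f | wc -l | tr -d ' '
}

count_glob "$d"   # 1: the glob skips .hidden.php
count_find "$d"   # 2: find reports both files
```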

-- 
Stephane



* Re: How much of it is zsh?
  2010-03-24 12:03 ` Nadav Har'El
  2010-03-24 19:49   ` Stephane Chazelas
@ 2010-03-24 20:39   ` zzapper
  2010-03-26  9:24     ` array element subsetting S. Cowles
  1 sibling, 1 reply; 10+ messages in thread
From: zzapper @ 2010-03-24 20:39 UTC
  To: zsh-users

Nadav Har'El wrote in
news:20100324120359.GA29984@fermat.math.technion.ac.il: 
...
> shell's globbing is always done in advance. So if dir has a million
> files under it, this will expand into a command with a million arguments
> - which on some system can be a problem (too much memory used, or
> command too long). On the other hand, egrep -R finds the files
> recursively one by one, and never needs to hold the whole list of files
> in memory. 
> 
> I hope this answers your question.
> 
> Nadav.
> 

Yes Nadav, that answers it perfectly; it was just that I've been using shells
for years without ever conceptualising what their role was!


-- 
zzapper
http://zzapper.co.uk/ Technical Tips



* array element subsetting
  2010-03-24 20:39   ` zzapper
@ 2010-03-26  9:24     ` S. Cowles
  2010-03-26 14:41       ` Bart Schaefer
  0 siblings, 1 reply; 10+ messages in thread
From: S. Cowles @ 2010-03-26  9:24 UTC
  To: zsh-users


I am trying to figure out the correct syntax for constructing two 
one-liner subsetting operations on arrays.  I have two objectives: 1) 
select the nth character from each array element, and 2) select the nth 
word within each array element.

The array these methods operate upon is something simple such as:
a=(
     "satu two trio"
     "sah funf seis"
     "boundarycase"
     "revert to pattern"
)

For the first case, the solution I came up with is:

print -l ${a//#%(#b)(?)*/${match[1]}}

for the first character of each element, or

print -l ${a//#%(#b)?(#c2)(?(#c1))*/${match[1]}}

for the 3rd character of each element (generalizable to a range [n,m] of characters).

For the second case, doing word splitting on each array element, I came up 
with two variations to print out the second word in each element.

print -l ${a//#%(#b)*[[:IFS:]]##(*)[[:IFS:]]##*/${match[1]}}

print -l ${a//#%(#b)[[:WORD:]]##[^[:WORD:]]##([[:WORD:]]##)[^[:WORD:]]##*/${match[1]}}

(Though not important for my uses, these both fail with the boundary case 
where the array element contains only one word.)

Isn't there a better/cleaner way to accomplish this, especially for the 
second objective?

Thanks.



* Re: array element subsetting
  2010-03-26  9:24     ` array element subsetting S. Cowles
@ 2010-03-26 14:41       ` Bart Schaefer
  2010-03-26 19:32         ` S. Cowles
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Schaefer @ 2010-03-26 14:41 UTC
  To: S. Cowles, zsh-users

On Mar 26,  2:24am, S. Cowles wrote:
} 
} I am trying to figure out the correct syntax for constructing two 
} one-liner subsetting operations on arrays.  I have two objectives: 1) 
} select nth character from each array element, and 2) select nth element 
} within each array element.
} 
} The array these methods operate upon is something simple such as:
} a=(
}      "satu two trio"
}      "sah funf seis"
}      "boundarycase"
}      "revert to pattern"
} )

(1) can be done with the (M) parameter flag and simple head/tail:

	print ${(M)a#?}

To generalize to the Nth character, use ${(M)${(M)a#?(#c$N)}%?} (this
requires extendedglob, of course).
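For comparison, here is the same selection written as an explicit loop that hands each element to cut. This is a more verbose, tool-based alternative and not the one-liner shown above; nth_char is a hypothetical helper name.

```shell
# nth_char N ELEMENT...: print the Nth character (1-based) of each ELEMENT.
# Note: n and elem are ordinary global variables in plain sh.
nth_char() {
    n=$1
    shift
    for elem in "$@"; do
        printf '%s\n' "$elem" | cut -c "$n"
    done
}

nth_char 3 "satu two trio" "sah funf seis" "boundarycase" "revert to pattern"
```

With the array from the question, the elements could be passed as "${a[@]}" instead of being typed out. The loop starts one cut process per element, which is the price of leaving the work to a tool rather than to parameter expansion.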

(2) is more difficult to do without looping, because zsh doesn't
support multidimensional arrays, so you have to force an eval step
via the (e) flag:

	print ${(e):-'${${=:-'${^a}'}[2]}'}

However, this yields the second character of arrays that contain
only one word, because ${=...} reduces singular arrays to scalars.
A simple workaround is to insert an empty dummy element at the tail:

	print ${(e):-'${${=:-'${^a}' ""}[2]}'}

In the event there are special characters in the strings in $a, an
extra level of quoting can be added and then removed:

	print ${(e):-'${${=${(Q):-'${(q)^a}' ""}}[2]}'}

However, this removes again the empty element inserted by the double
quotes, i.e., it returns nothing for the short array rather than an
empty second element (use "print -l" in those examples to see the
difference more clearly).  Remove the (e) if you want to see what's
going on with the ${(q)^a} business.
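If leaving the shell is acceptable, the word selection can also be delegated to awk, which sidesteps the nested expansion entirely. This is a sketch of a different technique from the ${(e)...} forms above; nth_word is a hypothetical helper, and it assumes no element contains a newline.

```shell
# nth_word N ELEMENT...: print the Nth whitespace-separated word of each
# ELEMENT, or an empty line when an element has fewer than N words.
nth_word() {
    n=$1
    shift
    printf '%s\n' "$@" | awk -v n="$n" '{ print $n }'
}

nth_word 2 "satu two trio" "sah funf seis" "boundarycase" "revert to pattern"
```

For the one-word element this prints an empty line rather than a character of the word, which matches the behaviour of the variant with the dummy element.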



* Re: array element subsetting
  2010-03-26 14:41       ` Bart Schaefer
@ 2010-03-26 19:32         ` S. Cowles
  2010-03-27  4:14           ` Bart Schaefer
  0 siblings, 1 reply; 10+ messages in thread
From: S. Cowles @ 2010-03-26 19:32 UTC
  To: zsh-users

On Fri, 26 Mar 2010, Bart Schaefer wrote:

> Date: Fri, 26 Mar 2010 07:41:48 -0700
> From: Bart Schaefer <schaefer@brasslantern.com>
> On Mar 26,  2:24am, S. Cowles wrote:
> } I am trying to figure out the correct syntax for constructing two
> } one-liner subsetting operations on arrays.  I have two objectives: 1)
> } select nth character from each array element, and 2) select nth element
> } within each array element.
> }
> } The array these methods operate upon is something simple such as:
> } a=(
> } ...
> } )
> (1) can be done with the (M) parameter flag and simple head/tail:
> 	print ${(M)a#?}

Simpler and more straightforward than backreferencing.  Thank you, Bart.

> (2) is more difficult to do without looping, because zsh doesn't
> support multidimensional arrays, so you have to force an eval step
> via the (e) flag:
> 	print ${(e):-'${${=:-'${^a}' ""}[2]}'}

I hadn't previously used the parameter expansion (e) or array creation 
${=...} methods.  The inline array element addition is new to me; I missed 
it in Peter's Manual, book, and the zshall man page.

Would it be worth considering adding a new subsection on Array Subsetting 
to the ARRAY PARAMETERS section of the man pages, just after the Subscript 
Parsing section and just prior to POSITIONAL PARAMETERS?



* Re: array element subsetting
  2010-03-26 19:32         ` S. Cowles
@ 2010-03-27  4:14           ` Bart Schaefer
  0 siblings, 0 replies; 10+ messages in thread
From: Bart Schaefer @ 2010-03-27  4:14 UTC
  To: zsh-users

On Mar 26, 12:32pm, S. Cowles wrote:
} Subject: Re: array element subsetting
}
} On Fri, 26 Mar 2010, Bart Schaefer wrote:
} 
} > 	print ${(e):-'${${=:-'${^a}' ""}[2]}'}
} 
} I hadn't previously used the parameter expansion (e) or array creation
} ${=...} methods. The inline array element addition is new to me; I
} missed it in Peter's Manual, book, and the zshall man page.

It isn't really "inline array element addition" -- it's just adding a
space and a pair of empty quotes to the end of a string.  What turns
it into an array element is the combination of ${(e)...} which expands
the ${=:-...} expression and thereby removes the quotes, and ${=...}
which splits on the space.

The important bit is the ${^a} wedged in the middle, which turns the
array of strings into an array of parameter expressions wrapped around
those strings.  This is not a very space-efficient way to emulate a
multi-dimensional indexing, even if it's compact to write.
 
} Would it be worth considering adding a new subsection on Array
} Subsetting to the ARRAY PARAMETERS section of the man pages [...]?

I'm not sure it's a common enough thing to want to do in shell code
to be enshrined in the manual, but I'll defer that decision to PWS.
I'd suggest it go in the FAQ except I don't recall it ever having
been asked before, so the "frequent" part hardly applies ...

Incidentally, one might wonder why

	print ${(e):-'${${:-'${^a}'}[(w)N]}'}

doesn't work.  The manual says:

  w
     If the parameter subscripted is a scalar then this flag makes
     subscripting work on words instead of characters.  The default
     word separator is whitespace.

The answer is that it does work, as long as the value of N is less
than the number of words in any string.  However, the result of
${three_word_string[(w)4]} is the last word in the string, not an
empty element as results with ${${=three_word_string}[4]}.

