* Shell argument splitting behaviour
@ 2008-10-01 13:02 Peter Stephenson
2008-10-03 13:56 ` Peter Stephenson
0 siblings, 1 reply; 2+ messages in thread
From: Peter Stephenson @ 2008-10-01 13:02 UTC (permalink / raw)
To: Zsh hackers list
I discovered this inconvenience in the parameter splitting flag (z), which
splits words in a similar way to how command line arguments are handled.

  foo="(one) (two) (three)"
  print -l ${(z)foo}

prints

  (
  one
  )
  (two)
  (three)

That's because the command word in the line is treated differently; in
this case, it looks like the start of a subshell. I wasn't expecting it
when splitting a string, because it's just an arbitrary set of words,
and my first reaction was to change it (which is easy enough), but I
suppose you can think of it as a feature. The same feature occurs when
the line editor splits arguments: in insert-last-word and
copy-prev-shell-word. In those cases the current behaviour is right,
although it'll only rarely make a difference.

I thought I'd mention it in case anyone else had any reactions.

What I was trying to do was use this to get lisp-like lists of
arguments (since after the first word parentheses have to be balanced),
but I can get that to work just by putting a dummy word in front, so
it's actually not a major concern.
--
Peter Stephenson <pws@csr.com> Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK Tel: +44 (0)1223 692070
* Re: Shell argument splitting behaviour
2008-10-01 13:02 Shell argument splitting behaviour Peter Stephenson
@ 2008-10-03 13:56 ` Peter Stephenson
0 siblings, 0 replies; 2+ messages in thread
From: Peter Stephenson @ 2008-10-03 13:56 UTC (permalink / raw)
Cc: Zsh hackers list
[-- Attachment #1: Type: text/plain, Size: 906 bytes --]
On Wed, 01 Oct 2008 14:02:02 +0100
Peter Stephenson <pws@csr.com> wrote:
> What I was trying to do was use this to get lisp-like lists of
> arguments (since after the first word parentheses have to be balanced),
> but I can get that to work just by putting a dummy word in front, so
> it's actually not a major concern.
In case there's any interest, here's what I came up with for my own use.

The list_word function handles list-style trees showing what can follow
what. It's essentially yet another way of doing argument handling,
optimised for a different case. The _dynamic_directory_name function,
again for my own use, shows roughly how to use this; it completes
colon-separated parts as in ~[p1:u:main].
[-- Attachment #2: list_word --]
[-- Type: application/octet-stream, Size: 7970 bytes --]
#autoload
#
# This function can be used to retrieve the list of words that are allowed
# at each point in an ordered list of words where the next word depends on
# the previous one, given a lisp-style input and the words so far. For
# example, if the first argument may be "one" "two" or "three", but "one"
# is always followed by the second argument "eins", and likewise "two" by
# "zwei" and "three" by "drei", this function will list the possibilities
# at each step. (See the second and third of the "Examples" below for how
# this particular example works.)
#
# Input:
# - a lisp-like tree structure as a single string. The outermost
# list, which includes the entire tree, contains the specification of a
# set of words. The specification for each word is also a list that
# may consist of one or two elements. The first element is the word
# itself; the optional second element is a recursive list in the same
# format as the outermost list that specifies words that may follow.
#
# - The main tree structure may optionally be preceded by a number of named
# trees, in the form "name-:tree", where tree has the identical form
# to the main tree (so is surrounded by parentheses). These
# may appear whenever a subtree, i.e. list describing following words,
# may appear in the code, for example the single argument
# ((one ((eins))) (two ((zwei) (dos) (duo))))
# and the two arguments
# twoarg-:((zwei) (dos) (duo))
# ((one ((eins))) (two -:twoarg))
# are equivalent. Recursive use of named subtrees is possible, so
# A-:((A -:B))
# B-:((B -:A))
# ((A -:B))
# describes any number of alternating words A B A ...
#
# - words matched so far as separate arguments.
#
# Output:
# sets reply to the list of possible values that can come next and returns
# zero. If there is no match, or too many arguments, returns 1.
#
# Returns status 2 if
# - an argument before the first surrounded by parentheses did not
# fit the form *'-:('*')'
# - no argument matched the form '('*')'
# - a list describing a single word and (optionally) its following word
# had more than two elements
# - if such a list had the form (word -:name), no predefined sublist
# for name existed.
#
# Further notes on list format:
# Ordinary shell quoting may be applied to individual elements and will
# be stripped for comparisons and in the returned array. No shell
# expansion is performed. If in doubt, characters should be quoted
# since the list is parsed by shell word splitting in which certain
# characters (such as "<" and ">") are processed as separate words
# when unquoted. Quoting also escapes other active forms, including
# "-:" described below. Note that the quotes here are additional
# to any quotes needed to protect the argument to the function from
# immediate shell expansion.
#
# Words matched so far (i.e. arguments to the function after the
# top-level list) are not subject to further quote processing.
#
# The named trees may be used as part of lists of words (as well
# as in the second element of word specifications). In this case,
# one level of parentheses will be removed and the result used
# as if it were a list of word specifications. Hence the arguments:
# extra-:((more1) (more2) (more3))
# ((some1) (some2) (some3) -:extra)
# behave as if all six words were given as top-level possibilities.
# As an example to distinguish the two uses, the following:
# extra-:((more1) (more2) (more3))
# ((some1 -:extra) (some2) (some3) -:extra)
# has the same effect at the top level, but the first word "some1"
# can also be followed by the three words defined by -:extra.
#
# Examples:
# The simplest case:
# "((one) (two) (three))"
# sets reply to the three elements one, two, three.
#
# "((one ((eins))) (two ((zwei))) (three ((drei))))"
# sets reply to the same three elements
#
# "((one ((eins))) (two ((zwei))) (three ((drei))))" two
# sets reply to the single element zwei
#
# "((one ((eins))) (two ((zwei) (dos) (duo))) (three ((drei))))" two
# sets reply to the array consisting of zwei, dos, duo.
#
# Notes:
# Note that parentheses have two different tasks, hence the proliferating
# levels. The outermost parentheses and alternate levels going inward
# enclose lists of possible values at a particular depth, and there can
# be as many elements as necessary within each level. The
# second-from-outermost parentheses and alternate levels describe
# a single argument at the current level, together with an optional
# specification for those that may follow.
#
# The innermost level(s) of parentheses around a single argument
# may be missed out; however, this makes it more confusing when
# attempting to add new levels.
#
# Unquoted parentheses and whitespace are always significant; use quotes
# where necessary.
#
# The code is not tolerant to errors in parentheses. Use named
# subtrees to clarify structure.
list_word_expand_tree() {
  # work around the fact that "(" is a keyword if it appears first
  local param=$1 tmptree=": $2"
  set -A $param ${(z)tmptree}
  shift $param
}
list_word() {
  emulate -L zsh
  setopt extendedglob
  local -a match mbegin mend
  typeset -A sublists
  while [[ $1 = (#b)(*)'-:'(\(*\)) ]]; do
    sublists[$match[1]]=$match[2]
    shift
  done
  [[ $# -gt 0 && $1 = \(*\) ]] || return 2
  local tree=$1 elt nexttree
  local -a atree subtree substs
  local -A seen
  shift
  while [[ $# -gt 0 && ${#tree} -ne 0 ]]; do
    if [[ $tree = \(*\) ]]; then
      # at this level this must be a single argument
      tree=$tree[2,-2]
    fi
    list_word_expand_tree atree $tree
    # loop over additional substitutions
    # marking ones we've done in seen
    seen=()
    # discard substitutions left pending by a match at the previous level
    substs=()
    while true; do
      for elt in $atree; do
        if [[ $elt = \(*\) ]]; then
          elt=$elt[2,-2]
          list_word_expand_tree subtree $elt
          elt=${(Q)subtree[1]}
          if (( ${#subtree} > 2 )); then
            return 2
          fi
          if [[ $subtree[2] = "-:"* ]]
          then
            nexttree=$sublists[${subtree[2][3,-1]}]
            [[ -z $nexttree ]] && return 2
          else
            nexttree=$subtree[2]
          fi
        elif [[ $elt = "-:"* ]]; then
          elt=${(Q)elt[3,-1]}
          if [[ -z ${seen[$elt]} ]]; then
            [[ -z $sublists[$elt] ]] && return 2
            seen[$elt]=1
            substs+=($elt)
            continue
          fi
        else
          elt=${(Q)elt}
          nexttree=
        fi
        if [[ $elt = $1 ]]; then
          # matched at this level, dive deeper to the next level
          shift
          tree=$nexttree
          continue 3
        fi
      done
      # Process additional -:stuff we may have picked up.
      (( ${#substs} )) || break
      atree=()
      for elt in $substs; do
        # Strip one level of parentheses.
        elt=${${sublists[$elt]}[2,-2]}
        # Add this to the current level for further processing.
        list_word_expand_tree subtree $elt
        atree+=($subtree)
      done
      substs=()
    done
    return 1
  done
  if [[ $# -eq 0 && ${#tree} -ne 0 ]]; then
    if [[ $tree = \(*\) ]]; then
      # at this level this must be a single argument
      tree=$tree[2,-2]
    fi
    list_word_expand_tree atree $tree
    typeset -ga reply
    reply=()
    seen=()
    # discard substitutions left pending by the matching loop above
    substs=()
    while true; do
      for elt in $atree; do
        if [[ $elt = \(*\) ]]; then
          elt=$elt[2,-2]
          list_word_expand_tree subtree $elt
          reply+=(${(Q)subtree[1]})
        elif [[ $elt = "-:"* ]]; then
          elt=${(Q)elt[3,-1]}
          if [[ -z ${seen[$elt]} ]]; then
            [[ -z $sublists[$elt] ]] && return 2
            seen[$elt]=1
            substs+=($elt)
            continue
          fi
        else
          reply+=(${(Q)elt})
        fi
      done
      # Process additional -:stuff we may have picked up.
      (( ${#substs} )) || break
      atree=()
      for elt in $substs; do
        # Strip one level of parentheses.
        elt=${${sublists[$elt]}[2,-2]}
        # Add this to the current level for further processing.
        list_word_expand_tree subtree $elt
        atree+=($subtree)
      done
      substs=()
    done
    return 0
  else
    return 1
  fi
}
list_word "$@"
[-- Attachment #3: _dynamic_directory_name --]
[-- Type: application/octet-stream, Size: 680 bytes --]
#autoload
local expl SEPCHAR
local -a dirs parts reply
# Configurable bit
SEPCHAR=:
dirs=(
  "uwb-:((main) (p=buffer32))"
  "dot11-:((main) (v4.0) (v5.0))"
  "proj-:((u -:uwb) (11 -:dot11))"
  "((p1 -:proj) -:proj)"
)
# End config
local -a parts
if [[ $PREFIX = *${SEPCHAR}[^${SEPCHAR}]# ]]; then
  if [[ $SEPCHAR = . ]]; then
    eval parts=\(\"\${\(@s:${SEPCHAR}:\)PREFIX}\"\)
  else
    eval parts=\(\"\${\(@s.${SEPCHAR}.\)PREFIX}\"\)
  fi
  parts=("${(@)parts[1,-2]}")
  compset -P "*$SEPCHAR"
  # else leave parts empty and PREFIX as whatever
fi
autoload -Uz list_word
list_word $dirs "${(@)parts}"
_wanted namepart expl "Name part" compadd -S']' -r "$SEPCHAR" -- $reply