From: Peter Stephenson
To: Zsh Users
Date: Fri, 3 Jan 2014 19:48:02 +0000
Subject: Re: difference between ~ & ^ negation
Message-ID: <20140103194802.2f7cae9d@pws-pc.ntlworld.com>
In-Reply-To: <140102233726.ZM10543@torch.brasslantern.com>
List-Id: Zsh Users List
X-Seq: 18270

Here's an
FAQ entry on ^ and ~ since they are obviously confusing. Corrections or
improvements are of course gladly received, but with explicit
suggestions, even partial ones, please, not generalised complaints, as I
then don't know what to do about them.

diff --git a/Etc/FAQ.yo b/Etc/FAQ.yo
index bd8ca97..4bc5df6 100644
--- a/Etc/FAQ.yo
+++ b/Etc/FAQ.yo
@@ -122,6 +122,7 @@ Chapter 3: How to get various things to work
 3.24. What's wrong with cut and paste on my xterm?
 3.25. How do I get coloured prompts on my colour xterm?
 3.26. Why is my output duplicated with `tt(foo 2>&1 >foo.out | bar)'?
+3.27. What are these `^' and `~' pattern characters, anyway?

 Chapter 4: The mysteries of completion
 4.1. What is completion?
@@ -545,14 +546,8 @@ tt(EXTENDED_GLOB).
  option tt(KSH_GLOB) is in effect; for previous versions you must use
  the table above.

- [1] Note that mytt(~) is the only globbing operator to have a lower
- precedence than mytt(/). For example, mytt(**/foo~*bar*) matches any
- file in a subdirectory called mytt(foo), except where mytt(bar)
- occurred somewhere in the path (e.g. mytt(users/barstaff/foo) will
- be excluded by the mytt(~) operator). As the mytt(**) operator cannot
- be grouped (inside parentheses it is treated as mytt(*)), this is
- one way to exclude some subdirectories from matching a mytt(**).
- The form (^foo/)# also works.
+ [1] See question link(3.27)(327) for more on the mysteries of
+ mytt(~) and mytt(^).
  it() Unquoted assignments do file expansion after mytt(:)s (intended for
  PATHs).
  it()* mytt(typeset) and mytt(integer) have special behaviour for
@@ -1452,6 +1447,8 @@ sect(Why does mytt(bindkey ^a command-name) or mytt(stty intr ^-) do something f
  are metacharacters. tt(^a) matches any file except one called tt(a), so the
  line is interpreted as bindkey followed by a list of files. Quote the
  tt(^) with a backslash or put quotation marks around tt(^a).
+ See link(3.27)(327) if you want to know more about the pattern
+ character mytt(^).

 sect(Why can't I bind tt(\C-s) and tt(\C-q) any more?)

@@ -1668,6 +1665,7 @@ sect(How do I prevent the prompt overwriting output when there is no newline?)
  One final alternative is to put a newline in your prompt -- see question
  link(3.13)(313) for that.

+
 sect(What's wrong with cut and paste on my xterm?)

  On the majority of modern UNIX systems, cutting text from one window and
@@ -1700,6 +1698,7 @@ sect(What's wrong with cut and paste on my xterm?)
  fixes referred to above in order to be reliable).
  )

+
 sect(How do I get coloured prompts on my colour xterm?)

  (Or `color xterm', if you're reading this in black and white.)
@@ -1743,6 +1742,7 @@ sect(How do I get coloured prompts on my colour xterm?)
  `mytt([0m)' puts printing back to normal so that the rest of the line is
  unchanged.

+
 sect(Why is my output duplicated with `tt(foo 2>&1 >foo.out | bar)'?)

  This is a slightly unexpected effect of the option tt(MULTIOS), which is
@@ -1780,6 +1780,122 @@ sect(Why is my output duplicated with `tt(foo 2>&1 >foo.out | bar)'?)
  to unset the option mytt(MULTIOS).


+sect(What are these `^' and `~' pattern characters, anyway?)
+label(327)
+
+ The characters mytt(^) and mytt(~) are active when the option
+ tt(EXTENDED_GLOB) is set. Both are used to exclude patterns, i.e. to
+ say `match something other than ...'. There are some confusing
+ differences, however. Here are the descriptions for mytt(^) and mytt(~).
+
+ mytt(^) means `anything except the pattern that follows'. You can
+ think of the combination tt(^)em(pat) as being like a tt(*) except
+ that it doesn't match em(pat). So, for example, mytt(myfile^.txt)
+ matches anything that begins with tt(myfile) except tt(myfile.txt).
+ Because it works with patterns, not just strings, mytt(myfile^*.c)
+ matches anything that begins with tt(myfile) unless it ends with
+ tt(.c), whatever comes in the middle --- so it matches tt(myfile1.h)
+ but not tt(myfile1.c).
+
+ Also like mytt(*), mytt(^) doesn't match across directories if you're
+ matching files when `globbing', i.e. when you use an unquoted pattern
+ in an ordinary command line to generate file names. So
+ mytt(^dir1/^file1) matches any subdirectory of the current directory
+ except one called tt(dir1), and within each directory that matches it
+ picks any file except one called tt(file1). So the overall pattern
+ matches tt(dir2/file2) but not tt(dir1/file1) nor tt(dir1/file2) nor
+ tt(dir2/file1). (The rule that all the different bits of the pattern
+ must match is exactly the same as for any other pattern character;
+ it's just a little confusing that what em(does) match in each bit is
+ found by telling the shell em(not) to match something or other.)
+
+ As with any other pattern, a mytt(^) expression doesn't treat the
+ character `tt(/)' specially if it's not matching files, for example
+ when pattern matching in a command like mytt([[ $string = ^pat1/pat2 ]]).
+ Here the whole string tt(pat1/pat2) is treated as the argument that
+ follows the mytt(^). So anything matches except that one string
+ tt(pat1/pat2).
+
+ It's not obvious what something like mytt([[ $string = ^pat1^pat2 ]])
+ means. You won't often have cause to use it, but the rule is that
+ each mytt(^) takes em(everything) that follows as an argument (unless
+ it's already inside parentheses --- I'll explain this below). To see
+ this more clearly, put those arguments in parentheses: the pattern is
+ equivalent to mytt(^(pat1^(pat2))), where now you can see exactly what
+ each mytt(^) takes as its argument. I'll leave it as an exercise for
+ you to work out what this does and doesn't match.
+
+ mytt(~) is always used between two patterns --- never right at the
+ beginning or right at the end. Note that the other special meaning of
+ mytt(~), at the start of a filename to refer to your home directory or
+ to another named directory, doesn't require the option
+ tt(EXTENDED_GLOB) to be set. (At the end of an argument mytt(~) is
+ never special at all. This is useful if you have Emacs backup files.)
+ As a pattern character it means `match what's in front of the tilde,
+ but only if it doesn't match what's after the tilde'. So mytt(*.c~f*)
+ matches any file ending in tt(.c) except one that begins with tt(f).
+ You'll see that, unlike mytt(^), the parts before and after the
+ mytt(~) both refer separately to the entire test string.
+
+ For matching files by globbing, mytt(~) is the only globbing
+ operator to have a lower precedence than mytt(/). In other words,
+ when you have mytt(/a/path/to/match~/a/path/not/to/match) the mytt(~)
+ considers what's before and what's after as complete paths to file names.
+ You can put any other pattern characters in the expressions before and
+ after the mytt(~), but note that the pattern after the tt(~) is really
+ just a single pattern to match against every file found. That means,
+ for example, that a tt(*) after the tt(~) em(will) match a tt(/).
+ If that's confusing, you can think of how mytt(~) works like this:
+ take the pattern on the left, use it as normal to make a list of
+ files, then for each file found see if it matches the pattern on the
+ right and, if it does, take that file out of the list.
+
+ One rule that is common to both mytt(^) and mytt(~) is that they can
+ be put inside parentheses and the arguments to them don't extend past
+ the parentheses. So mytt((^README).txt) matches any file ending in
+ tt(.txt) unless the part before that is exactly tt(README), the same as
+ mytt(*.txt~README.txt). Likewise, mytt(abc(<->~<10-100>).txt) matches a
+ file consisting of tt(abc), then some digits, then tt(.txt), unless
+ the digits happen to match a number from 10 to 100 inclusive (remember
+ the handy mytt(<->) pattern for matching integers with optional limits
+ to the range). So this pattern matches tt(abc1.txt) or tt(abc200.txt)
+ but not tt(abc20.txt) nor tt(abc100.txt) nor even tt(abc0030.txt).
+ However, if you're matching files by globbing, note that you can't put
+ mytt(/)s inside the parentheses since the groups can't stretch across
+ multiple directories. (You can do that, of course, whenever the
+ character mytt(/) isn't special.)
+
+ You may like to know that from zsh 5.0.2 you can disable any pattern
+ character separately. So if you find mytt(^) gets in your way and
+ you're happy using mytt(~), put mytt(disable -p "^") in tt(~/.zshrc).
+ You still need to turn on tt(EXTENDED_GLOB); the tt(disable) command
+ only deactivates things that would otherwise be active; you can't
+ specially enable something not allowed by the syntax options in effect.
+
+ Here are some examples with files to illustrate the points. We'll
+ assume the option tt(EXTENDED_GLOB) is set.
+
+ enumerate(
+ myeit() mytt(**/foo~*bar*) matches any file called mytt(foo) in any
+ subdirectory, except where mytt(bar) occurs somewhere in the path
+ (e.g. mytt(users/barstaff/foo) will be excluded by the mytt(~)
+ operator). As the mytt(**) operator cannot be grouped (inside
+ parentheses it is treated as mytt(*)), this is one way to exclude some
+ subdirectories from matching a mytt(**). Note that this can be quite
+ inefficient because the shell performs a complete search for
+ mytt(**/foo) before it uses the pattern after the mytt(~) to exclude
+ files from the match. The file is excluded if mytt(bar) occurs
+ em(anywhere), in any directory segment or the final file name.
+ myeit() The form mytt((^foo/)#) can be used to match any hierarchy of
+ files where none of the path components is tt(foo). For
+ example, mytt((^CVS/)#) selects all subdirectories to any depth
+ except where one component is named mytt(CVS). (The form
+ mytt((pat/)#) is very useful in other cases; for example,
+ mytt((../)#.cvsignore) finds the file tt(.cvsignore) if it exists
+ in the current directory or any parent.)
+ )
+
+
 chapter(The mysteries of completion)

-- 
Peter Stephenson
Web page now at http://homepage.ntlworld.com/p.w.stephenson/