From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from doolittle.vetsci.su.OZ.AU ([129.78.148.2]) by archone.tamu.edu with SMTP id <18887>; Sat, 1 Feb 1992 11:26:17 -0600
Received: by doolittle.vetsci.su.oz.au via suspension id <49380>; Sun, 2 Feb 1992 04:25:30 +1100
Received: by doolittle.vetsci.su.oz.au id <49379>; Sun, 2 Feb 1992 03:36:31 +1100
From: John (I've got some bad news for you, sunshine) Mackin
Date: Sat, 1 Feb 1992 10:02:08 -0600
To: The rc Mailing List
cc: Tom Culliton
Subject: Re: Match operator puzzlement
In-Reply-To: <9201311601.aa02527@ceres.srg.af.mil>
Message-ID: <199202020302.19891.rc.badey@vetsci.su.oz.au>
X-Face: 39seV7n\`#asqOFdx#oj/Uz*lseO_1n9n7rQS;~ve\e`&Z},nU1+>0X^>mg&M.^X$[ez>{F k5[Ah<7xBWF-@-ru?& @4K4-b`ydd^`(n%Z{

Tom Culliton raised some interesting points about pattern matching.

    Which didn't work as planned for semi-obvious reasons involving re-scanning.

The reason a straightforward attempt doesn't work isn't really anything to do with rescanning at all, since there IS no rescanning -- don't forget that that is rc's main principle: in the absence of 'eval', which exists to break the rule, there is NEVER rescanning. The reason it doesn't work is, to quote Byron, for metacharacters in a ~ pattern to behave as metacharacters, they must appear _literally_ and _unquoted_. Nothing else will serve; no subterfuge, however subtle, will make them match unless they are literal and not quoted.

Usually this doesn't present a problem, since a simple eval suffices. Tom, however, has either a weird application (if this really is a practical problem) or a curious bent of mind (if it's just a theoretical one), since he posits:

    OOPS! I encountered a file name with a $ in it so make that

        if (eval ~ '$i' $patterns) { # etc...

    But what about patterns with $ and so forth in them?

Hmm. Filenames with $ in them? I didn't know rc had been ported to VMS :). Seriously, filenames with $ in them are not a good idea. Still, the above does deal with that.

As to patterns with $ in them, that's what makes this an interesting question. In fact, let's leave eval aside for a moment, and consider just the question of how to match a pattern with $ in it. Now,

    ~ 'get$down' 'get$down'

does work, naturally, and as naturally,

    ~ 'get$down' get$down

does not, since the $down in the pattern is variable-expanded (into nothing since I don't have that set). Everything you would expect to work, does work. All these match:

    ~ 'get$down' *n
    ~ 'get$down' *down
    ~ 'get$down' get?down

And this doesn't:

    ~ 'get$down' '*$down'

Recall the basic principle: the metacharacter must be literal and unquoted to be effective. So leaving eval aside, we have to ask this question: how can the metacharacter be unquoted, to be effective, and the $ be quoted, to prevent variable expansion? When we know the question, the answer is obvious:

    ~ 'get$down' * ^ '$down'

which does indeed match as expected.

The answer to Tom's question is simply to use the same mechanism along with eval, using the exact code of his last example:

    patterns = $1
    ...
    if ( eval ~ '$i' $patterns )

The point, though, is that if the pattern is to contain any of rc's syntax characters, appropriate quoting must be used. $ is not the only character that causes these problems; consider a pattern containing '=' -- similar hassles arise there. So one cannot just write

    cmd '*.o *.a *$bar'

but must rather write

    cmd '*.o *.a * ^ ''$bar'''

I am willing to admit that this is a touch cumbersome. However, in closing I'd like to stick up for the way rc works here. It is simple and clean and _predictable_, unlike other shells. I'd hate to even imagine trying something like this in csh.

And I'd like to just beat a little harder on an earlier point: UNIX gives us a hell of a lot of power in many ways. Not the least of those is our ability to put any character in a pathname segment other than NUL or slash.
But, as always, the converse of power is responsibility; being a properly responsible UNIX citizen means being aware that if we are going to put characters in pathnames that don't, by all rights, reasonably belong there (like $), we have to accept the consequences (our tools get harder to create, and have more work to do). Of course, the beauty of UNIX is that as long as we _are_ willing to accept the consequences, we _can_ do it. And the beauty of rc is that it's easy to see how.

OK,
John.
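
For anyone without an rc to hand, the matches reported in this message can be checked against a small model of the rule that a metacharacter is effective only when it is literal and unquoted. This is a rough sketch in Python, not rc's implementation. The name rc_match and the (text, quoted) representation of a pattern are invented here for illustration. Pieces are joined as if by ^, only * and ? are handled (no [...] classes), and variable expansion is not modelled, so an unquoted piece is assumed to contain no $.

```python
import re


def rc_match(subject, *pieces):
    """Toy model of `~ subject pattern` for a single pattern.

    Each piece is a (text, quoted) pair; the pieces are concatenated as
    if joined with rc's ^ operator. A * or ? counts as a metacharacter
    only when it sits in an unquoted piece. Everything else, including
    a quoted * or ?, must match itself exactly.
    """
    regex = []
    for text, quoted in pieces:
        for ch in text:
            if not quoted and ch == '*':
                regex.append('.*')           # unquoted *: any run of characters
            elif not quoted and ch == '?':
                regex.append('.')            # unquoted ?: any single character
            else:
                regex.append(re.escape(ch))  # literal character
    return re.fullmatch(''.join(regex), subject, re.DOTALL) is not None


QUOTED, UNQUOTED = True, False

# ~ 'get$down' *down
print(rc_match('get$down', ('*down', UNQUOTED)))               # True
# ~ 'get$down' '*$down'   (the * is quoted, so it is just a character)
print(rc_match('get$down', ('*$down', QUOTED)))                # False
# ~ 'get$down' * ^ '$down'   (the * is unquoted, the $ is quoted)
print(rc_match('get$down', ('*', UNQUOTED), ('$down', QUOTED)))  # True
```

The model tracks quoting per piece rather than per pattern, which is what lets the third call succeed: the * and the $ live in different pieces and so can be quoted differently.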