zsh-workers
 help / color / mirror / code / Atom feed
* print -D and ${(D)} quoting
@ 2010-10-12 16:52 Peter Stephenson
  2010-10-12 20:53 ` Peter Stephenson
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Stephenson @ 2010-10-12 16:52 UTC (permalink / raw)
  To: Zsh hackers list

I noticed, via cdr, that the file "/home/pws/with space" turned into
"~/with space" when you used "print -D".  This seems to me wrong --- as
it stands, the first part of the string needs to be used literally when
passed to the shell as an argument for re-evaluation, while the second
part needs to be quoted or it will be split into two words.  So the
result is inconsistent and there's no easy rule by which to interpret
the resulting output.  This isn't specific to space, it's true of any
special character.

As I was thinking about fixing this it occurred to me that most of the
time I can fix it by using tricks with the parameter expansion (D) flag,
although the (D) flag only got added in 28025 in June, which I think is
why I didn't use it before.  They're not completely trivial to do
consistently, however.

Still, the real fix isn't necessarily trivial to do consistently,
either, so that's why I'm asking about it.  In particular, unless I
reorder the code in bin_print(), "-D" processing happens after any
stripping of backslash that happens because -r is not present.  That
looks the wrong way round, certainly if I implement the fix.  What I
mean is, the new behaviour after the basic fix to add quoting is this:

% print -rD "/home/pws/with space"
~/with\ space

(because of -r)

% print -D "/home/pws/with space"
~/with\ space                # actual result
~/with space                 # expected result on general moral principles

So I think I should reverse the order of handling of -D and not--r.  That
may be easier said than done, the logic in bin_print() is somewhat
tortuous.

Presumably if I change print -D I should also change the (D) flag
similarly.  I think that's less of an issue because it's easy to nest
the effects of parameter flags.  Further, quote munging is a more
familiar effect in parameter substitution than it is in "print"
statements.

Plan B would be just to change the (D) flag and use that in "cdr -l";
there's no release of the shell with (D) in yet.

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK


Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: print -D and ${(D)} quoting
  2010-10-12 16:52 print -D and ${(D)} quoting Peter Stephenson
@ 2010-10-12 20:53 ` Peter Stephenson
  2010-10-13  1:56   ` Bart Schaefer
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Stephenson @ 2010-10-12 20:53 UTC (permalink / raw)
  To: zsh-workers

Here, I think, is the uncontroversial part, that changes the (D)
substitution flag.

I spent some time wondering whether finddir() took an unmetafied or a
metafied path, and eventually decided it was metafied.  This resulted in
one fix and various notes elsewhere; it means the current "print -D" code
is actually broken in this respect.

Index: Doc/Zsh/expn.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/expn.yo,v
retrieving revision 1.119
diff -p -u -r1.119 expn.yo
--- Doc/Zsh/expn.yo	2 Sep 2010 08:46:27 -0000	1.119
+++ Doc/Zsh/expn.yo	12 Oct 2010 20:48:01 -0000
@@ -774,8 +774,10 @@ that result from field splitting.
 )
 item(tt(D))(
 Assume the string or array elements contain directories and attempt
-to substitute the leading part of these by names.  This is the reverse
-of `tt(~)' substitution:  see
+to substitute the leading part of these by names.  The remainder of
+the path (the whole of it if the leading part was not subsituted)
+is then quoted so that the whole string can be used as a shell
+argument.  This is the reverse of `tt(~)' substitution:  see
 ifnzman(noderef(Filename Expansion))\
 ifzman(the section FILENAME EXPANSION below).
 )
Index: Functions/Chpwd/cdr
===================================================================
RCS file: /cvsroot/zsh/zsh/Functions/Chpwd/cdr,v
retrieving revision 1.1
diff -p -u -r1.1 cdr
--- Functions/Chpwd/cdr	9 Jul 2010 14:47:48 -0000	1.1
+++ Functions/Chpwd/cdr	12 Oct 2010 20:48:01 -0000
@@ -291,7 +291,7 @@ if (( list )); then
   dirs=($reply)
   for (( i = 1; i <= ${#dirs}; i++ )); do
     print -n ${(r.5.)i}
-    print -D ${dirs[i]}
+    print -r ${(D)dirs[i]}
   done
   return
 fi
Index: Src/builtin.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/builtin.c,v
retrieving revision 1.244
diff -p -u -r1.244 builtin.c
--- Src/builtin.c	19 Sep 2010 19:20:33 -0000	1.244
+++ Src/builtin.c	12 Oct 2010 20:48:02 -0000
@@ -3704,6 +3704,7 @@ bin_print(char *name, char **args, Optio
 	    Nameddir d;
 
 	    queue_signals();
+	    /* TODO: finddir takes a metafied file */
 	    d = finddir(args[n]);
 	    if(d) {
 		int dirlen = strlen(d->dir);
Index: Src/utils.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/utils.c,v
retrieving revision 1.245
diff -p -u -r1.245 utils.c
--- Src/utils.c	15 Jul 2010 18:44:13 -0000	1.245
+++ Src/utils.c	12 Oct 2010 20:48:03 -0000
@@ -807,6 +807,8 @@ fprintdir(char *s, FILE *f)
 /*
  * Substitute a directory using a name.
  * If there is none, return the original argument.
+ *
+ * At this level all strings involved are metafied.
  */
 
 /**/
@@ -816,8 +818,9 @@ substnamedir(char *s)
     Nameddir d = finddir(s);
 
     if (!d)
-	return s;
-    return zhtricat("~", d->node.nam, s + strlen(d->dir));
+	return quotestring(s, NULL, QT_BACKSLASH);
+    return zhtricat("~", d->node.nam, quotestring(s + strlen(d->dir),
+						  NULL, QT_BACKSLASH));
 }
 
 
@@ -874,9 +877,13 @@ finddir_scan(HashNode hn, UNUSED(int fla
     }
 }
 
-/* See if a path has a named directory as its prefix. *
- * If passed a NULL argument, it will invalidate any  *
- * cached information.                                */
+/*
+ * See if a path has a named directory as its prefix.
+ * If passed a NULL argument, it will invalidate any 
+ * cached information.
+ *
+ * s here is metafied.
+ */
 
 /**/
 Nameddir
@@ -915,9 +922,7 @@ finddir(char *s)
     scanhashtable(nameddirtab, 0, 0, 0, finddir_scan, 0);
 
     if (func) {
-	char *dir_meta = metafy(finddir_full, strlen(finddir_full),
-				META_ALLOC);
-	char **ares = subst_string_by_func(func, "d", dir_meta);
+	char **ares = subst_string_by_func(func, "d", finddir_full);
 	int len;
 	if (ares && arrlen(ares) >= 2 &&
 	    (len = (int)zstrtol(ares[1], NULL, 10)) > finddir_best) {
@@ -928,8 +933,6 @@ finddir(char *s)
 	    finddir_last->diff = len - strlen(finddir_last->node.nam);
 	    finddir_best = len;
 	}
-	if (dir_meta != finddir_full)
-	    zsfree(dir_meta);
     }
 
     return finddir_last;
@@ -1039,6 +1042,10 @@ getnameddir(char *name)
     return NULL;
 }
 
+/*
+ * Compare directories.  Both are metafied.
+ */
+
 /**/
 static int
 dircmp(char *s, char *t)
@@ -4607,7 +4614,7 @@ addunprintable(char *v, const char *u, c
 }
 
 /*
- * Quote the string s and return the result.
+ * Quote the string s and return the result as a string from the heap.
  *
  * If e is non-zero, the
  * pointer it points to may point to a position in s and in e the position


-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: print -D and ${(D)} quoting
  2010-10-12 20:53 ` Peter Stephenson
@ 2010-10-13  1:56   ` Bart Schaefer
  2010-10-13  9:04     ` Peter Stephenson
  0 siblings, 1 reply; 6+ messages in thread
From: Bart Schaefer @ 2010-10-13  1:56 UTC (permalink / raw)
  To: zsh-workers

On Oct 12,  9:53pm, Peter Stephenson wrote:
}
} Here, I think, is the uncontroversial part, that changes the (D)
} substitution flag.

Naturally, asserting lack of controversy means I have a question ...

} Index: Doc/Zsh/expn.yo
} ===================================================================
}  item(tt(D))(
}  Assume the string or array elements contain directories and attempt
} +to substitute the leading part of these by names.  The remainder of
} +the path (the whole of it if the leading part was not subsituted)
} +is then quoted so that the whole string can be used as a shell
} +argument.  This is the reverse of `tt(~)' substitution:  see
}  ifnzman(noderef(Filename Expansion))\
}  ifzman(the section FILENAME EXPANSION below).
}  )

Is quoting the rest of the string really the correct thing to do?
Consider ${${(D)foo}:r} or ${=${(D)foo}}.  Why do you believe 
${(Q)$(D)foo}} should be necessary to operate on the original
string, modulo tilde contraction?  Is it because that's easier
on a zsh script programmer than figuring out where to use (q)?
Perhaps steps 13 and 14 from the "rules" should be combined in
some way, so (qD) together differs from either used alone?  

Non-sequitur:

The implementation of the (q-) flag is still a bit confusing; the
form (q-q) behaves like (qq), but (qq-) is "error in flags" whereas
(q-qq), (q-qqq), etc. for any number of "q" after the "-", reports:
    Src/utils.c:4685: BUG: bad quote type in quotestring
and then behaves like (qq).


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: print -D and ${(D)} quoting
  2010-10-13  1:56   ` Bart Schaefer
@ 2010-10-13  9:04     ` Peter Stephenson
  2010-10-13 16:14       ` Bart Schaefer
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Stephenson @ 2010-10-13  9:04 UTC (permalink / raw)
  To: zsh-workers

On Tue, 12 Oct 2010 18:56:49 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> Is quoting the rest of the string really the correct thing to do?

My contention is that a string of the form "~/with space" is neither nowt
nor something.

> Consider ${${(D)foo}:r} or ${=${(D)foo}}.

${(D)${foo:r}} is the right way round, since the initial $foo is (by
hypothesis) a path understood by the OS, while ${(D)foo} isn't, even
without the quoting.  But why are you doing file operations on a
parameter and then turning it into something that's no longer a file
(or, perhaps better put given what :r is typically used for, that
doesn't allow you to construct a file name you can use in a shell
expression)?

I don't know what ${=${(D)foo}} is supposed to be useful for.

> Why do you believe 
> ${(Q)$(D)foo}} should be necessary to operate on the original
> string, modulo tilde contraction?  Is it because that's easier
> on a zsh script programmer than figuring out where to use (q)?

Yes.  With dynamic directory naming you can have
~[complexexpression]/stuff.  It's valid to have a "/" in the complex
expression.  It's not a simple obvious operation to get a correctly
formatted string even without that.

Under what circumstances do you think it's useful to have a tilde
expression that's not usable as a shell command line argument, anyway?
Surely this can only be cosmetic, in which case having to add (Q) if you
really want a string purely for display is perfectly reasonable?

> Perhaps steps 13 and 14 from the "rules" should be combined in
> some way, so (qD) together differs from either used alone?  

That looks a messier solution to me, though not unworkable given that
you also have the safe option of ~${(q)${(D)...}}.

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK


Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: print -D and ${(D)} quoting
  2010-10-13  9:04     ` Peter Stephenson
@ 2010-10-13 16:14       ` Bart Schaefer
  2010-10-13 16:45         ` Peter Stephenson
  0 siblings, 1 reply; 6+ messages in thread
From: Bart Schaefer @ 2010-10-13 16:14 UTC (permalink / raw)
  To: zsh-workers

On Oct 13, 10:04am, Peter Stephenson wrote:
}
} On Tue, 12 Oct 2010 18:56:49 -0700
} Bart Schaefer <schaefer@brasslantern.com> wrote:
} > Is quoting the rest of the string really the correct thing to do?
} 
} My contention is that a string of the form "~/with space" is neither
} nowt nor something.

As that stands, I agree (though I'm not sure whether you intend to
include the quote marks as part of the form).

However, what we have is not simply a string.  It's the value of a
parameter, and therefore subject to all the rules for when (not) to
split it or expand it.

BTW my intention here is not to insist the patch is wrong, merely to
understand why it expresses the preferred semantics.
 
} > Consider ${${(D)foo}:r} or ${=${(D)foo}}.
} 
} ${(D)${foo:r}} is the right way round, since the initial $foo is (by
} hypothesis) a path understood by the OS, while ${(D)foo} isn't

Neither (D) nor :r truly care whether the value they're operating upon
is a path understood by the OS, but in that sense arguably (D) is more
rather than less dependent on path semantics than :r.

} even without the quoting. But why are you doing file operations on a
} parameter and then turning it into something that's no longer a file

Maybe my problem is that I can't conceive of (D) as a file operation.
As far as I can tell as soon as you use (D) the result has been turned
into something that's no longer a file, and therefore it shouldn't be
treated specially.

Skipping ahead for an interlude:

} Under what circumstances do you think it's useful to have a tilde
} expression that's not usable as a shell command line argument, anyway?

What does "usable as a shell command line argument" mean here?  Why
convert something into a contraction and then back again if you're
operating entirely within a shell program?

} Surely this can only be cosmetic, in which case having to add (Q) if
} you really want a string purely for display is perfectly reasonable?

The ONLY reason I understand for using (D) is for human consumption,
e.g., for display, so from my point of view (Q) is never not required.

Unless you're positing a case like

    somevar="/some/path/or/other/with spaces"
    hash -d foo=/some/path/or/other
    x=${(D)somevar}
    hash -d foo=/a/different/path/now
    blather $~x

I guess that's sort of what dynamic directory naming accomplishes, so
maybe that's not as farfetched as I first thought.

However, even in that case having backslashes in the value of $x is
not the right thing, is it?  With your proposed change the argument
passed to blather is going to include the literal backslash before
the space, because of the parameter rules.  Unless I misunderstand
something, that last line would still have to be one of

    blather ${(Q)~x}
or
    eval blather $x

instead?  If that's the usage for which you want to optimize, that's
OK, I'm just wondering why; dynamic directory naming may be the
answer.  That's something I never use, so ...

} (or, perhaps better put given what :r is typically used for, that
} doesn't allow you to construct a file name you can use in a shell
} expression)?

I was thinking of the case where the expansion of ~name itself has
a file extension.  In that case ${(D)${foo:r}} and ${${(D)foo}:r}
may be quite different when [[ "$foo" == "$(echo -n ~name)" ]].

} I don't know what ${=${(D)foo}} is supposed to be useful for.

I was only demonstrating that if you split up the result of (D)
when quoted as you propose, you get backslashes in potentially
unexpected places.
 
} Yes.  With dynamic directory naming you can have
} ~[complexexpression]/stuff.  It's valid to have a "/" in the complex
} expression.  It's not a simple obvious operation to get a correctly
} formatted string even without that.

So your patch ends up quoting everything after the [complexexpression],
but not quoting ~[complexexpression] itself.  Becaue you then need to
do some programmatic operations on the value that don't involve simple
$~x replacement?


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: print -D and ${(D)} quoting
  2010-10-13 16:14       ` Bart Schaefer
@ 2010-10-13 16:45         ` Peter Stephenson
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Stephenson @ 2010-10-13 16:45 UTC (permalink / raw)
  To: zsh-workers

On Wed, 13 Oct 2010 09:14:42 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> The ONLY reason I understand for using (D) is for human consumption,
> e.g., for display, so from my point of view (Q) is never not required.

Ah, this is probably the key point of why we're looking at this
differently.  No, the resulting string is used in completion.
So the intention is for the munged value to be both
useful and readable.

The case I'm interested in (and this is actually the only time I've used
the (D) flag so far, so I can't really compare with other possible uses)
is:

- take PWD
- save the value to a file
- use that value later on for completion

but in the process tidy up the value to something that's more readable
for the user, so doing the ~ contraction. (It doesn't really matter here
whether that's at the second or third step, in fact it's at the third).

One oddity here is that we're completing the full path in one go --- the
choices are absolute paths which could be anywhere in the file system.
This is why having a compact prefix becomes important.

In this particular case it's clear cut that either I supply
a raw file string, and allow completion to do the quoting, or I
supply a properly quoted command line argument, and make completion
insert it literally, since completion will either quote everything or
nothing.

So the new code gets me "~/with\ space" or whatever, and I
tell completion to insert that literally.

I don't necessarily know this is going to be widely used but I
still don't a good example of why doing it differently is likely to be
more useful.

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK


Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-10-13 17:16 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-10-12 16:52 print -D and ${(D)} quoting Peter Stephenson
2010-10-12 20:53 ` Peter Stephenson
2010-10-13  1:56   ` Bart Schaefer
2010-10-13  9:04     ` Peter Stephenson
2010-10-13 16:14       ` Bart Schaefer
2010-10-13 16:45         ` Peter Stephenson

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).