zsh-workers
* PATCH: test of improved backslash quoting
From: Peter Stephenson @ 2007-08-22 10:35 UTC
  To: Zsh hackers list

This tests the code I added to improve the output of expansion for
invalid or unprintable characters.  As I implied, but possibly didn't
make entirely clear, this affects any case where backslash quoting is
added, including the parameter (q) flag.  I've expanded the
documentation for this.
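
To make the octet-by-octet form concrete, here is a minimal standalone
sketch in C.  It is not the zsh implementation: quote_octets() is a
hypothetical helper, it treats plain ASCII 0x20-0x7e as printable instead
of consulting the locale, it special-cases only \a (the one named escape
the test below exercises), and it does not backslash-quote shell
metacharacters at all.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of zsh.  Copies IN to OUT, wrapping each
 * byte outside printable ASCII in its own $'...' quote.  Returns the
 * length written, or -1 if OUT is too small.
 */
static int quote_octets(const unsigned char *in, size_t inlen,
                        char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz > 0)
        out[0] = '\0';          /* keep OUT terminated for empty input */
    for (size_t i = 0; i < inlen; i++) {
        unsigned char c = in[i];
        int n;

        if (c >= 0x20 && c <= 0x7e)
            n = snprintf(out + used, outsz - used, "%c", c);
        else if (c == 0x07)     /* assumption: only BEL gets a named escape */
            n = snprintf(out + used, outsz - used, "$'\\a'");
        else                    /* three-digit octal, one quote per octet */
            n = snprintf(out + used, outsz - used, "$'\\%03o'", c);
        if (n < 0 || (size_t)n >= outsz - used)
            return -1;          /* would not fit */
        used += (size_t)n;
    }
    return (int)used;
}
```

Fed the bytes X, 0xc0, Y, 0x07, Z, 0x7f, T, this produces
X$'\300'Y$'\a'Z$'\177'T, the same string the new test expects.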

Note that removal of $'...' quotes is not currently well handled: this
applies both to the (Q) parameter flag and to completion, where already
added $'\...' quoting isn't properly parsed for further handling.  This
is part of the hairy patch I've been trying to do to improve completion
with nested quotes; however, the (Q) part is useful separately and a bit
more tractable, so I'll try and drag it out since it's likely to be
several orders of magnitude quicker than trying to complete the full
patch.

Index: Doc/Zsh/expn.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/expn.yo,v
retrieving revision 1.80
diff -u -r1.80 expn.yo
--- Doc/Zsh/expn.yo	27 Jul 2007 21:51:32 -0000	1.80
+++ Doc/Zsh/expn.yo	22 Aug 2007 10:30:45 -0000
@@ -792,10 +792,14 @@
 tt(${(P)${foo}}), and tt(${(P)$(echo bar)}) will be expanded to `tt(baz)'.
 )
 item(tt(q))(
-Quote the resulting words with backslashes. If this flag is given
+Quote the resulting words with backslashes; unprintable or invalid
+characters are quoted using the tt($'\)var(NNN)tt(') form, with separate
+quotes for each octet.  If this flag is given
 twice, the resulting words are quoted in single quotes and if it is
-given three times, the words are quoted in double quotes. If it is
-given four times, the words are quoted in single quotes preceded by a tt($).
+given three times, the words are quoted in double quotes; in these forms
+no special handling of unprintable or invalid characters is attempted.  If
+the flag is given four times, the words are quoted in single quotes
+preceded by a tt($).
 )
 item(tt(Q))(
 Remove one level of quotes from the resulting words.
Index: Test/D07multibyte.ztst
===================================================================
RCS file: /cvsroot/zsh/zsh/Test/D07multibyte.ztst,v
retrieving revision 1.19
diff -u -r1.19 D07multibyte.ztst
--- Test/D07multibyte.ztst	18 Jun 2007 13:25:10 -0000	1.19
+++ Test/D07multibyte.ztst	22 Aug 2007 10:30:45 -0000
@@ -378,3 +378,9 @@
 >ngs200.txt
 >ngs20.txt
 >ngs2.txt
+
+# Not strictly multibyte, but gives us a well-defined locale for testing.
+  foo=$'X\xc0Y\x07Z\x7fT'
+  print -r ${(q)foo}
+0:Backslash-quoting of unprintable/invalid characters uses $'...'
+>X$'\300'Y$'\a'Z$'\177'T


-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070



* Re: PATCH: test of improved backslash quoting
From: Peter Stephenson @ 2007-08-23 21:59 UTC
  To: Zsh hackers list

On Wed, 22 Aug 2007 11:35:53 +0100
Peter Stephenson <pws@csr.com> wrote:
> Note that removal of $'...' quotes is not currently well handled: this
> applies both to the (Q) parameter flag and to completion, where already
> added $'\...' quoting isn't properly parsed for further handling.

I was right about that.

> This is part of the hairy patch I've been trying to do to improve
> completion with nested quotes; however, the (Q) part is useful
> separately and a bit more tractable, so I'll try and drag it out since
> it's likely to be several orders of magnitude quicker than trying to
> complete the full patch.

I was wrong about that.  It's yet another separate special case.
However, it's not too hard to do with one caveat:  it assumes the result
of $'...' is shorter than the original.  That's usually the case, but
I'm worried there are pathological cases with metafied multibyte
strings.  If someone wants to fix parse_subst_string() or prove it's OK
as it is that would be splendid.  The difficulty with fixing
parse_subst_string() is that currently it acts in place and so no memory
management is necessary.  It would need some fiddling to allocate a new
string from the right memory and avoid leaks.
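
The in-place trick and its "result must not be longer" assumption can be
sketched outside zsh roughly as follows.  This is an illustration only:
splice_shrink() and unquote_dollar() are hypothetical names, the input is
an ordinary C string containing a literal $' (not zsh's tokenized
String/Snull pair), only \xHH escapes are decoded, and a decoded NUL byte
or an escaped quote inside $'...' is not handled.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Overwrite the OLDLEN bytes at S+OFF with REP (REPLEN bytes) and close
 * the gap.  Works in place, so it refuses a longer replacement.
 */
static int splice_shrink(char *s, size_t off, size_t oldlen,
                         const char *rep, size_t replen)
{
    if (replen > oldlen)
        return -1;              /* would need a bigger buffer */
    memcpy(s + off, rep, replen);
    /* Shift the tail left, including the terminating NUL. */
    memmove(s + off + replen, s + off + oldlen,
            strlen(s + off + oldlen) + 1);
    return 0;
}

/* Replace every $'...' in S by its decoded contents, in place. */
static int unquote_dollar(char *s)
{
    char *p = s;

    while ((p = strstr(p, "$'")) != NULL) {
        char *end = strchr(p + 2, '\'');
        char tmp[256];
        size_t n = 0;

        if (!end)
            return -1;          /* unterminated quote */
        for (char *q = p + 2; q < end; ) {
            if (n >= sizeof tmp)
                return -1;
            if (q[0] == '\\' && q[1] == 'x' && end - q >= 4) {
                char hex[3] = { q[2], q[3], '\0' };
                tmp[n++] = (char)strtol(hex, NULL, 16);
                q += 4;
            } else
                tmp[n++] = *q++;
        }
        size_t off = (size_t)(p - s);
        if (splice_shrink(s, off, (size_t)(end + 1 - p), tmp, n) < 0)
            return -1;
        p = s + off + n;        /* continue after the decoded text */
    }
    return 0;
}
```

Applied to the string X$'\x41'$'\x42'Y this leaves XABY in the same
buffer.  In this toy version the decoded text can never be longer than
the quoted form, so the -1 path in splice_shrink() is only a guard; the
open question for the real code is whether that also holds once
metafication is involved.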

Index: Src/lex.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/lex.c,v
retrieving revision 1.40
diff -u -r1.40 lex.c
--- Src/lex.c	14 Aug 2007 11:51:18 -0000	1.40
+++ Src/lex.c	23 Aug 2007 21:53:04 -0000
@@ -1556,6 +1556,7 @@
 parse_subst_string(char *s)
 {
     int c, l = strlen(s), err, olen, lexstop_ret;
+    char *ptr;
 
     if (!*s || !strcmp(s, nulstring))
 	return 0;
@@ -1593,6 +1594,43 @@
 	return 1;
     }
 #endif
+    /* Check for $'...' quoting.  This needs special handling. */
+    for (ptr = s; *ptr; )
+    {
+	if (*ptr == String && ptr[1] == Snull)
+	{
+	    char *t;
+	    int len, tlen, diff;
+	    t = getkeystring(ptr + 2, &len, GETKEYS_DOLLARS_QUOTE, NULL);
+	    len += 2;
+	    tlen = strlen(t);
+	    diff = len - tlen;
+	    /*
+	     * Yuk.
+	     * parse_subst_string() currently handles strings in-place.
+	     * That's not so easy to fix without knowing whether
+	     * additional memory should come off the heap or
+	     * otherwise.  So we cheat by copying the unquoted string
+	     * into place, unless it's too long.  That's not the
+	     * normal case, but I'm worried there are pathological
+	     * cases with converting metafied multibyte strings.
+	     * If someone can prove there aren't I will be very happy.
+	     */
+	    if (diff < 0) {
+		DPUTS(1, "$'...' subst too long: fix get_parse_string()");
+		return 1;
+	    }
+	    memcpy(ptr, t, tlen);
+	    ptr += tlen;
+	    if (diff > 0) {
+		char *dptr = ptr;
+		char *sptr = ptr + diff;
+		while ((*dptr++ = *sptr++))
+		    ;
+	    }
+	} else
+	    ptr++;
+    }
     return 0;
 }
 
Index: Test/D04parameter.ztst
===================================================================
RCS file: /cvsroot/zsh/zsh/Test/D04parameter.ztst,v
retrieving revision 1.27
diff -u -r1.27 D04parameter.ztst
--- Test/D04parameter.ztst	25 Jul 2007 09:26:52 -0000	1.27
+++ Test/D04parameter.ztst	23 Aug 2007 21:53:04 -0000
@@ -320,6 +320,11 @@
 0:${(Q)...}
 >and now even the pubs are shut.
 
+  foo="X$'\x41'$'\x42'Y"
+  print -r ${(Q)foo}
+0:${(Q)...} with handling of $'...'
+>XABY
+
   psvar=(dog)
   setopt promptsubst
   foo='It shouldn'\''t $(happen) to a %1v.'



-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/

