9front - general discussion about 9front
 help / color / mirror / Atom feed
* [9front] PATCH walk(1) to (optionally) quote name and path
@ 2023-08-12  3:25 Romano
  2023-08-12  5:46 ` ori
  2023-08-12 13:21 ` hiro
  0 siblings, 2 replies; 9+ messages in thread
From: Romano @ 2023-08-12  3:25 UTC (permalink / raw)
  To: 9front

I'm working with files on a backup drive that have spaces and
quotation marks in their names, using walk(1).  I can pipe to sed
s/''''/''''''''''/g and do similarly for other characters requiring
escaping for rc(1), but figured patching walk(1) would be better.  I
don't like how I just copied case 'p': the quoting "works", but isn't
as clean as the output for ls(1).  For example:

Desktop/Movies/'A Bug''s Life.mp4'
'Desktop/Movies/TV/TV Library.tvlibrary'/'Library Preferences.tvdb'
Desktop/Movies/'The American President.mp4'

instead of:

'Desktop/Movies/A Bug''s Life.mp4'
'Desktop/Movies/TV/TV Library.tvlibrary/Library Preferences.tvdb'
'Desktop/Movies/The American President.mp4'

But it's better than what happens now, where there's no quoting.

From: Romano <unobe@cpan.org>
Date: Sat, 12 Aug 2023 03:18:22 +0000
Subject: [PATCH] walk: quote name and paths using N and P

---
diff deb39a43ae5ed09c7971726cedeb06a9f65ccc6d 2da52094832662c533f60629813dbd5ac805e3ad
--- a/sys/man/1/walk
+++ b/sys/man/1/walk
@@ -81,8 +81,14 @@
 .B n
 final path element (name)
 .TP
+.B N
+final path element (name), rc (1) quoted when necessary
+.TP
 .B p
 path
+.TP
+.B P
+path, rc (1) quoted when necessary
 .TP
 .B q
 qid path.version.type (see
--- a/sys/src/cmd/walk.c
+++ b/sys/src/cmd/walk.c
@@ -43,7 +43,7 @@
 void
 dofile(char *path, Dir *f, int pathonly)
 {
-	char *p;
+	char *p, *quoted;
 
 	if(
 		(f == dotdir)
@@ -60,6 +60,10 @@
 		case 'a': Bprint(bout, "%uld", f->atime); break;
 		case 'm': Bprint(bout, "%uld", f->mtime); break;
 		case 'n': Bwrite(bout, f->name, strlen(f->name)); break;
+		case 'N':
+			quoted = quotestrdup(f->name);
+			Bwrite(bout, quoted, strlen(quoted));
+			break;
 		case 'p':
 			if(path != dotpath)
 				Bwrite(bout, path, strlen(path));
@@ -69,6 +73,18 @@
 				Bwrite(bout, f->name, strlen(f->name));
 			}
 			break;
+		case 'P':
+			if(path != dotpath) {
+				quoted = quotestrdup(path);
+				Bwrite(bout, quoted, strlen(quoted));
+			}
+			if(! (f->qid.type & QTDIR) && !pathonly){
+				if(path != dotpath)
+					Bputc(bout, '/');
+				quoted = quotestrdup(f->name);
+				Bwrite(bout, quoted, strlen(quoted));
+			}
+			break;
 		case 'q': Bprint(bout, "%ullx.%uld.%.2uhhx", f->qid.path, f->qid.vers, f->qid.type); break;
 		case 's': Bprint(bout, "%lld", f->length); break;
 		case 'x': Bprint(bout, "%M", f->mode); break;
@@ -243,7 +259,7 @@
 		if((stfmt = s_reset(stfmt)) == nil)
 			sysfatal("s_reset: %r");
 		s_append(stfmt, EARGF(usage()));
-		i = strspn(s_to_c(stfmt), "UGMamnpqsxDT");
+		i = strspn(s_to_c(stfmt), "UGMamnNpPqsxDT");
 		if(i != s_len(stfmt))
 			sysfatal("bad stfmt: %s", s_to_c(stfmt));
 		break;


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [9front] PATCH walk(1) to (optionally) quote name and path
  2023-08-12  3:25 [9front] PATCH walk(1) to (optionally) quote name and path Romano
@ 2023-08-12  5:46 ` ori
  2023-08-12  6:34   ` unobe
  2023-08-12  6:44   ` unobe
  2023-08-12 13:21 ` hiro
  1 sibling, 2 replies; 9+ messages in thread
From: ori @ 2023-08-12  5:46 UTC (permalink / raw)
  To: 9front

Quoth Romano <unobe@cpan.org>:
> I'm working with files on a backup drive that have spaces and
> quotation marks in their names, using walk(1).  I can pipe to sed
> s/''''/''''''''''/g and do similarly for other characters requiring
> escaping for rc(1), but figured patching walk(1) would be better.

generally no objections, though it's usually relatively easy to
structure scripts such that there's relatively little that is
interpreted by rc; is this for interactive use?

> I don't like how I just copied case 'p': the quoting "works", but isn't
> as clean as the output for ls(1).

I'll need to look at the code to see if there's a way to
improve this; regardless, a few comments inline:

> But it's better than what happens now, where there's no quoting.
> 
> From: Romano <unobe@cpan.org>
> Date: Sat, 12 Aug 2023 03:18:22 +0000
> Subject: [PATCH] walk: quote name and paths using N and P
> 
> ---
> diff deb39a43ae5ed09c7971726cedeb06a9f65ccc6d 2da52094832662c533f60629813dbd5ac805e3ad
> --- a/sys/man/1/walk
> +++ b/sys/man/1/walk
> @@ -81,8 +81,14 @@
>  .B n
>  final path element (name)
>  .TP
> +.B N
> +final path element (name), rc (1) quoted when necessary
> +.TP
>  .B p
>  path
> +.TP
> +.B P
> +path, rc (1) quoted when necessary
>  .TP
>  .B q
>  qid path.version.type (see
> --- a/sys/src/cmd/walk.c
> +++ b/sys/src/cmd/walk.c
> @@ -43,7 +43,7 @@
>  void
>  dofile(char *path, Dir *f, int pathonly)
>  {
> -	char *p;
> +	char *p, *quoted;

nitpicking on style; perhaps '*q' instead of '*quoted'

>  
>  	if(
>  		(f == dotdir)
> @@ -60,6 +60,10 @@
>  		case 'a': Bprint(bout, "%uld", f->atime); break;
>  		case 'm': Bprint(bout, "%uld", f->mtime); break;
>  		case 'n': Bwrite(bout, f->name, strlen(f->name)); break;
> +		case 'N':
> +			quoted = quotestrdup(f->name);
> +			Bwrite(bout, quoted, strlen(quoted));

looks like this leaks 'quoted'; same with all other quotestrdup calls.

> +			break;
>  		case 'p':
>  			if(path != dotpath)
>  				Bwrite(bout, path, strlen(path));
> @@ -69,6 +73,18 @@
>  				Bwrite(bout, f->name, strlen(f->name));
>  			}
>  			break;
> +		case 'P':

this quoting could be done by concating the path and the
dotpath, possibly even with libstring. Not sure if it'd
be cleaner to dedup the 'p' and 'P' cases or not yet.

> +			if(path != dotpath) {
> +				quoted = quotestrdup(path);
> +				Bwrite(bout, quoted, strlen(quoted));
> +			}
> +			if(! (f->qid.type & QTDIR) && !pathonly){
> +				if(path != dotpath)
> +					Bputc(bout, '/');
> +				quoted = quotestrdup(f->name);
> +				Bwrite(bout, quoted, strlen(quoted));
> +			}
> +			break;
>  		case 'q': Bprint(bout, "%ullx.%uld.%.2uhhx", f->qid.path, f->qid.vers, f->qid.type); break;
>  		case 's': Bprint(bout, "%lld", f->length); break;
>  		case 'x': Bprint(bout, "%M", f->mode); break;
> @@ -243,7 +259,7 @@
>  		if((stfmt = s_reset(stfmt)) == nil)
>  			sysfatal("s_reset: %r");
>  		s_append(stfmt, EARGF(usage()));
> -		i = strspn(s_to_c(stfmt), "UGMamnpqsxDT");
> +		i = strspn(s_to_c(stfmt), "UGMamnNpPqsxDT");
>  		if(i != s_len(stfmt))
>  			sysfatal("bad stfmt: %s", s_to_c(stfmt));
>  		break;
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [9front] PATCH walk(1) to (optionally) quote name and path
  2023-08-12  5:46 ` ori
@ 2023-08-12  6:34   ` unobe
  2023-08-12  6:44   ` unobe
  1 sibling, 0 replies; 9+ messages in thread
From: unobe @ 2023-08-12  6:34 UTC (permalink / raw)
  To: 9front

Thanks for the feedback, Ori.  I'm not a C programmer, and it shows.
See my replies below.

Quoth ori@eigenstate.org:
> Quoth Romano <unobe@cpan.org>:
> > I'm working with files on a backup drive that have spaces and
> > quotation marks in their names, using walk(1).  I can pipe to sed
> > s/''''/''''''''''/g and do similarly for other characters requiring
> > escaping for rc(1), but figured patching walk(1) would be better.
> 
> generally no objections, though it's usually relatively easy to
> structure scripts such that there's relatively little that is
> interpreted by rc; is this for interactive use?
> 
> > I don't like how I just copied case 'p': the quoting "works", but isn't
> > as clean as the output for ls(1).
> 
> I'll need to look at the code to see if there's a way to
> improve this; regardless, a few comments inline:
> 
> > But it's better than what happens now, where there's no quoting.
> > 
> > From: Romano <unobe@cpan.org>
> > Date: Sat, 12 Aug 2023 03:18:22 +0000
> > Subject: [PATCH] walk: quote name and paths using N and P
> > 
> > ---
> > diff deb39a43ae5ed09c7971726cedeb06a9f65ccc6d 2da52094832662c533f60629813dbd5ac805e3ad
> > --- a/sys/man/1/walk
> > +++ b/sys/man/1/walk
> > @@ -81,8 +81,14 @@
> >  .B n
> >  final path element (name)
> >  .TP
> > +.B N
> > +final path element (name), rc (1) quoted when necessary
> > +.TP
> >  .B p
> >  path
> > +.TP
> > +.B P
> > +path, rc (1) quoted when necessary
> >  .TP
> >  .B q
> >  qid path.version.type (see
> > --- a/sys/src/cmd/walk.c
> > +++ b/sys/src/cmd/walk.c
> > @@ -43,7 +43,7 @@
> >  void
> >  dofile(char *path, Dir *f, int pathonly)
> >  {
> > -	char *p;
> > +	char *p, *quoted;
> 
> nitpicking on style; perhaps '*q' instead of '*quoted'

Got it--be terse.

> 
> >  
> >  	if(
> >  		(f == dotdir)
> > @@ -60,6 +60,10 @@
> >  		case 'a': Bprint(bout, "%uld", f->atime); break;
> >  		case 'm': Bprint(bout, "%uld", f->mtime); break;
> >  		case 'n': Bwrite(bout, f->name, strlen(f->name)); break;
> > +		case 'N':
> > +			quoted = quotestrdup(f->name);
> > +			Bwrite(bout, quoted, strlen(quoted));
> 
> looks like this leaks 'quoted'; same with all other quotestrdup calls.

Okay, so if I'm dup'ing a string, there's memory allocated that needs
to be free()'d.

> 
> > +			break;
> >  		case 'p':
> >  			if(path != dotpath)
> >  				Bwrite(bout, path, strlen(path));
> > @@ -69,6 +73,18 @@
> >  				Bwrite(bout, f->name, strlen(f->name));
> >  			}
> >  			break;
> > +		case 'P':
> 
> this quoting could be done by concating the path and the
> dotpath, possibly even with libstring. Not sure if it'd
> be cleaner to dedup the 'p' and 'P' cases or not yet.

Yeah, I was thinking about consolidating the code for 'p' and 'P', but
figured the path of least resistance is to first just copy-and-modify.
Why not use sprint() instead of concat, since I need to interpose '/'
for the path? Anyway, see below for a patch to my patch: does it look
better?

> 
> > +			if(path != dotpath) {
> > +				quoted = quotestrdup(path);
> > +				Bwrite(bout, quoted, strlen(quoted));
> > +			}
> > +			if(! (f->qid.type & QTDIR) && !pathonly){
> > +				if(path != dotpath)
> > +					Bputc(bout, '/');
> > +				quoted = quotestrdup(f->name);
> > +				Bwrite(bout, quoted, strlen(quoted));
> > +			}
> > +			break;
> >  		case 'q': Bprint(bout, "%ullx.%uld.%.2uhhx", f->qid.path, f->qid.vers, f->qid.type); break;
> >  		case 's': Bprint(bout, "%lld", f->length); break;
> >  		case 'x': Bprint(bout, "%M", f->mode); break;
> > @@ -243,7 +259,7 @@
> >  		if((stfmt = s_reset(stfmt)) == nil)
> >  			sysfatal("s_reset: %r");
> >  		s_append(stfmt, EARGF(usage()));
> > -		i = strspn(s_to_c(stfmt), "UGMamnpqsxDT");
> > +		i = strspn(s_to_c(stfmt), "UGMamnNpPqsxDT");
> >  		if(i != s_len(stfmt))
> >  			sysfatal("bad stfmt: %s", s_to_c(stfmt));
> >  		break;
> > 
> 

From: Romano <unobe@cpan.org>
Date: Sat, 12 Aug 2023 06:32:08 +0000
Subject: [PATCH] walk: use quotefmt and free()

---
diff 2da52094832662c533f60629813dbd5ac805e3ad c371afbc8bcf6882fb1e77b5ddfefb198e331a8f
--- a/sys/src/cmd/walk.c
+++ b/sys/src/cmd/walk.c
@@ -43,7 +43,7 @@
 void
 dofile(char *path, Dir *f, int pathonly)
 {
-	char *p, *quoted;
+	char *p, *q;
 
 	if(
 		(f == dotdir)
@@ -61,8 +61,7 @@
 		case 'm': Bprint(bout, "%uld", f->mtime); break;
 		case 'n': Bwrite(bout, f->name, strlen(f->name)); break;
 		case 'N':
-			quoted = quotestrdup(f->name);
-			Bwrite(bout, quoted, strlen(quoted));
+			Bprint(bout, "%q", f->name);
 			break;
 		case 'p':
 			if(path != dotpath)
@@ -74,16 +73,13 @@
 			}
 			break;
 		case 'P':
-			if(path != dotpath) {
-				quoted = quotestrdup(path);
-				Bwrite(bout, quoted, strlen(quoted));
-			}
-			if(! (f->qid.type & QTDIR) && !pathonly){
-				if(path != dotpath)
-					Bputc(bout, '/');
-				quoted = quotestrdup(f->name);
-				Bwrite(bout, quoted, strlen(quoted));
-			}
+			if(!pathonly && path != dotpath && ! (f->qid.type & QTDIR)) {
+				q = smprint("%s/%s", path, f->name);
+				Bprint(bout, "%q", q);
+				free(q);
+}
+			else if (path != dotpath)
+				Bprint(bout, "%q", path);
 			break;
 		case 'q': Bprint(bout, "%ullx.%uld.%.2uhhx", f->qid.path, f->qid.vers, f->qid.type); break;
 		case 's': Bprint(bout, "%lld", f->length); break;
@@ -268,6 +264,7 @@
 	}ARGEND;
 
 	fmtinstall('M', dirmodefmt);
+	quotefmtinstall();
 
 	if((bout = Bfdopen(1, OWRITE)) == nil)
 		sysfatal("Bfdopen: %r");


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [9front] PATCH walk(1) to (optionally) quote name and path
  2023-08-12  5:46 ` ori
  2023-08-12  6:34   ` unobe
@ 2023-08-12  6:44   ` unobe
  2023-08-12 19:40     ` ori
  1 sibling, 1 reply; 9+ messages in thread
From: unobe @ 2023-08-12  6:44 UTC (permalink / raw)
  To: 9front

Quoth ori@eigenstate.org:
> Quoth Romano <unobe@cpan.org>:
> > I'm working with files on a backup drive that have spaces and
> > quotation marks in their names, using walk(1).  I can pipe to sed
> > s/''''/''''''''''/g and do similarly for other characters requiring
> > escaping for rc(1), but figured patching walk(1) would be better.
> 
> generally no objections, though it's usually relatively easy to
> structure scripts such that there's relatively little that is
> interpreted by rc; is this for interactive use?

I realized I didn't reply to your question, sorry.  No, not directly.
I am trying to consolidate some files spread across different paths to
one directory, so first doing:

	walk -f -eP | grep 'pattern' > /tmp/files_of_interest.txt

Then after reviewing and deleting further, I was going to do something like:

	clone `{cat /tmp/files_of_interest.txt} /path/

Are you thinking I'd do something like set the ifs manually?

	ifs='
'
	clone `{cat /tmp/files_of_interest.txt} /path


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [9front] PATCH walk(1) to (optionally) quote name and path
  2023-08-12  3:25 [9front] PATCH walk(1) to (optionally) quote name and path Romano
  2023-08-12  5:46 ` ori
@ 2023-08-12 13:21 ` hiro
  2023-08-12 20:11   ` unobe
  1 sibling, 1 reply; 9+ messages in thread
From: hiro @ 2023-08-12 13:21 UTC (permalink / raw)
  To: 9front

> 'Desktop/Movies/TV/TV Library.tvlibrary'/'Library Preferences.tvdb'

> 'Desktop/Movies/TV/TV Library.tvlibrary/Library Preferences.tvdb'

what about

Desktop/Movies/TV/'TV Library.tvlibrary'/'Library Preferences.tvdb'

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [9front] PATCH walk(1) to (optionally) quote name and path
  2023-08-12  6:44   ` unobe
@ 2023-08-12 19:40     ` ori
  2023-08-12 20:07       ` unobe
  0 siblings, 1 reply; 9+ messages in thread
From: ori @ 2023-08-12 19:40 UTC (permalink / raw)
  To: 9front

Quoth unobe@cpan.org:
> I realized I didn't reply to your question, sorry.  No, not directly.
> I am trying to consolidate some files spread across different paths to
> one directory, so first doing:
> 
> 	walk -f -eP | grep 'pattern' > /tmp/files_of_interest.txt
> 
> Then after reviewing and deleting further, I was going to do something like:
> 
> 	clone `{cat /tmp/files_of_interest.txt} /path/

in this case, rc doesn't evaluate the quotes, it simply splits
on spaces (or whatever ifs is).

> Are you thinking I'd do something like set the ifs manually?
> 
> 	ifs='
> '
> 	clone `{cat /tmp/files_of_interest.txt} /path
> 

yes; though I tend to use the inline version, since
it has fewer warts around nested shell functions.

	nl='
	'
	clone `$nl{cat /tmp/files} /path


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [9front] PATCH walk(1) to (optionally) quote name and path
  2023-08-12 19:40     ` ori
@ 2023-08-12 20:07       ` unobe
  0 siblings, 0 replies; 9+ messages in thread
From: unobe @ 2023-08-12 20:07 UTC (permalink / raw)
  To: 9front

Quoth ori@eigenstate.org:
> Quoth unobe@cpan.org:
> > I realized I didn't reply to your question, sorry.  No, not directly.
> > I am trying to consolidate some files spread across different paths to
> > one directory, so first doing:
> > 
> > 	walk -f -eP | grep 'pattern' > /tmp/files_of_interest.txt
> > 
> > Then after reviewing and deleting further, I was going to do something like:
> > 
> > 	clone `{cat /tmp/files_of_interest.txt} /path/
> 
> in this case, rc doesn't evaluate the quotes, it simply splits
> on spaces (or whatever ifs is).

Oy, that's right.

> > Are you thinking I'd do something like set the ifs manually?
> > 
> > 	ifs='
> > '
> > 	clone `{cat /tmp/files_of_interest.txt} /path
> > 
> 
> yes; though I tend to use the inline version, since
> it has fewer warts around nested shell functions.
> 
> 	nl='
> 	'
> 	clone `$nl{cat /tmp/files} /path
> 

+1

Anyway, I'll let this simmer for another few days before consolidating
the two commits and merging as one commit (including fixing the
indentation for the end curly bracket on my last commit).


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [9front] PATCH walk(1) to (optionally) quote name and path
  2023-08-12 13:21 ` hiro
@ 2023-08-12 20:11   ` unobe
  2023-08-13 20:10     ` hiro
  0 siblings, 1 reply; 9+ messages in thread
From: unobe @ 2023-08-12 20:11 UTC (permalink / raw)
  To: 9front

Quoth hiro <23hiro@gmail.com>:
> > 'Desktop/Movies/TV/TV Library.tvlibrary'/'Library Preferences.tvdb'
> 
> > 'Desktop/Movies/TV/TV Library.tvlibrary/Library Preferences.tvdb'
> 
> what about
> 
> Desktop/Movies/TV/'TV Library.tvlibrary'/'Library Preferences.tvdb'

The directory is processed at one time, so without a more significant refactoring, I don't see how that would be possible. For reference, ls(1) does this:

cpu% ls Desktop/Movies/TV/'TV Library.tvlibrary'/'Library Preferences.tvdb'
'Desktop/Movies/TV/TV Library.tvlibrary/Library Preferences.tvdb'
cpu%

So the latest commit I did matches that output.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [9front] PATCH walk(1) to (optionally) quote name and path
  2023-08-12 20:11   ` unobe
@ 2023-08-13 20:10     ` hiro
  0 siblings, 0 replies; 9+ messages in thread
From: hiro @ 2023-08-13 20:10 UTC (permalink / raw)
  To: 9front

indeed. and thus preferable from my pov.

On Sat, Aug 12, 2023 at 10:13 PM <unobe@cpan.org> wrote:
>
> Quoth hiro <23hiro@gmail.com>:
> > > 'Desktop/Movies/TV/TV Library.tvlibrary'/'Library Preferences.tvdb'
> >
> > > 'Desktop/Movies/TV/TV Library.tvlibrary/Library Preferences.tvdb'
> >
> > what about
> >
> > Desktop/Movies/TV/'TV Library.tvlibrary'/'Library Preferences.tvdb'
>
> The directory is processed at one time, so without a more significant refactoring, I don't see how that would be possible. For reference, ls(1) does this:
>
> cpu% ls Desktop/Movies/TV/'TV Library.tvlibrary'/'Library Preferences.tvdb'
> 'Desktop/Movies/TV/TV Library.tvlibrary/Library Preferences.tvdb'
> cpu%
>
> So the latest commit I did matches that output.
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-08-13 20:14 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-12  3:25 [9front] PATCH walk(1) to (optionally) quote name and path Romano
2023-08-12  5:46 ` ori
2023-08-12  6:34   ` unobe
2023-08-12  6:44   ` unobe
2023-08-12 19:40     ` ori
2023-08-12 20:07       ` unobe
2023-08-12 13:21 ` hiro
2023-08-12 20:11   ` unobe
2023-08-13 20:10     ` hiro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).