9front - general discussion about 9front
* [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
@ 2021-08-11 21:24 james palmer
  2021-08-12 21:29 ` Steve Simon
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: james palmer @ 2021-08-11 21:24 UTC (permalink / raw)
  To: 9front mailing list

[-- Attachment #1: Type: text/plain, Size: 98 bytes --]

this one comes in handy for building bulk rename scripts.
patch for git/import attached.

- james

[-- Attachment #2.1: Type: text/plain, Size: 327 bytes --]

from postmaster@1ess:
The following attachment had content that we can't
prove to be harmless.  To avoid possible automatic
execution, we changed the content headers.
The original header was:

	Content-Disposition: attachment;filename="awk.patch"
	Content-Type: text/x-patch; name="awk.patch"
	Content-Transfer-Encoding: BASE64

[-- Attachment #2.2: awk.patch.suspect --]
[-- Type: application/octet-stream, Size: 1509 bytes --]

From: james palmer <james@biobuf.link>
Date: Tue, 03 Aug 2021 14:33:24 +0000
Subject: [PATCH] awk: add %q format for quoted strings (see quote(2))

---
diff 223daf6104b5fd73a6214fc3b2fcbd237ffbe666 0a401c48ef82bbb8bdafc93a6107cc3a35ada4ee
--- a/sys/src/cmd/awk/main.c	Sat Jul 31 19:29:39 2021
+++ b/sys/src/cmd/awk/main.c	Tue Aug  3 15:33:24 2021
@@ -60,6 +60,8 @@
 	Binit(&stdout, 1, OWRITE);
 	Binit(&stderr, 2, OWRITE);
 
+	quotefmtinstall();
+
 	cmdname = argv[0];
 	if (argc == 1) {
 		Bprint(&stderr, "usage: %s [-F fieldsep] [-d] [-mf n] [-mr n] [-safe] [-v var=value] [-f programfile | 'program'] [file ...]\n", cmdname);
--- a/sys/src/cmd/awk/run.c	Sat Jul 31 19:29:39 2021
+++ b/sys/src/cmd/awk/run.c	Tue Aug  3 15:33:24 2021
@@ -836,7 +836,7 @@
 int format(char **pbuf, int *pbufsize, char *s, Node *a)	/* printf-like conversions */
 {
 	char *fmt;
-	char *p, *t, *os;
+	char *p, *t, *os, *tmp;
 	Cell *x;
 	int flag, n;
 	int fmtwd; /* format width */
@@ -915,6 +915,9 @@
 		case 'c':
 			flag = 5;
 			break;
+		case 'q':
+			flag = 6;
+			break;
 		default:
 			WARNING("weird printf conversion %s", fmt);
 			flag = 0;
@@ -964,6 +967,14 @@
 					p++;
 				*p = '\0';
 			}
+			break;
+		case 6:
+			t = getsval(x);
+			tmp = t;
+			for(tmp = t; *tmp != '\0'; tmp++) { if(*tmp == '\'') n++; n++; }
+			if(!adjbuf(&buf, &bufsize, 3+n+p-buf, recsize, &p, 0))
+				FATAL("huge string/format (%d chars) in printf %.30s... ran format() out of memory", n, t);
+			sprint(p, fmt, t);
 			break;
 		}
 		if (istemp(x))
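
For reference, a minimal sketch of the size bound that case 6 has to establish before it calls sprint: one byte per character, one extra byte per embedded quote, plus the two surrounding quotes and the terminating NUL. The name quotedsize is made up for this illustration and is not part of awk or libc, and the code is portable C rather than Plan 9 C so it can be tried anywhere.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Upper bound, in bytes, on the rc-style quoted form of s,
 * including the terminating NUL.  (Hypothetical helper, not awk code.)
 */
size_t
quotedsize(const char *s)
{
	size_t n;

	assert(s != NULL);
	n = 0;
	for(; *s != '\0'; s++){
		if(*s == '\'')
			n++;	/* an embedded quote is written twice */
		n++;		/* the character itself */
	}
	return n + 3;		/* opening quote, closing quote, NUL */
}
```

For example, quotedsize("it's") is 8: four characters, one doubled quote, two surrounding quotes and the terminator.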


* Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
  2021-08-11 21:24 [9front] [PATCH] awk: add %q format for quoted strings (see quote(2)) james palmer
@ 2021-08-12 21:29 ` Steve Simon
  2021-08-18 13:23   ` james palmer
  2021-08-13  6:43 ` unobe
  2021-08-23 16:55 ` ori
  2 siblings, 1 reply; 8+ messages in thread
From: Steve Simon @ 2021-08-12 21:29 UTC (permalink / raw)
  To: 9front


i see why you would want this, but the argument against it was always to keep the plan9 version of awk compatible with Brian’s one-true-awk so awk code is truly portable.

how much hassle is portability worth? good question.

-Steve



* Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
  2021-08-11 21:24 [9front] [PATCH] awk: add %q format for quoted strings (see quote(2)) james palmer
  2021-08-12 21:29 ` Steve Simon
@ 2021-08-13  6:43 ` unobe
  2021-08-18 13:10   ` james palmer
  2021-08-23 16:55 ` ori
  2 siblings, 1 reply; 8+ messages in thread
From: unobe @ 2021-08-13  6:43 UTC (permalink / raw)
  To: 9front

Quoth james palmer <james@biobuf.link>:
> this one comes in handy for building bulk rename scripts.
> patch for git/import attached.

Neat.  If it is applied, could the man page be updated to reflect the
change, too?



* Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
  2021-08-13  6:43 ` unobe
@ 2021-08-18 13:10   ` james palmer
  0 siblings, 0 replies; 8+ messages in thread
From: james palmer @ 2021-08-18 13:10 UTC (permalink / raw)
  To: 9front mailing list

Quoth unobe@cpan.org:
> Quoth james palmer <james@biobuf.link>:
> > this one comes in handy for building bulk rename scripts.
> > patch for git/import attached.
> 
> Neat.  If it is applied, could the man page be updated to reflect the
> change, too?

which man page? i don't see a list of format codes in awk(1). it refers to fprintf(2) which hasn't been changed. (surely it should be print(2) ?)

should i just add a sentence saying that the quote format has been installed?

- james


* Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
  2021-08-12 21:29 ` Steve Simon
@ 2021-08-18 13:23   ` james palmer
  2021-08-24  9:45     ` [9front] " Pavel Renev
  0 siblings, 1 reply; 8+ messages in thread
From: james palmer @ 2021-08-18 13:23 UTC (permalink / raw)
  To: 9front mailing list

Quoth steve@quintile.net:
> 
> i see why you would want this but the argument against it was always to 
> keep the plan9 version of awk compatible with Brian’s one-true-awk so 
> awk code is truly portable.
> 
> how much hassle is portability worth? good question.
> 
> -Steve

yeah i suppose that's a sensible argument to make.
i don't think it is that big of a deal. perhaps it should be documented as non-standard in the man page?

- james


* Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
  2021-08-11 21:24 [9front] [PATCH] awk: add %q format for quoted strings (see quote(2)) james palmer
  2021-08-12 21:29 ` Steve Simon
  2021-08-13  6:43 ` unobe
@ 2021-08-23 16:55 ` ori
  2021-08-24  4:31   ` ori
  2 siblings, 1 reply; 8+ messages in thread
From: ori @ 2021-08-23 16:55 UTC (permalink / raw)
  To: 9front

Quoth james palmer <james@biobuf.link>:
> this one comes in handy for building bulk rename scripts.
> patch for git/import attached.
> 
> - james
> 

I think it makes sense to add this, though I'm
not sure how much we want to diverge from upstream.

Ape would get this by accident, but it's a superset
of what posix needs.

going the other way would also be neat:

	ls -l | awk -Q '{print $NF}'




* Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
  2021-08-23 16:55 ` ori
@ 2021-08-24  4:31   ` ori
  0 siblings, 0 replies; 8+ messages in thread
From: ori @ 2021-08-24  4:31 UTC (permalink / raw)
  To: 9front

Quoth ori@eigenstate.org:
> Quoth james palmer <james@biobuf.link>:
> > this one comes in handy for building bulk rename scripts.
> > patch for git/import attached.
> > 
> > - james
> > 
> 
> I think it makes sense to add this, though I'm
> not sure how much we want to diverge from upstream.
> 
> Ape would get this by accident, but it's a superset
> of what posix needs.
> 
> going the other way would also be neat:
> 
> 	ls -l | awk -Q '{print $NF}'
> 

Here's a proof of concept -- though, I don't like how it
totally ignores FS.

diff fb2e0a1987b33083e3e08fa0659f99534c56d6aa uncommitted
--- a/sys/src/cmd/awk/awk.h
+++ b/sys/src/cmd/awk/awk.h
@@ -49,6 +49,7 @@
 extern int	donefld;	/* 1 if record broken into fields */
 extern int	donerec;	/* 1 if record is valid (no fld has changed */
 extern char	inputFS[];	/* FS at time of input, for field splitting */
+extern int	quotefld;	/* if we use quotes instead of FS for splitting */
 
 extern int	dbg;
 
--- a/sys/src/cmd/awk/lib.c
+++ b/sys/src/cmd/awk/lib.c
@@ -38,6 +38,7 @@
 
 Cell	**fldtab;	/* pointers to Cells */
 char	inputFS[100] = " ";
+int	quotefld;
 
 #define	MAXFLD	200
 int	nfields	= MAXFLD;	/* last allocated slot for $i */
@@ -242,7 +243,49 @@
 	   dprint( ("command line set %s to |%s|\n", s, p) );
 }
 
+char *unquoted(char *t, char **et)
+{
+	int quoting;
+	char *h, *s;
 
+	quoting = 0;
+	/* unquoting only shrinks s */
+	while (*t == ' ' || *t == '\t' || *t == '\n')
+		t++;
+	s = strdup(t);
+	h = s;
+	if (s == nil)
+		FATAL("out of space in unquoted on %s", t);
+	while(*t!='\0'){
+		if(!quoting && (*t == ' ' || *t == '\t' || *t == '\n'))
+			break;
+		if(*t != '\''){
+			*s++ = *t++;
+			continue;
+		}
+		/* *t is a quote */
+		if(!quoting){
+			quoting = 1;
+			t++;
+			continue;
+		}
+		/* quoting and we're on a quote */
+		if(t[1] != '\''){
+			/* end of quoted section; absorb closing quote */
+			t++;
+			quoting = 0;
+			continue;
+		}
+		/* doubled quote; fold two quotes into one */
+		t++;
+		*s++ = *t++;
+
+	}
+	*s = 0;
+	*et = t;
+	return h;
+}	
+
 void fldbld(void)	/* create fields from current record */
 {
 	/* this relies on having fields[] the same length as $0 */
@@ -265,7 +308,21 @@
 	}
 	fr = fields;
 	i = 0;	/* number of fields accumulated here */
-	if (strlen(inputFS) > 1) {	/* it's a regular expression */
+	if (quotefld){				/* it's quoted text */
+		for (i = 0; ; ) {
+			while (*r == ' ' || *r == '\t' || *r == '\n')
+				r++;
+			if (*r == 0)
+				break;
+			i++;
+			if (i > nfields)
+				growfldtab(i);
+			if (freeable(fldtab[i]))
+				xfree(fldtab[i]->sval);
+			fldtab[i]->sval = unquoted(r, &r);
+			fldtab[i]->tval = FLD | STR;
+		}
+	}else if (strlen(inputFS) > 1) {	/* it's a regular expression */
 		i = refldbld(r, inputFS);
 	} else if (*inputFS == ' ') {	/* default whitespace */
 		for (i = 0; ; ) {
--- a/sys/src/cmd/awk/main.c
+++ b/sys/src/cmd/awk/main.c
@@ -51,7 +51,7 @@
 
 void main(int argc, char *argv[])
 {
-	char *fs = nil, *marg;
+	char *fs = nil, qs = 0, *marg;
 	int temp;
 
 	setfcr(getfcr() & ~FPINVAL);
@@ -103,6 +103,9 @@
 			if (fs == nil || *fs == '\0')
 				WARNING("field separator FS is empty");
 			break;
+		case 'Q':
+			qs = 1;
+			break;
 		case 'v':	/* -v a=1 to be done NOW.  one -v for each */
 			if (argv[1][2] == '\0' && --argc > 1 && isclvar((++argv)[1]))
 				setclvar(argv[1]);
@@ -158,6 +161,8 @@
 	   dprint( ("argc=%d, argv[0]=%s\n", argc, argv[0]) );
 	arginit(argc, argv);
 	yyparse();
+	if (qs)
+		quotefld = 1;
 	if (fs)
 		*FS = qstring(fs, '\0');
 	   dprint( ("exitstatus=%s\n", exitstatus) );
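
The unquoting rule that -Q relies on can be tried outside awk. The following is a standalone sketch in portable C, assuming the same rule as the patch above: fields are separated by blanks, tabs or newlines, single quotes group text, and a doubled quote inside a quoted section stands for one literal quote. nextfield and isblankc are hypothetical names, not awk functions, and malloc stands in for awk's own allocation and error handling.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* The three separators the field splitter skips. */
static int
isblankc(int c)
{
	return c == ' ' || c == '\t' || c == '\n';
}

/*
 * Copy the next rc-quoted field from t into a fresh buffer and set
 * *et to the first byte after it.  Returns NULL when out of memory.
 */
char*
nextfield(const char *t, const char **et)
{
	char *h, *s;
	int quoting;

	assert(t != NULL && et != NULL);
	while(isblankc(*t))
		t++;
	h = s = malloc(strlen(t) + 1);	/* unquoting never grows the text */
	if(h == NULL)
		return NULL;
	quoting = 0;
	while(*t != '\0'){
		if(!quoting && isblankc(*t))
			break;			/* unquoted separator ends the field */
		if(*t != '\''){
			*s++ = *t++;		/* ordinary character */
		}else if(!quoting){
			quoting = 1;		/* opening quote */
			t++;
		}else if(t[1] == '\''){
			*s++ = '\'';		/* '' inside quotes is one literal ' */
			t += 2;
		}else{
			quoting = 0;		/* closing quote */
			t++;
		}
	}
	*s = '\0';
	*et = t;
	return h;
}
```

Called repeatedly on the line 'a b' 'it''s' plain, this yields the three fields a b, it's and plain, which is the splitting that $1, $2 and $3 would see under -Q.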



* [9front] Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
  2021-08-18 13:23   ` james palmer
@ 2021-08-24  9:45     ` Pavel Renev
  0 siblings, 0 replies; 8+ messages in thread
From: Pavel Renev @ 2021-08-24  9:45 UTC (permalink / raw)
  To: 9front

Would a separate program be sufficient?
A simple main wrapper around quote()

It'd preserve awk's artwork intact, and having separate programs is what the Unix way is all about, or so I heard.
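
A rough sketch of what such a standalone program could look like, in portable C. On Plan 9 the body would presumably just call the quote(2) routines; here the quoting is written out by hand, under the assumption that a string needs quotes only when it is empty or contains a blank, tab, newline or single quote. fquote and quoteargs are hypothetical names, and main is left out so the functions can be tested in isolation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Nonzero if s is empty or contains a blank, tab, newline or quote. */
static int
needsquote(const char *s)
{
	return *s == '\0' || s[strcspn(s, " \t\n'")] != '\0';
}

/* Write s to out, rc-quoted when necessary. */
void
fquote(FILE *out, const char *s)
{
	assert(out != NULL && s != NULL);
	if(!needsquote(s)){
		fputs(s, out);
		return;
	}
	fputc('\'', out);
	for(; *s != '\0'; s++){
		if(*s == '\'')
			fputc('\'', out);	/* double embedded quotes */
		fputc(*s, out);
	}
	fputc('\'', out);
}

/* Body of the hypothetical program: each argument quoted, space separated. */
void
quoteargs(FILE *out, int argc, char *argv[])
{
	int i;

	for(i = 1; i < argc; i++){
		if(i > 1)
			fputc(' ', out);
		fquote(out, argv[i]);
	}
	fputc('\n', out);
}
```

A main for it would only need to call quoteargs(stdout, argc, argv).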

