* [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
@ 2021-08-11 21:24 james palmer
2021-08-12 21:29 ` Steve Simon
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: james palmer @ 2021-08-11 21:24 UTC (permalink / raw)
To: 9front mailing list
[-- Attachment #1: Type: text/plain, Size: 98 bytes --]
this one comes in handy for building bulk rename scripts.
patch for git/import attached.
- james
[-- Attachment #2.2: awk.patch.suspect --]
[-- Type: application/octet-stream, Size: 1509 bytes --]
From: james palmer <james@biobuf.link>
Date: Tue, 03 Aug 2021 14:33:24 +0000
Subject: [PATCH] awk: add %q format for quoted strings (see quote(2))
---
diff 223daf6104b5fd73a6214fc3b2fcbd237ffbe666 0a401c48ef82bbb8bdafc93a6107cc3a35ada4ee
--- a/sys/src/cmd/awk/main.c Sat Jul 31 19:29:39 2021
+++ b/sys/src/cmd/awk/main.c Tue Aug 3 15:33:24 2021
@@ -60,6 +60,8 @@
Binit(&stdout, 1, OWRITE);
Binit(&stderr, 2, OWRITE);
+ quotefmtinstall();
+
cmdname = argv[0];
if (argc == 1) {
Bprint(&stderr, "usage: %s [-F fieldsep] [-d] [-mf n] [-mr n] [-safe] [-v var=value] [-f programfile | 'program'] [file ...]\n", cmdname);
--- a/sys/src/cmd/awk/run.c Sat Jul 31 19:29:39 2021
+++ b/sys/src/cmd/awk/run.c Tue Aug 3 15:33:24 2021
@@ -836,7 +836,7 @@
int format(char **pbuf, int *pbufsize, char *s, Node *a) /* printf-like conversions */
{
char *fmt;
- char *p, *t, *os;
+ char *p, *t, *os, *tmp;
Cell *x;
int flag, n;
int fmtwd; /* format width */
@@ -915,6 +915,9 @@
case 'c':
flag = 5;
break;
+ case 'q':
+ flag = 6;
+ break;
default:
WARNING("weird printf conversion %s", fmt);
flag = 0;
@@ -964,6 +967,14 @@
p++;
*p = '\0';
}
+ break;
+ case 6:
+ t = getsval(x);
+ tmp = t;
+ for(; *tmp != '\0'; tmp++) { if(*tmp == '\'') { n++; } n++; }
+ if(!adjbuf(&buf, &bufsize, 3+n+p-buf, recsize, &p, 0))
+ FATAL("huge string/format (%d chars) in printf %.30s... ran format() out of memory", n, t);
+ sprint(p, fmt, t);
break;
}
if (istemp(x))
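For anyone checking the adjbuf() size in case 6 of the patch, here is a minimal sketch, in portable C rather than Plan 9 C, of the worst-case space an rc-style quoted string needs. The name quotedsize is hypothetical and is not part of awk or libc. It assumes the string is always wrapped in quotes and it ignores any field width given in the format.

```c
#include <stddef.h>

/*
 * Bytes needed to hold s in rc-style quotes, including the NUL:
 * two surrounding quotes, every byte of s, one extra byte for
 * each embedded quote (a ' is written as ''), and the terminator.
 * quotedsize is an illustrative name, not an existing function.
 */
size_t
quotedsize(const char *s)
{
	size_t n = 2 + 1;	/* opening quote, closing quote, NUL */

	for(; *s != '\0'; s++){
		if(*s == '\'')
			n++;	/* the extra quote that doubles it */
		n++;		/* the character itself */
	}
	return n;
}
```

The loop looks at every byte of the string, including the first one, and stops at the terminator without counting it.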
* Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
2021-08-11 21:24 [9front] [PATCH] awk: add %q format for quoted strings (see quote(2)) james palmer
@ 2021-08-12 21:29 ` Steve Simon
2021-08-18 13:23 ` james palmer
2021-08-13 6:43 ` unobe
2021-08-23 16:55 ` ori
2 siblings, 1 reply; 8+ messages in thread
From: Steve Simon @ 2021-08-12 21:29 UTC (permalink / raw)
To: 9front
i see why you would want this but the argument against it was always to keep the plan9 version of awk compatible with Brian’s one-true-awk so awk code is truly portable.
how much hassle is portability worth? good question.
-Steve
* Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
2021-08-11 21:24 [9front] [PATCH] awk: add %q format for quoted strings (see quote(2)) james palmer
2021-08-12 21:29 ` Steve Simon
@ 2021-08-13 6:43 ` unobe
2021-08-18 13:10 ` james palmer
2021-08-23 16:55 ` ori
2 siblings, 1 reply; 8+ messages in thread
From: unobe @ 2021-08-13 6:43 UTC (permalink / raw)
To: 9front
Quoth james palmer <james@biobuf.link>:
> this one comes in handy for building bulk rename scripts.
> patch for git/import attached.
Neat. If it is applied, could the man page be updated to reflect the
change, too?
* Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
2021-08-13 6:43 ` unobe
@ 2021-08-18 13:10 ` james palmer
0 siblings, 0 replies; 8+ messages in thread
From: james palmer @ 2021-08-18 13:10 UTC (permalink / raw)
To: 9front mailing list
Quoth unobe@cpan.org:
> Quoth james palmer <james@biobuf.link>:
> > this one comes in handy for building bulk rename scripts.
> > patch for git/import attached.
>
> Neat. If it is applied, could the man page be updated to reflect the
> change, too?
which man page? i don't see a list of format codes in awk(1). it refers to fprintf(2) which hasn't been changed. (surely it should be print(2) ?)
should i just add a sentence saying that the quote format has been installed?
- james
* Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
2021-08-12 21:29 ` Steve Simon
@ 2021-08-18 13:23 ` james palmer
2021-08-24 9:45 ` [9front] " Pavel Renev
0 siblings, 1 reply; 8+ messages in thread
From: james palmer @ 2021-08-18 13:23 UTC (permalink / raw)
To: 9front mailing list
Quoth steve@quintile.net:
>
> i see why you would want this but the argument against it was always to
> keep the plan9 version of awk compatible with Brian’s one-true-awk so
> awk code is truly portable.
>
> how much hassle is portability worth? good question.
>
> -Steve
yeah i suppose that's a sensible argument to make.
i don't think it is that big of a deal. perhaps it should be documented as non-standard in the man page?
- james
* Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
2021-08-11 21:24 [9front] [PATCH] awk: add %q format for quoted strings (see quote(2)) james palmer
2021-08-12 21:29 ` Steve Simon
2021-08-13 6:43 ` unobe
@ 2021-08-23 16:55 ` ori
2021-08-24 4:31 ` ori
2 siblings, 1 reply; 8+ messages in thread
From: ori @ 2021-08-23 16:55 UTC (permalink / raw)
To: 9front
Quoth james palmer <james@biobuf.link>:
> this one comes in handy for building bulk rename scripts.
> patch for git/import attached.
>
> - james
>
I think it makes sense to add this, though I'm
not sure how much we want to diverge from upstream.
Ape would get this by accident, but it's a superset
of what posix needs.
going the other way would also be neat:
ls -l | awk -Q {print $NF}
* Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
2021-08-23 16:55 ` ori
@ 2021-08-24 4:31 ` ori
0 siblings, 0 replies; 8+ messages in thread
From: ori @ 2021-08-24 4:31 UTC (permalink / raw)
To: 9front
Quoth ori@eigenstate.org:
> Quoth james palmer <james@biobuf.link>:
> > this one comes in handy for building bulk rename scripts.
> > patch for git/import attached.
> >
> > - james
> >
>
> I think it makes sense to add this, though I'm
> not sure how much we want to diverge from upstream.
>
> Ape would get this by accident, but it's a superset
> of what posix needs.
>
> going the other way would also be neat:
>
> ls -l | awk -Q {print $NF}
>
Here's a proof of concept -- though, I don't like how it
totally ignores FS.
diff fb2e0a1987b33083e3e08fa0659f99534c56d6aa uncommitted
--- a/sys/src/cmd/awk/awk.h
+++ b/sys/src/cmd/awk/awk.h
@@ -49,6 +49,7 @@
extern int donefld; /* 1 if record broken into fields */
extern int donerec; /* 1 if record is valid (no fld has changed */
extern char inputFS[]; /* FS at time of input, for field splitting */
+extern int quotefld; /* if we use quotes instead of FS for splitting */
extern int dbg;
--- a/sys/src/cmd/awk/lib.c
+++ b/sys/src/cmd/awk/lib.c
@@ -38,6 +38,7 @@
Cell **fldtab; /* pointers to Cells */
char inputFS[100] = " ";
+int quotefld;
#define MAXFLD 200
int nfields = MAXFLD; /* last allocated slot for $i */
@@ -242,7 +243,49 @@
dprint( ("command line set %s to |%s|\n", s, p) );
}
+char *unquoted(char *t, char **et)
+{
+ int quoting;
+ char *h, *s;
+ quoting = 0;
+ /* unquoting only shrinks s */
+ while (*t == ' ' || *t == '\t' || *t == '\n')
+ t++;
+ s = strdup(t);
+ h = s;
+ if (s == nil)
+ FATAL("out of space in unquoted on %s", t);
+ while(*t!='\0'){
+ if(!quoting && (*t == ' ' || *t == '\t' || *t == '\n'))
+ break;
+ if(*t != '\''){
+ *s++ = *t++;
+ continue;
+ }
+ /* *t is a quote */
+ if(!quoting){
+ quoting = 1;
+ t++;
+ continue;
+ }
+ /* quoting and we're on a quote */
+ if(t[1] != '\''){
+ /* end of quoted section; absorb closing quote */
+ t++;
+ quoting = 0;
+ continue;
+ }
+ /* doubled quote; fold two quotes into one */
+ t++;
+ *s++ = *t++;
+
+ }
+ *s = 0;
+ *et = t;
+ return h;
+}
+
void fldbld(void) /* create fields from current record */
{
/* this relies on having fields[] the same length as $0 */
@@ -265,7 +308,21 @@
}
fr = fields;
i = 0; /* number of fields accumulated here */
- if (strlen(inputFS) > 1) { /* it's a regular expression */
+ if (quotefld){ /* it's quoted text */
+ for (i = 0; ; ) {
+ while (*r == ' ' || *r == '\t' || *r == '\n')
+ r++;
+ if (*r == 0)
+ break;
+ i++;
+ if (i > nfields)
+ growfldtab(i);
+ if (freeable(fldtab[i]))
+ xfree(fldtab[i]->sval);
+ fldtab[i]->sval = unquoted(r, &r);
+ fldtab[i]->tval = FLD | STR;
+ }
+ }else if (strlen(inputFS) > 1) { /* it's a regular expression */
i = refldbld(r, inputFS);
} else if (*inputFS == ' ') { /* default whitespace */
for (i = 0; ; ) {
--- a/sys/src/cmd/awk/main.c
+++ b/sys/src/cmd/awk/main.c
@@ -51,7 +51,7 @@
void main(int argc, char *argv[])
{
- char *fs = nil, *marg;
+ char *fs = nil, qs = 0, *marg;
int temp;
setfcr(getfcr() & ~FPINVAL);
@@ -103,6 +103,9 @@
if (fs == nil || *fs == '\0')
WARNING("field separator FS is empty");
break;
+ case 'Q':
+ qs = 1;
+ break;
case 'v': /* -v a=1 to be done NOW. one -v for each */
if (argv[1][2] == '\0' && --argc > 1 && isclvar((++argv)[1]))
setclvar(argv[1]);
@@ -158,6 +161,8 @@
dprint( ("argc=%d, argv[0]=%s\n", argc, argv[0]) );
arginit(argc, argv);
yyparse();
+ if (qs)
+ quotefld = 1;
if (fs)
*FS = qstring(fs, '\0');
dprint( ("exitstatus=%s\n", exitstatus) );
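The field splitting in the proof of concept can be tried outside awk. Below is a rough sketch, in portable C rather than Plan 9 C, of the same unquoting rules: whitespace separates fields outside quotes, a single quote opens or closes a quoted run, and a doubled quote inside a quoted run stands for one literal quote. The names unquotefield and isfieldspace are made up for this example, and reporting allocation failure by returning NULL is an assumption; the patch calls FATAL instead.

```c
#include <stdlib.h>
#include <string.h>

static int
isfieldspace(char c)
{
	return c == ' ' || c == '\t' || c == '\n';
}

/*
 * Copy one rc-quoted field starting at t into freshly allocated
 * memory and set *end to the first unconsumed byte.
 * Returns NULL if the allocation fails.
 */
char*
unquotefield(const char *t, const char **end)
{
	int quoting = 0;
	char *h, *s;

	while(isfieldspace(*t))
		t++;
	h = s = malloc(strlen(t) + 1);	/* unquoting never grows the text */
	if(h == NULL)
		return NULL;
	while(*t != '\0'){
		if(!quoting && isfieldspace(*t))
			break;			/* unquoted whitespace ends the field */
		if(*t != '\''){
			*s++ = *t++;		/* ordinary byte */
		}else if(!quoting){
			quoting = 1;		/* opening quote */
			t++;
		}else if(t[1] == '\''){
			*s++ = '\'';		/* '' inside quotes is one literal quote */
			t += 2;
		}else{
			quoting = 0;		/* closing quote */
			t++;
		}
	}
	*s = '\0';
	*end = t;
	return h;
}
```

Calling it repeatedly, feeding each returned end pointer back in as the next start, walks a record field by field much as the fldbld() loop in the patch does.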
* [9front] Re: [9front] [PATCH] awk: add %q format for quoted strings (see quote(2))
2021-08-18 13:23 ` james palmer
@ 2021-08-24 9:45 ` Pavel Renev
0 siblings, 0 replies; 8+ messages in thread
From: Pavel Renev @ 2021-08-24 9:45 UTC (permalink / raw)
To: 9front
Would a separate program be sufficient?
A simple main wrapper around quote()
It'd keep awk's artwork intact, and having separate programs is what the Unix Way is all about, or so I heard.
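To make the separate-program idea concrete, here is a minimal sketch of the quoting step such a tool would need, written in portable C so it can be tried anywhere. On Plan 9 the wrapper would presumably call the quote(2) routines directly instead. The names needsquote and quoteword are hypothetical. The rule for when to quote (empty string, or any quote, space or control byte) is an assumption for illustration and may not match quote(2) exactly.

```c
#include <stddef.h>
#include <string.h>

/* Assumed rule: quote empty strings and anything holding ', space or control bytes. */
static int
needsquote(const char *s)
{
	if(*s == '\0')
		return 1;
	for(; *s != '\0'; s++)
		if(*s == '\'' || (unsigned char)*s <= ' ')
			return 1;
	return 0;
}

/*
 * Write s into out, rc-quoted if needsquote() says so.
 * Returns 0 on success, -1 if out (outsz bytes) is too small.
 */
int
quoteword(const char *s, char *out, size_t outsz)
{
	size_t need = strlen(s) + 1;	/* text plus NUL */
	const char *p;
	int q = needsquote(s);

	if(q){
		need += 2;		/* surrounding quotes */
		for(p = s; *p != '\0'; p++)
			if(*p == '\'')
				need++;	/* each ' becomes '' */
	}
	if(need > outsz)
		return -1;
	if(q)
		*out++ = '\'';
	for(p = s; *p != '\0'; p++){
		if(q && *p == '\'')
			*out++ = '\'';
		*out++ = *p;
	}
	if(q)
		*out++ = '\'';
	*out = '\0';
	return 0;
}
```

A main wrapper would then loop over its arguments or input lines, call quoteword on each, and print the result, leaving awk itself untouched.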