From: Daniel Shahaf <d.s@daniel.shahaf.name>
To: zsh-workers@zsh.org
Subject: Re: [BUG] zformat -f has no multibyte support
Date: Tue, 24 Dec 2019 20:23:27 +0000
Message-ID: <20191224202327.or5yuftjieyq5z2o@tarpaulin.shahaf.local2>
In-Reply-To: <20191223222436.rtsmhs4xuug2jfum@localhost>
[-- Attachment #1: Type: text/plain, Size: 510 bytes --]
zsugabubus wrote on Mon, Dec 23, 2019 at 23:24:36 +0100:
> $ setopt multibyte
> $ zformat -f X '%-3s' 's:ő'; echo $X
> "ő"
> $ zformat -f X '%.1s' 's:ő'; echo $X
> (garbage)
> $ zformat -f X '%-3s' 's:o'; echo $X
> " o"
The printf builtin handles multibyte strings correctly, so this should be
fairly easy to fix.
Actually, I don't suppose we could just call into the printf code directly,
could we? It _works_ (see the attached proof-of-concept patch), but it's not
elegant.
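To illustrate why the '%.1s' case above prints garbage: "ő" is two bytes in UTF-8 (0xC5 0x91), so truncating by byte count cuts the character in half. Here is a minimal sketch of character-aware truncation, assuming valid UTF-8 input. The helper name utf8_prefix_bytes is hypothetical and is not taken from zsh; real code would presumably use the locale's multibyte conversion routines rather than decode UTF-8 by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of zsh: return how many bytes the first
 * max_chars characters of the UTF-8 string s occupy.
 * Assumption: s is valid UTF-8, so continuation bytes match 10xxxxxx. */
static size_t utf8_prefix_bytes(const char *s, size_t max_chars)
{
    size_t bytes = 0, chars = 0;

    assert(s != NULL);
    while (s[bytes] != '\0') {
        /* Any byte that is not a continuation byte starts a new character. */
        if (((unsigned char)s[bytes] & 0xC0) != 0x80) {
            if (chars == max_chars)
                break;          /* stop before the next character begins */
            chars++;
        }
        bytes++;
    }
    return bytes;
}
```

For "ő", utf8_prefix_bytes(s, 1) yields 2, whereas the current strlen()-based code with max = 1 copies only the first byte.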
Aside: It is customary to use a valid From address.
[-- Attachment #2: zformat-printf-PoC.patch --]
[-- Type: text/x-diff, Size: 3519 bytes --]
diff --git a/Src/Modules/zutil.c b/Src/Modules/zutil.c
index 7d9bf05d6..bb00c8a24 100644
--- a/Src/Modules/zutil.c
+++ b/Src/Modules/zutil.c
@@ -775,7 +775,7 @@ static char *zformat_substring(char* instr, char **specs, char **outp,
for (s = instr; *s && *s != endchar; s++) {
if (*s == '%') {
- int right, min = -1, max = -1, outl, testit;
+ int right, min = -1, max = -1, testit;
char *spec, *start = s;
if ((right = (*++s == '-')))
@@ -835,11 +835,49 @@ static char *zformat_substring(char* instr, char **specs, char **outp,
} else if (skip) {
continue;
} else if ((spec = specs[STOUC(*s)])) {
- int len;
+ int outl;
+ Param pm;
+ LinkList args = newlinklist();
+
+ /* '%', '-', min, '.', max, 's', NUL. */
+ char fmt[1 + 1 + DIGBUFSIZE-1 + 1 + DIGBUFSIZE-1 + 1 + 1];
+
+ /* zformat uses minus to mean "pad on the left".
+ * printf uses minus to mean "pad on the right". */
+ const char *optional_minus = (right ? "" : "-");
+
+ startparamscope();
+ pm = createparam("REPLY", PM_LOCAL|PM_SCALAR);
+ if (pm)
+ pm->level = locallevel; /* because createparam() doesn't */
+
+ addlinknode(args, "printf");
+ addlinknode(args, "-v");
+ addlinknode(args, "REPLY");
+
+ if (min >= 0 && max >= 0) {
+ snprintf(fmt, sizeof(fmt), "%%%s%d.%ds", optional_minus, min, max);
+ } else if (min >= 0) {
+ snprintf(fmt, sizeof(fmt), "%%%s%ds", optional_minus, min);
+ } else if (max >= 0) {
+ snprintf(fmt, sizeof(fmt), "%%.%ds", max);
+ } else {
+ snprintf(fmt, sizeof(fmt), "%%%ss", optional_minus);
+ }
+ addlinknode(args, fmt);
- if ((len = strlen(spec)) > max && max >= 0)
- len = max;
- outl = (min >= 0 ? (min > len ? min : len) : len);
+ addlinknode(args, spec);
+
+ {
+ Builtin builtin_printf =
+ (Builtin)builtintab->getnode(builtintab, "printf");
+ local_list0(assigns);
+
+ init_list0(assigns);
+ execbuiltin(args, &assigns, builtin_printf);
+ }
+
+ outl = strlen(getsparam("REPLY"));
if (*ousedp + outl >= *olenp) {
int nlen = *olenp + outl + 128;
@@ -849,24 +887,11 @@ static char *zformat_substring(char* instr, char **specs, char **outp,
*olenp = nlen;
*outp = tmp;
}
- if (len >= outl) {
- memcpy(*outp + *ousedp, spec, outl);
+ {
+ memcpy(*outp + *ousedp, getsparam("REPLY"), outl);
*ousedp += outl;
- } else {
- int diff = outl - len;
-
- if (right) {
- while (diff--)
- (*outp)[(*ousedp)++] = ' ';
- memcpy(*outp + *ousedp, spec, len);
- *ousedp += len;
- } else {
- memcpy(*outp + *ousedp, spec, len);
- *ousedp += len;
- while (diff--)
- (*outp)[(*ousedp)++] = ' ';
- }
}
+ endparamscope();
} else {
int len = s - start + 1;
diff --git a/Test/D07multibyte.ztst b/Test/D07multibyte.ztst
index e20315340..3e7ec061f 100644
--- a/Test/D07multibyte.ztst
+++ b/Test/D07multibyte.ztst
@@ -585,3 +585,12 @@
>OK
F:A failure here may indicate the system regex library does not
F:support character sets outside the portable 7-bit range.
+
+ if zmodload zsh/zutil 2>/dev/null; then
+ zformat -f REPLY '%.3s' 's:ヌxéfoo'
+ echo $REPLY
+ else
+ ZTST_skip="can't load the zsh/zutil module for testing"
+ fi
+0:zformat multibyte test
+>ヌxé
diff --git a/Test/V13zformat.ztst b/Test/V13zformat.ztst
index 982866e13..91901cbf4 100644
--- a/Test/V13zformat.ztst
+++ b/Test/V13zformat.ztst
@@ -65,3 +65,5 @@
>ipsum.bar
>bazbaz
>\esc:ape
+
+# Multibyte tests in D07multibyte.ztst
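
The flag translation in the zutil.c hunk above can also be read in isolation. The following is a standalone sketch only: the function name zformat_to_printf_fmt and its signature are hypothetical, and it omits the no-op '-' that the patch emits when no width is given.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper mirroring the patch's translation of a zformat
 * width spec into a printf format string.
 * right: nonzero if the zformat spec had a leading '-' (pad on the left).
 * min, max: field widths, or -1 when absent.
 * Returns what snprintf() returns. */
static int zformat_to_printf_fmt(char *buf, size_t size,
                                 int right, int min, int max)
{
    /* zformat's '-' pads on the left, which is printf's default;
     * without it zformat pads on the right, which is printf's '-'. */
    const char *minus = right ? "" : "-";

    assert(buf != NULL && size > 0);
    if (min >= 0 && max >= 0)
        return snprintf(buf, size, "%%%s%d.%ds", minus, min, max);
    if (min >= 0)
        return snprintf(buf, size, "%%%s%ds", minus, min);
    if (max >= 0)
        return snprintf(buf, size, "%%.%ds", max);
    /* No width at all: padding direction is irrelevant. */
    return snprintf(buf, size, "%%s");
}
```

With this mapping, zformat's '%-3s' becomes printf's "%3s" and zformat's '%3.5s' becomes "%-3.5s".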