* [PATCH 3/3] Constify two local variables. @ 2015-11-30 3:21 Daniel Shahaf 2015-11-30 9:38 ` Peter Stephenson 0 siblings, 1 reply; 11+ messages in thread From: Daniel Shahaf @ 2015-11-30 3:21 UTC (permalink / raw) To: zsh-workers --- I assume that v->pm->gsu.a->getfn() has no access to v->start and v->end, hence changing the order is safe. Are there any compilers that choke on 'char **const' (const pointer to pointer to char)? Cheers, Daniel Src/params.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/Src/params.c b/Src/params.c index d8bf83d..142697f 100644 --- a/Src/params.c +++ b/Src/params.c @@ -2552,18 +2552,20 @@ setarrvalue(Value v, char **val) v->pm->node.nam); return; } else { - char **old, **new, **p, **q, **r; - int pre_assignment_length; + char **const old = v->pm->gsu.a->getfn(v->pm); + char **new; + char **p, **q, **r; /* index variables */ + const int pre_assignment_length = arrlen(old); int post_assignment_length; int i; + q = old; + if ((v->flags & VALFLAG_INV) && unset(KSHARRAYS)) { if (v->start > 0) v->start--; v->end--; } - q = old = v->pm->gsu.a->getfn(v->pm); - pre_assignment_length = arrlen(old); if (v->start < 0) { v->start += pre_assignment_length; if (v->start < 0) -- 2.1.4 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] Constify two local variables. 2015-11-30 3:21 [PATCH 3/3] Constify two local variables Daniel Shahaf @ 2015-11-30 9:38 ` Peter Stephenson 2015-11-30 15:59 ` Daniel Shahaf 0 siblings, 1 reply; 11+ messages in thread From: Peter Stephenson @ 2015-11-30 9:38 UTC (permalink / raw) To: zsh-workers On Mon, 30 Nov 2015 03:21:53 +0000 Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > I assume that v->pm->gsu.a->getfn() has no access to v->start and > v->end, hence changing the order is safe. Yes, the getfn() is very low level, dealing only with storage for that parameter and having no knowledge at all of any expansion/substitution happening above it. > Are there any compilers that choke on 'char **const' (const pointer to pointer > to char)? I suspect we ditched those some years ago when we started making more use of const, but there might be a few oddballs lying around. By the way... > See for example the number of uses of mkarry() that should have been "mkarray()". I see 19 uses, then there's also hmkarray() which comes from the heap and is only used 3 times. pws ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] Constify two local variables. 2015-11-30 9:38 ` Peter Stephenson @ 2015-11-30 15:59 ` Daniel Shahaf 2015-11-30 16:28 ` Bart Schaefer 2015-11-30 16:42 ` Peter Stephenson 0 siblings, 2 replies; 11+ messages in thread From: Daniel Shahaf @ 2015-11-30 15:59 UTC (permalink / raw) To: Peter Stephenson; +Cc: zsh-workers [-- Attachment #1: Type: text/plain, Size: 2301 bytes --] Peter Stephenson wrote on Mon, Nov 30, 2015 at 09:38:32 +0000: > On Mon, 30 Nov 2015 03:21:53 +0000 > Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > > I assume that v->pm->gsu.a->getfn() has no access to v->start and > > v->end, hence changing the order is safe. > > Yes, the getfn() is very low level, dealing only with storage for that > parameter and having no knowledge at all of any expansion/substitution > happening above it. > Thanks, that's what I expected. What are the lifetime interfaces of getfn/setfn of arrays? As far as I can tell, getfn returns a "borrowed reference" (i.e., something whose lifetime is managed by the parameter) and setfn "steals" a reference to its second parameter (assumes responsibility for freeing it). I'm asking because I found that $region_highlights' getfn returns a heap-allocated array, while its setfn calls free() on the array passed to it, so . Param pm = $region_highlights; values_a = pm->gsu.a->getfn(); pm->gsu.a->setfn(values_a); . is undefined behaviour (calls free() on a heap pointer, triggering SIGABRT). That's a reduced example of what the "don't reallocate if it's the same size" patch does, and I'm not sure how to make it work: should I change get_re, the setfn, or the callsite which tries to pipe them into each other? (The getfn is get_region_highlight. The setfn is set_region_highlight. I'm attaching that patch in its current state, in case it's relevant; however, it's not ready to be committed, since it triggers the SIGABRT whenever I invoke backward-kill-word.) > > Are there any compilers that choke on 'char **const' (const pointer to pointer > > to char)? > > I suspect we ditched those some years ago when we started making more > use of const, but there might be a few oddballs lying around. > Ack. I'll hold this patch (37253) until the release too then (unless there's another test version first). No point in breaking people's builds if I can avoid it. > > See for example the number of uses of mkarry() > > that should have been "mkarray()". I see 19 uses, then there's > also hmkarray() which comes from the heap and is only used 3 times. *nod*, however, I haven't looked into these yet; I have been looking into the "don't reallocate if it's the same size" part first. Thanks, Daniel [-- Attachment #2: 0004-Optimize-non-resizing-array-assignments.patch --] [-- Type: text/x-patch, Size: 4066 bytes --] >From c4cba933db3e972a5539e489e45d5908918e3b59 Mon Sep 17 00:00:00 2001 From: Daniel Shahaf <d.s@daniel.shahaf.name> Date: Mon, 30 Nov 2015 09:18:12 +0000 Subject: [PATCH] Optimize non-resizing array assignments. Before: % zsh -fc 'local -a a; N=5000; time ( a[N]=bar; integer i; while (( i++ < N )) { a[i]=foo } )' 2.07s user 0.00s system 99% cpu 2.077 total After: % zsh -fc 'local -a a; N=5000; time ( a[N]=bar; integer i; while (( i++ < N )) { a[i]=foo } )' 0.11s user 0.00s system 98% cpu 0.114 total Also add a test demonstrating that this codepath must call setfn(), and another triggering the 'v->end > arrlen(old)' case. --- To be applied on top of 37253. Src/params.c | 70 +++++++++++++++++++++++++++++++++++++---------------- Test/A06assign.ztst | 12 +++++++++ 2 files changed, 61 insertions(+), 21 deletions(-) diff --git a/Src/params.c b/Src/params.c index 142697f..94c996a 100644 --- a/Src/params.c +++ b/Src/params.c @@ -2553,13 +2553,9 @@ setarrvalue(Value v, char **val) return; } else { char **const old = v->pm->gsu.a->getfn(v->pm); - char **new; - char **p, **q, **r; /* index variables */ const int pre_assignment_length = arrlen(old); + const int slice_length = arrlen(val); int post_assignment_length; - int i; - - q = old; if ((v->flags & VALFLAG_INV) && unset(KSHARRAYS)) { if (v->start > 0) @@ -2579,24 +2575,56 @@ setarrvalue(Value v, char **val) if (v->end < v->start) v->end = v->start; - post_assignment_length = v->start + arrlen(val); + post_assignment_length = v->start + slice_length; if (v->end <= pre_assignment_length) - post_assignment_length += pre_assignment_length - v->end + 1; - - p = new = (char **) zshcalloc(sizeof(char *) - * (post_assignment_length + 1)); - - for (i = 0; i < v->start; i++) - *p++ = i < pre_assignment_length ? ztrdup(*q++) : ztrdup(""); - for (r = val; *r;) - *p++ = ztrdup(*r++); - if (v->end < pre_assignment_length) - for (q = old + v->end; *q;) - *p++ = ztrdup(*q++); - *p = NULL; + post_assignment_length += pre_assignment_length - v->end; + + if (post_assignment_length == pre_assignment_length) { + /* Optimization: avoid reallocating. */ + const int effective_end_index = MIN(v->end, pre_assignment_length); + int i; + + DPUTS3(v->start + slice_length != effective_end_index, + "BUG: %s: start=%d end=%d", + v->pm->node.nam, v->start, v->end); + DPUTS3(v->start + slice_length != effective_end_index, + "BUG: %s: slice#=%d arr#=%d", + v->pm->node.nam, slice_length, pre_assignment_length); + + /* Shallow copy 'val' into 'old'. */ + for (i = v->start; i < effective_end_index; i++) + zsfree(old[i]); + memcpy(old + v->start, val, (i - v->start) * sizeof(char *)); + + v->pm->gsu.a->setfn(v->pm, old); + free(val); + } else { + /* Why allocate N+2 pointers? The first N are the elements, the + * (N+1)th is the NULL sentinel, but what is the (N+2)th pointer? + */ + char **const new = + (char**) zshcalloc(sizeof(char *) + * (post_assignment_length + 2)); + char **p, **q, **r; /* index variables */ + int i; + + q = old; + p = new; + r = val; + + for (i = 0; i < v->start; i++) + *p++ = i < pre_assignment_length ? ztrdup(*q++) : ztrdup(""); + for (r = val; *r;) + *p++ = ztrdup(*r++); + if (v->end < pre_assignment_length) + for (q = old + v->end; *q;) + *p++ = ztrdup(*q++); + *p = NULL; + + v->pm->gsu.a->setfn(v->pm, new); + freearray(val); + } - v->pm->gsu.a->setfn(v->pm, new); - freearray(val); } } diff --git a/Test/A06assign.ztst b/Test/A06assign.ztst index 1e3d2ed..17f7c63 100644 --- a/Test/A06assign.ztst +++ b/Test/A06assign.ztst @@ -478,3 +478,15 @@ >first second >first >second + + typeset -aU unique_array=(first second) + unique_array[1]=second + print $unique_array +0:assignment to unique array +>second + + typeset -a array=(first) + array[1,3]=(FIRST) + print $array +0:slice beyond length of array +>FIRST -- 2.1.4 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] Constify two local variables. 2015-11-30 15:59 ` Daniel Shahaf @ 2015-11-30 16:28 ` Bart Schaefer 2015-11-30 16:42 ` Peter Stephenson 1 sibling, 0 replies; 11+ messages in thread From: Bart Schaefer @ 2015-11-30 16:28 UTC (permalink / raw) To: zsh-workers On Nov 30, 3:59pm, Daniel Shahaf wrote: } } I'm asking because I found that $region_highlights' getfn returns } a heap-allocated array, while its setfn calls free() on the array passed } to it, so } . } Param pm = $region_highlights; } values_a = pm->gsu.a->getfn(); } pm->gsu.a->setfn(values_a); } . } is undefined behaviour (calls free() on a heap pointer, triggering SIGABRT). In general you can't pass the result of any getfn() back to its own or anyone else's setfn(). All setfn() should always expect a newly- zshcalloc()d [or the equivalent] array. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] Constify two local variables. 2015-11-30 15:59 ` Daniel Shahaf 2015-11-30 16:28 ` Bart Schaefer @ 2015-11-30 16:42 ` Peter Stephenson 2015-11-30 20:39 ` Bart Schaefer 1 sibling, 1 reply; 11+ messages in thread From: Peter Stephenson @ 2015-11-30 16:42 UTC (permalink / raw) To: zsh-workers On Mon, 30 Nov 2015 15:59:45 +0000 Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > Peter Stephenson wrote on Mon, Nov 30, 2015 at 09:38:32 +0000: > > On Mon, 30 Nov 2015 03:21:53 +0000 > > Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > > > I assume that v->pm->gsu.a->getfn() has no access to v->start and > > > v->end, hence changing the order is safe. > > > > Yes, the getfn() is very low level, dealing only with storage for that > > parameter and having no knowledge at all of any expansion/substitution > > happening above it. > > > > Thanks, that's what I expected. What are the lifetime interfaces of > getfn/setfn of arrays? That's less well defined than it should be, but the implication is obviously they will be used within the liftime of the current heap. This is common in e.g. Src/Modules/parameter.c for returning parameters reflecting some internal state. Making sure this is what happens isn't easy. The only real difference from any other use of the heap is that the source of the heap allocation is quite well buried. But maybe that's not much of a difference, since there's much heap allocation a few layers above in paramsubst(). As Bart says, setfn() expects a permanently allocated values; if it doesn't neeed it, it's up to the setfn in question to convert the value and then free what's passed in. pws ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] Constify two local variables. 2015-11-30 16:42 ` Peter Stephenson @ 2015-11-30 20:39 ` Bart Schaefer 2015-11-30 21:36 ` Bart Schaefer 2015-12-02 0:36 ` Daniel Shahaf 0 siblings, 2 replies; 11+ messages in thread From: Bart Schaefer @ 2015-11-30 20:39 UTC (permalink / raw) To: zsh-workers On Nov 30, 4:42pm, Peter Stephenson wrote: } } As Bart says, setfn() expects a permanently allocated values; if it } doesn't neeed it, it's up to the setfn in question to convert the value } and then free what's passed in. Thinking about it, it is quite likely that this makes it impossible to optimize in the way Daniel wants. At a very low level, we can't swap values into an existing (char**) representing an array parameter. All the setfn() implementations would have to be updated to notice when the pointer passed in is the same as the pointer already in the param struct and skip the free, or something like that. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] Constify two local variables. 2015-11-30 20:39 ` Bart Schaefer @ 2015-11-30 21:36 ` Bart Schaefer 2015-12-02 0:36 ` Daniel Shahaf 1 sibling, 0 replies; 11+ messages in thread From: Bart Schaefer @ 2015-11-30 21:36 UTC (permalink / raw) To: zsh-workers On Nov 30, 12:39pm, Bart Schaefer wrote: } Subject: Re: [PATCH 3/3] Constify two local variables. } } All the setfn() implementations would have to be updated to notice when } the pointer passed in is the same as the pointer already in the param } struct and skip the free, or something like that. Hmm, the default array and hash setfn() already do that. So it would only be an issue with some of the specials? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] Constify two local variables. 2015-11-30 20:39 ` Bart Schaefer 2015-11-30 21:36 ` Bart Schaefer @ 2015-12-02 0:36 ` Daniel Shahaf 2015-12-02 1:11 ` Bart Schaefer 1 sibling, 1 reply; 11+ messages in thread From: Daniel Shahaf @ 2015-12-02 0:36 UTC (permalink / raw) To: zsh-workers Bart Schaefer wrote on Mon, Nov 30, 2015 at 12:39:37 -0800: > On Nov 30, 4:42pm, Peter Stephenson wrote: > } > } As Bart says, setfn() expects a permanently allocated values; if it > } doesn't neeed it, it's up to the setfn in question to convert the value > } and then free what's passed in. > > Thinking about it, it is quite likely that this makes it impossible to > optimize in the way Daniel wants. At a very low level, we can't swap > values into an existing (char**) representing an array parameter. All > the setfn() implementations would have to be updated to notice when > the pointer passed in is the same as the pointer already in the param > struct and skip the free, or something like that. There's a sort of a third way here: I could change vals = getfn() munge_in_place(vals) setfn(vals) to #define N (sizeof(void*) * arrlen(vals)) vals = getfn() munge_in_place(vals) dummy = zalloc(N) memcpy(dummy, vals, N) setfn(dummy) which still does O(N) work, but does _less_ work: it still allocates and initializes one chunk of N pointers, but (compared to the setarrvalue() in HEAD) doesn't allocate and initialize N chunks of one element each. Bart Schaefer wrote on Mon, Nov 30, 2015 at 13:36:17 -0800: > On Nov 30, 12:39pm, Bart Schaefer wrote: > } Subject: Re: [PATCH 3/3] Constify two local variables. > } > } All the setfn() implementations would have to be updated to notice when > } the pointer passed in is the same as the pointer already in the param > } struct and skip the free, or something like that. > > Hmm, the default array and hash setfn() already do that. So it would > only be an issue with some of the specials? Specials, and module-defined parameters that don't use arrsetfn(), I believe. So, one option is to change all setfn()s to support the setfn(getfn()) invocation...? But that'll mean modules written for 5.1.2 would have to be *patched*, not just recompiled, to work with 5.1.3, unless we put in a way for Param's to indicate whether they support the setfn(getfn()) invocation (to let core skip the optimization for Param's that don't support that). But in the meantime, we can just limit the optimization to Param's that are known to support it: /* to be applied on top of 37257 (which itself is to be applied on top of 37253) */ diff --git a/Src/params.c b/Src/params.c index 94c996a..1372958 100644 --- a/Src/params.c +++ b/Src/params.c @@ -2579,7 +2579,11 @@ setarrvalue(Value v, char **val) if (v->end <= pre_assignment_length) post_assignment_length += pre_assignment_length - v->end; - if (post_assignment_length == pre_assignment_length) { + if (post_assignment_length == pre_assignment_length && + v->pm->gsu.a->setfn == arrsetfn) { /* Optimization: avoid reallocating. */ + /* Only enable the optimization for setfn's that support being called + * on the return value of getfn, such as arrsetfn. */ + const int effective_end_index = MIN(v->end, pre_assignment_length); int i; This fixes the SIGABRT, I think I can commit it like this. So, I'll commit 37253 and 37257+this after the release. Now, one of my actual use-cases for this happens to be region_highlight (for zsh-syntax-highlighting reasons), which doesn't use arrsetfn, but I'll see about that separately... Cheers, Daniel ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] Constify two local variables. 2015-12-02 0:36 ` Daniel Shahaf @ 2015-12-02 1:11 ` Bart Schaefer 2015-12-03 23:37 ` Daniel Shahaf 0 siblings, 1 reply; 11+ messages in thread From: Bart Schaefer @ 2015-12-02 1:11 UTC (permalink / raw) To: zsh-workers On Dec 2, 12:36am, Daniel Shahaf wrote: } Subject: Re: [PATCH 3/3] Constify two local variables. } } #define N (sizeof(void*) * arrlen(vals)) } vals = getfn() } munge_in_place(vals) } dummy = zalloc(N) } memcpy(dummy, vals, N) } setfn(dummy) Would that memcpy() really work, or would it need to be zarrdup() to copy each of the elements as well? Because e.g. arrsetfn() does freearray(). ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] Constify two local variables. 2015-12-02 1:11 ` Bart Schaefer @ 2015-12-03 23:37 ` Daniel Shahaf 2015-12-04 0:36 ` Bart Schaefer 0 siblings, 1 reply; 11+ messages in thread From: Daniel Shahaf @ 2015-12-03 23:37 UTC (permalink / raw) To: zsh-workers Bart Schaefer wrote on Tue, Dec 01, 2015 at 17:11:21 -0800: > On Dec 2, 12:36am, Daniel Shahaf wrote: > } Subject: Re: [PATCH 3/3] Constify two local variables. > } > } #define N (sizeof(void*) * arrlen(vals)) > } vals = getfn() > } munge_in_place(vals) > } dummy = zalloc(N) > } memcpy(dummy, vals, N) > } setfn(dummy) > > Would that memcpy() really work, or would it need to be zarrdup() to copy > each of the elements as well? Because e.g. arrsetfn() does freearray(). I think it would work, because setarrvalue() in HEAD calls freearray() on the values array, implying that the individual elements are already permanently allocated. --- In the meantime I've found a reproducible invalid free() with this patchset (reproducible by invoking _gnu_generic on ": Sr<TAB>" in the source tree). I haven't investigated yet. Daniel ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] Constify two local variables. 2015-12-03 23:37 ` Daniel Shahaf @ 2015-12-04 0:36 ` Bart Schaefer 0 siblings, 0 replies; 11+ messages in thread From: Bart Schaefer @ 2015-12-04 0:36 UTC (permalink / raw) To: zsh-workers On Dec 3, 11:37pm, Daniel Shahaf wrote: } Subject: Re: [PATCH 3/3] Constify two local variables. } } Bart Schaefer wrote on Tue, Dec 01, 2015 at 17:11:21 -0800: } > On Dec 2, 12:36am, Daniel Shahaf wrote: } > } } > } memcpy(dummy, vals, N) } > } setfn(dummy) } > } > Would that memcpy() really work, or would it need to be zarrdup() to copy } > each of the elements as well? Because e.g. arrsetfn() does freearray(). } } I think it would work, because setarrvalue() in HEAD calls freearray() } on the values array, implying that the individual elements are already } permanently allocated. That's exactly why I think it would NOT work! If you memcp() only the pointers and then pass a new pointer-to-pointer into arrsetfn(), it's going to call freearray() and discard all the individual pointers that you intended would become the [re-used] elements. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2015-12-04 0:36 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-11-30 3:21 [PATCH 3/3] Constify two local variables Daniel Shahaf 2015-11-30 9:38 ` Peter Stephenson 2015-11-30 15:59 ` Daniel Shahaf 2015-11-30 16:28 ` Bart Schaefer 2015-11-30 16:42 ` Peter Stephenson 2015-11-30 20:39 ` Bart Schaefer 2015-11-30 21:36 ` Bart Schaefer 2015-12-02 0:36 ` Daniel Shahaf 2015-12-02 1:11 ` Bart Schaefer 2015-12-03 23:37 ` Daniel Shahaf 2015-12-04 0:36 ` Bart Schaefer
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/zsh/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).