From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-3.3 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 5275 invoked from network); 30 Apr 2023 19:27:27 -0000 Received: from zero.zsh.org (2a02:898:31:0:48:4558:7a:7368) by inbox.vuxu.org with ESMTPUTF8; 30 Apr 2023 19:27:27 -0000 ARC-Seal: i=1; cv=none; a=rsa-sha256; d=zsh.org; s=rsa-20210803; t=1682882847; b=KFwelZJCN3KrB9H/McZ5ljt3qlDI49UAfTNc5C1AYaAhL3ZgiXaAL3cBCBX3s7jBNnAnpX7Pd6 atk8H4kg3wMYpxqDJyhRrq+i7SVfjbvqq8Qo3EI4eQxAMsvEukn4eBcJ47xMffTBtentZBky9b Y8wDkB7XfLB00add6Il2C1YvHv+XuftVPoNnn+BZtFm+2OrRqtq66TsgKKQVM/o/xMHslF1b1s oqVD+X8rTH/sV35QJeEew78pGNSqP2uYEDMwdhe/fv/zOWeyuhQJLIr1KscX9KiIZ1Bf5iZf5Z iCUt+jyF2YpBCK7eP6Jwx1EvZp7reRdCOhYfzYMnlptpLQ==; ARC-Authentication-Results: i=1; zsh.org; iprev=pass (mail-ej1-f51.google.com) smtp.remote-ip=209.85.218.51; dkim=pass header.d=brasslantern-com.20221208.gappssmtp.com header.s=20221208 header.a=rsa-sha256; dmarc=none header.from=brasslantern.com; arc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed; d=zsh.org; s=rsa-20210803; t=1682882847; bh=AFh0PEptLpqJQBEnK+0U+CeWUsFGp02i9XOGDtUjFFQ=; h=List-Archive:List-Owner:List-Post:List-Unsubscribe:List-Subscribe:List-Help: List-Id:Sender:Content-Type:To:Subject:Message-ID:Date:From:In-Reply-To: References:MIME-Version:DKIM-Signature:DKIM-Signature; b=db/dZ04HhTuU59+vkTQiJs2bwE5Lack+rmVMrlieY2wH00dx3xK7nnXhhfKZyKQfL2PWCzdFam zMCPpshDFpB/NMKH409zbgcwUkaVR9XA2+/eY0yZY6ojuzKaoM+EpZqDglaR0zn/UAnq02iPl1 UctLutA/c70SXkWseLTccD8ME1zfOGK6tHcAzuupsUH015wrm0TkWQVN4bPlQDJddparRP7Ww3 GdLcBKUPVi6egyWfNHU1IWPpwF/mx/bJU9jIIzoekwUlsxINusmPfPxdzTYjFsjP/UgcUatM1B sztvmUyJuCezS5dGMyhShQ7oyc1sIZ/d4AssW8JJR8XZDA==; DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=zsh.org; s=rsa-20210803; h=List-Archive:List-Owner:List-Post:List-Unsubscribe: List-Subscribe:List-Help:List-Id:Sender:Content-Type:To:Subject:Message-ID: Date:From:In-Reply-To:References:MIME-Version:Reply-To:Cc: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID; bh=WSiSMm/0VHTajWFXZ/vNiDcEUFQVr35i+fw+N4tDrMU=; b=K8u7pq4RmekBjujUfvGHsyV/fX w8YFtilwXjQRbr+yK5INbxv/1FdNRTFapwPeONNkOyGLWoZVuQF8FSRRoTezOzDKsKFQH0Wlb+xnH SteoFjHkooww462eFJWZDlGxOVb0oCa2G8dQFYCRZIpLK7dsNRkTOd34V9Jk27oX6BSybhQuic8Uh 2xwaI9e4hkk2NXR22GRjiyTLm9jm/ZjyNznQutmEtQ54vdVtCe5N3pl3FYBPPy7Q55LoAqxyV/SSO rd8e4hrCGeNl0B5EoIyraILXj16wXhPBRGRYWBlnq1inRQc0cNItxX7dipKA0iKVJpy1zcZQylYBc v8sXRDIA==; Received: by zero.zsh.org with local id 1ptChi-00055v-Ev; Sun, 30 Apr 2023 19:27:26 +0000 Authentication-Results: zsh.org; iprev=pass (mail-ej1-f51.google.com) smtp.remote-ip=209.85.218.51; dkim=pass header.d=brasslantern-com.20221208.gappssmtp.com header.s=20221208 header.a=rsa-sha256; dmarc=none header.from=brasslantern.com; arc=none Received: from mail-ej1-f51.google.com ([209.85.218.51]:52539) by zero.zsh.org with esmtps (TLS1.3:TLS_AES_128_GCM_SHA256:128) id 1ptChS-0004mW-3n; Sun, 30 Apr 2023 19:27:11 +0000 Received: by mail-ej1-f51.google.com with SMTP id a640c23a62f3a-94ef0a8546fso294394966b.1 for ; Sun, 30 Apr 2023 12:27:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=brasslantern-com.20221208.gappssmtp.com; s=20221208; t=1682882829; x=1685474829; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=WSiSMm/0VHTajWFXZ/vNiDcEUFQVr35i+fw+N4tDrMU=; b=0Uz5aPByGXzjHrqu7Y0/IWn1ucZIN+XkVaXOTLFPk5iqtpCrVnDUi3uGaHj+FmEpkS /0UX/kU1GrrLIJBjp4GqEsKaMBrhdncGAAGdR4vRP0hdVzdGgKWYVHFSwNrwtrk8AtGO cRE54QPFLhA5M4ZNPZLo5KIcjujnvee3Rh+PVoIqfdzKpClyJ3tjcEZ3O+sWyOSIgxlF 1HYrKlblksts2xG2APkq2SuZ4x2zaLhuyf0IWaNggEfU+bytJcs+simbDbH7vpI6xos6 +dzVlP8H/vDxN+95j2xNOf34d4B/OVqqelp+DBgFIUXjxaPqRFgmvBpHq/J3+xtPmkBW 3Oiw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1682882829; x=1685474829; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=WSiSMm/0VHTajWFXZ/vNiDcEUFQVr35i+fw+N4tDrMU=; b=NdnzCgwwbJFIeiKiMuHvA7O7mbx3i/BYUQ8UkweMbuhZCQWbshNJhoTMhk6Vn/9dCC R1RGIGLQP/xV0tqZeqfHYV1cA/QmhLDaezQwJUCEW6WyAumlEDEz5OS7yfnrj8Q5IyUt 1/0cOLfkfvULT9Rnqo1Cck6X7l+G7sC4h1vMultH36QedeWLwblHBJAoNqc4DhMfeeLw Hxg95Jy0yqnzJZMm0jvgm351Frv6b7Bi2mT+6wHPkKwzk3fLfMItKSqMNmz7RXMQIeLZ BN0GX3LOGQCz/EBzmYVkgfDFNRnLoCq+F5N1TvUr1tdr9QEtZdg0887E3ZriC2kFdlpe z7Rg== X-Gm-Message-State: AC+VfDxFIOX2giKOS2Fiyep2+tt9cguIMw2EZ8g04RQGVc2NkdyNaL0l yrPFvrzw2jDMqwQY27sYUfX+UtmMYsVzOaAfeSgOG/mpQC5uAvbkIC8= X-Google-Smtp-Source: ACHHUZ5TmhLEBtXDttMLP9Z9KpDmwStPHQhywaGsGmxcMB8GcGNtVBR7OcNajuJZbZlvdaVcaFhBgMsQS2UuPc1WAjE= X-Received: by 2002:a17:906:db07:b0:94a:76f6:8e52 with SMTP id xj7-20020a170906db0700b0094a76f68e52mr11368067ejb.35.1682882829495; Sun, 30 Apr 2023 12:27:09 -0700 (PDT) MIME-Version: 1.0 References: <20230416090342.dztvzcnzpbyukgfq@chazelas.org> In-Reply-To: From: Bart Schaefer Date: Sun, 30 Apr 2023 12:26:58 -0700 Message-ID: Subject: Re: accessing array by index very slow To: Zsh hackers list Content-Type: text/plain; charset="UTF-8" X-Seq: 51690 Archived-At: X-Loop: zsh-workers@zsh.org Errors-To: zsh-workers-owner@zsh.org Precedence: list Precedence: bulk Sender: zsh-workers-request@zsh.org X-no-archive: yes List-Id: List-Help: , List-Subscribe: , List-Unsubscribe: , List-Post: List-Owner: List-Archive: I've been poking around at this, and there are a few serious complications. The first is that we almost always retrieve arrays via a Value struct, which contains both a pointer to the source Param and the start/end indexes for an array slice. The largest reason array accesses aren't O(1) is due to bounds checking that those desired slices don't point outside the array. We've already optimized bounds checking to avoid scanning the full array, so unless the same array is going to be checked repeatedly it'd actually make things slower to populate the cache. Second and similarly, the best place to initialize the cache is when assigning or appending to the array, which makes those operations become O(n), which seems undesirable. Third, a lot of array operations either use an array constructed on the fly (array of all values of a hash table, for example) or fetch the array from the parameter at a point in the program flow that's abstracted from the operation that needs to do the bounds checking, so the cached length is no longer readily available; or a subset of the source array may already have been copied using (start,end) of the Value, or by scanning the array for a pattern match, so the cached value no longer applies. Fourth, the length of a PM_SPECIAL parameter can't reliably be cached because the source data may change between accesses. Not true for all PM_SPECIAL, of course, but recording which ones can or cannot be is an obstacle. There is a "numparamvals" global that's reset whenever a hash is converted to an array, so that might be co-opted to update on other array copy operations if we're careful.