From: Roman Perepelitsa
Date: Fri, 10 Nov 2023 10:50:58 +0100
Subject: Re: special characters in file names issue
To: linuxtechguy@gmail.com
Cc: zsh
X-Seq: 29335

On Fri, Nov 10, 2023 at 12:17 AM Jim wrote:
>
> Hi everyone,
>
> Using scripts, looking to cleanup duplicate files even if named differently.
> The issue I ran into is when a file path contains parentheses.
> '(' or ')'
>
> Example File Name: Wallpapers/Web_downloads/05 (1).jpg
>
> The following is part of an anonymous function:
>
> local E
> local -a AllFileNames
> local -A FileNameCkSum
> ...
> for E (${(@)AllFileNames}) {
> [[ -v FileNameCkSum[$E] ]] || FileNameCkSum[$E]=${$(shasum -a 1 $E)[1]} } # line that fails
> ...
>
> AllFileName contains the result of a glob statement.
>
> Error Message: (anon):: invalid subscript

Associative arrays in zsh are finicky when it comes to the content of
their keys. The problem you are experiencing can be distilled to this:

    % typeset -A dict
    % key='('
    % [[ -v dict[$key] ]]
    zsh: invalid subscript

There is no simple quoting that you can apply to $key here: (q), (b),
etc. are all wrong. You could perhaps escape a specific list of
characters ('(', '[', '{' but not '$' or '*') although my memory tells
me that some keys cannot be made to work under `[[ -v ...]]` or
`unset` no matter how you try to escape them. I could be wrong though.

I usually apply one of two workarounds: use hash($x) instead of $x as
a key, or replace the associative array with two plain arrays, one for
keys and another for values. The latter results in O(N) lookup though.

Roman.

P.S.

From the description of your problem I would think that you want file
hashes as keys. Something like this:

    # usage: detect-dup-files [file]..
    function detect-dup-files() {
      emulate -L zsh
      (( ARGC )) || return 0
      local -A seen
      local i files fname hash orig
      # One output line per file. Split on newlines only, so that file
      # names containing spaces do not throw off the count below.
      files=( ${(f)"$(shasum -ba 256 -- "$@")"} ) || return
      (( ARGC == $#files )) || return
      for i in {1..$ARGC}; do
        fname=$argv[i]
        # Keep the hash (everything before the first space) and drop the
        # leading backslash that shasum adds when it escapes a file name.
        hash=${${files[i]%% *}#\\}
        if [[ -n ${orig::=$seen[$hash]} ]]; then
          print -r -- "${(q+)fname} is a dup of ${(q+)orig}"
        else
          seen[$hash]=$fname
        fi
      done
    }

This code has an added advantage of forking only once. It also handles
file names with backslashes and linefeeds in them.
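
For completeness, here is a rough sketch of the first workaround
mentioned above: use a digest of the key as the subscript instead of
the key itself. A hex digest contains no characters that confuse
subscript parsing, so `[[ -v ... ]]` is happy with it. The names
key_digest, remember, lookup and CkSumByDigest are made up for this
illustration and are not part of the original script. The fallback
from sha256sum to shasum is only an assumption about which tool is
installed.

```zsh
# Sketch only: all names below are hypothetical.
typeset -gA CkSumByDigest

# Print a hex digest of the string $1; the digest is safe as a subscript.
key_digest() {
  if command -v sha256sum >/dev/null 2>&1; then
    printf '%s' "$1" | sha256sum
  else
    printf '%s' "$1" | shasum -a 256
  fi | cut -d' ' -f1
}

# Store value $2 under the digest of key $1.
remember() {
  local k
  k=$(key_digest "$1")
  [[ -n $k ]] || return 1
  CkSumByDigest[$k]=$2
}

# Print the value stored for key $1; return 1 if the key is unknown.
lookup() {
  local k
  k=$(key_digest "$1")
  [[ -n $k && -v CkSumByDigest[$k] ]] || return 1
  printf '%s\n' "${CkSumByDigest[$k]}"
}

remember 'Wallpapers/Web_downloads/05 (1).jpg' some-checksum
lookup 'Wallpapers/Web_downloads/05 (1).jpg'
```

The price is one extra fork per key, and the original key is no longer
recoverable from the array. If you need it, keep it in a second
associative array under the same digest.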