From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 12606 invoked by alias); 18 Mar 2014 20:28:55 -0000
Mailing-List: contact zsh-users-help@zsh.org; run by ezmlm
Precedence: bulk
X-No-Archive: yes
List-Id: Zsh Users List
List-Post:
List-Help:
X-Seq: 18622
Received: (qmail 6861 invoked from network); 18 Mar 2014 20:28:50 -0000
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on f.primenet.com.au
X-Spam-Level:
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.2
X-Originating-IP: [86.6.157.246]
X-Spam: 0
X-Authority: v=2.1 cv=VLBTnr/X c=1 sm=1 tr=0 a=BvYiZ/UW0Fmn8Wufq9dPrg==:117 a=BvYiZ/UW0Fmn8Wufq9dPrg==:17 a=NLZqzBF-AAAA:8 a=7s3Jj7Ix0b0A:10 a=uObrxnre4hsA:10 a=kj9zAlcOel0A:10 a=pGLkceISAAAA:8 a=jDcTJxiP7rcuGnxnzNkA:9 a=CjuIK1q_8ugA:10 a=MSl-tDqOz04A:10 a=_dQi-Dcv4p4A:10
Date: Tue, 18 Mar 2014 20:23:09 +0000
From: Peter Stephenson
To: Zsh-Users List
Subject: Re: Compare two (or more) filenames and return what is common between them
Message-ID: <20140318202309.4d830a8b@pws-pc.ntlworld.com>
In-Reply-To:
References:
X-Mailer: Claws Mail 3.8.0 (GTK+ 2.24.7; x86_64-redhat-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Tue, 18 Mar 2014 03:05:27 -0400
TJ Luoma wrote:
> What I am trying to do:
>
> Given a folder/directory full of files (and, possibly, some existing
> folders/directories), I want to create folders which will group files
> with similar files names, but which will leave folders alone.

I'm still not quite sure after reading your description what it is you
want, but below is a function for you to play with.  It deals with
array entries rather than files, but fixing that part should be
straightforward.  Somewhere you'll have a '*(.)' pattern to select all
the regular files in a directory, somewhere else a mkdir or possibly
mkdir -p, and somewhere else a mv.
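
As a rough illustration of that file-handling part, here is a minimal sketch. The helper name move_to_group and its three arguments are hypothetical and not part of the original post. It is written in plain sh syntax so that it runs under zsh as well, which means the [ -f ... ] test stands in for the '*(.)' qualifier: anything that is not a regular file, including an existing directory, is left alone.

```shell
# Hypothetical helper (not from the original post): move one regular file
# into a directory named after its common prefix.
#   $1 = directory being tidied, $2 = file name, $3 = prefix / target directory
# The body runs in a subshell, so the variables do not leak to the caller.
move_to_group() (
    dir=$1 name=$2 prefix=$3
    # Leave directories (and anything else that is not a regular file) alone.
    [ -f "$dir/$name" ] || exit 0
    # mkdir -p is harmless if the group directory already exists; the mv
    # only runs if the directory could be created.
    mkdir -p -- "$dir/$prefix" && mv -- "$dir/$name" "$dir/$prefix/"
)

# Example call, assuming the prefix has already been worked out:
#   move_to_group . "One Two Three" "One Two"
```

In the function further down, a call like this would take the place of the line that prints "goes in directory", with the word list coming from the files in the directory rather than from a fixed array.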

The upshot is that for the input

    "One Two Nineteen"
    "One Two Three"
    "One Two Buckle My Shoe"
    "One Two Buckle My Belt"
    "One Three Four"
    "Two Three Sixteen"
    "Two Three Seventeen"
    "Three Forty Five"

it prints

Extracting common prefixes 'One Two Buckle My'...
'One Two Buckle My Shoe' goes in directory 'One Two Buckle My'
'One Two Buckle My Belt' goes in directory 'One Two Buckle My'
Extracting common prefixes 'One Two', 'Two Three'...
'One Two Nineteen' goes in directory 'One Two'
'One Two Three' goes in directory 'One Two'
'Two Three Sixteen' goes in directory 'Two Three'
'Two Three Seventeen' goes in directory 'Two Three'
Unmatched files:
'One Three Four'
'Three Forty Five'

which may or may not be what you want.

I handled suffixes by stripping off everything from the earliest "." to
an end before looking for common prefixes.

I have to admit I was within an ace of switching to Ruby for this.

##start
emulate -L zsh
setopt extendedglob

local -a words match mbegin mend split restwords

words=(
    "One Two Nineteen"
    "One Two Three"
    "One Two Buckle My Shoe"
    "One Two Buckle My Belt"
    "One Three Four"
    "Two Three Sixteen"
    "Two Three Seventeen"
    "Three Forty Five"
)
typeset -A groups foundgroups
integer maxwords
local word initial pat make

for word in $words; do
    initial=${word%%.*}
    split=(${=initial})
    if (( ${#split} > maxwords )); then
        maxwords=${#split}
    fi
done

words_getinitial() {
    local word=$1
    initial=${word%%.*}
    if (( maxwords > 1 )); then
        pat="(#b)(([^[:blank:]]##[[:blank:]]##)(#c$((maxwords-1)))([^[:blank:]]##))"
    else
        pat="(#b)([^[:blank:]]##)"
    fi
    initial=${(M)word##${~pat}}
}
# functions -T words_getinitial

while (( maxwords && ${#words} )); do
    restwords=()
    groups=()
    foundgroups=()
    for word in $words; do
        words_getinitial $word
        [[ -z $initial ]] && continue
        if [[ -n $groups[$initial] ]]; then
            foundgroups[$initial]=1
        else
            groups[$initial]=1
        fi
    done
    if (( ${#foundgroups} )); then
        print "Extracting common prefixes '${(kj.', '.)foundgroups}'..."
        for word in $words; do
            words_getinitial $word
            if [[ -z $initial ]]; then
                restwords+=($word)
            elif [[ -n $foundgroups[$initial] ]]; then
                print "'$word' goes in directory '$initial'"
            else
                restwords+=($word)
            fi
            words=($restwords)
        done
    fi
    (( maxwords-- ))
done

if (( ${#words} )); then
    print "Unmatched files:"
    print "'${(pj.'\n'.)words}'"
fi
##end

-- 
Peter Stephenson
Web page now at http://homepage.ntlworld.com/p.w.stephenson/