From: Peter Stephenson <p.w.stephenson@ntlworld.com>
To: Zsh-Users List <zsh-users@zsh.org>
Subject: Re: Compare two (or more) filenames and return what is common between them
Date: Tue, 18 Mar 2014 20:23:09 +0000
Message-ID: <20140318202309.4d830a8b@pws-pc.ntlworld.com>
In-Reply-To: <CADjGqHt4xeQ=CGDSHOm8FCtK+PgndvEXnLA_WWU-OSd7D0xK-w@mail.gmail.com>

On Tue, 18 Mar 2014 03:05:27 -0400
TJ Luoma <luomat@gmail.com> wrote:
> What I am trying to do:
> 
> Given a folder/directory full of files (and, possibly, some existing
> folders/directories), I want to create folders which will group files
> with similar files names, but which will leave folders alone.

I'm still not quite sure after reading your description what it is you
want, but below is a function for you to play with.  It deals with array
entries rather than files, but fixing that part should be
straightforward.  Somewhere you'll have a '*(.)' pattern to select all
the regular files in a directory, somewhere else a mkdir or possibly mkdir -p,
and somewhere else a mv.
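
As a rough sketch of that wiring (it is not part of the function below):
the helper name move_into_group and its two arguments are made up for
illustration, and the loop uses a plain -f test so the snippet also runs
under bash.  In zsh the '*(.)' qualifier does the same selection in one
step.

```shell
# Hypothetical helper: move every regular file in directory $1 whose name
# starts with the prefix $2 into a subdirectory of $1 named after $2.
move_into_group() {
  local dir="$1" prefix="$2" f
  # Note: with zsh's default options an empty directory makes this glob
  # fail with "no matches found".
  for f in "$dir"/*; do
    # Leave directories and symlinks alone; zsh's "$dir"/*(.) picks the
    # same set of plain files.
    [[ -f $f && ! -L $f ]] || continue
    # Only take names that begin with the prefix.
    [[ ${f##*/} == "$prefix"* ]] || continue
    mkdir -p -- "$dir/$prefix" && mv -- "$f" "$dir/$prefix/"
  done
}
```

Something like move_into_group . "One Two" would then do for one prefix
what the print statements in the function below only report.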

The upshot is that for the input

  "One Two Nineteen"
  "One Two Three"
  "One Two Buckle My Shoe"
  "One Two Buckle My Belt"
  "One Three Four"
  "Two Three Sixteen"
  "Two Three Seventeen"
  "Three Forty Five"

it prints

  Extracting common prefixes 'One Two Buckle My'...
  'One Two Buckle My Shoe' goes in directory 'One Two Buckle My'
  'One Two Buckle My Belt' goes in directory 'One Two Buckle My'
  Extracting common prefixes 'One Two', 'Two Three'...
  'One Two Nineteen' goes in directory 'One Two'
  'One Two Three' goes in directory 'One Two'
  'Two Three Sixteen' goes in directory 'Two Three'
  'Two Three Seventeen' goes in directory 'Two Three'
  Unmatched files:
  'One Three Four'
  'Three Forty Five'

which may or may not be what you want.  I handled suffixes by stripping
off everything from the earliest "." to the end of the name before
looking for common prefixes.
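
The stripping is just the ${name%%.*} expansion, which removes the longest
trailing match of '.*', i.e. everything from the first dot onwards.  A
small illustration; the function name strip_suffix is invented for the
example:

```shell
# Hypothetical helper: print $1 with everything from its first "." removed.
strip_suffix() {
  local name="$1"
  printf '%s\n' "${name%%.*}"
}

strip_suffix "One Two Three.tar.gz"   # prints: One Two Three
strip_suffix "One Two Three"          # no dot, printed unchanged
```

Note that this is crude: a name with a dot early on, such as
"v1.2 notes.txt", is cut down to "v1", and a dot file such as ".hidden"
becomes the empty string.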

I have to admit I was within an ace of switching to Ruby for this.


##start
emulate -L zsh
setopt extendedglob

local -a words match mbegin mend split restwords

words=(
	"One Two Nineteen"
	"One Two Three"
	"One Two Buckle My Shoe"
	"One Two Buckle My Belt"
	"One Three Four"
	"Two Three Sixteen"
	"Two Three Seventeen"
	"Three Forty Five"
)

typeset -A groups foundgroups
integer maxwords
local word initial pat make

# Find the largest word count of any name; prefixes are tried longest first.
for word in $words; do
  initial=${word%%.*}
  split=(${=initial})
  if (( ${#split} > maxwords )); then
    maxwords=${#split}
  fi
done

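# Set $initial to the first $maxwords blank-separated words of $1 (after
# dropping any suffix), or to the empty string if $1 has fewer words.
words_getinitial() {
  local word=$1
  # Strip the suffix: everything from the earliest "." to the end.
  initial=${word%%.*}
  if (( maxwords > 1 )); then
    pat="(#b)(([^[:blank:]]##[[:blank:]]##)(#c$((maxwords-1)))([^[:blank:]]##))"
  else
    pat="(#b)([^[:blank:]]##)"
  fi
  # (M) keeps the matched head; match the stripped name, not the original.
  initial=${(M)initial##${~pat}}
}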
# functions -T words_getinitial

while (( maxwords && ${#words} )); do
  restwords=()
  groups=()
  foundgroups=()
  for word in $words; do
    words_getinitial $word
    [[ -z $initial ]] && continue
    if [[ -n $groups[$initial] ]]; then
      foundgroups[$initial]=1
    else
      groups[$initial]=1
    fi
  done
  if (( ${#foundgroups} )); then
    print "Extracting common prefixes '${(kj.', '.)foundgroups}'..."
    for word in $words; do
      words_getinitial $word
      if [[ -z $initial ]]; then
        restwords+=($word)
      elif [[ -n $foundgroups[$initial] ]]; then
        print "'$word' goes in directory '$initial'"
      else
        restwords+=($word)
      fi
    done
    # Only the names not yet placed are considered for shorter prefixes.
    words=($restwords)
  fi
  (( maxwords-- ))
done

if (( ${#words} )); then
  print "Unmatched files:"
  print "'${(pj.'\n'.)words}'"
fi
##end


-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/

