zsh-users
* Sorting file names randomly
@ 2005-07-23 19:42 DervishD
  2005-07-23 21:26 ` DervishD
  0 siblings, 1 reply; 16+ messages in thread
From: DervishD @ 2005-07-23 19:42 UTC (permalink / raw)
  To: Zsh Users

    Hi all :)

    Some time ago (zsh-workers/19128) Bart explained to me how to
sort a group of files randomly. Namely, the solution is this:

    array=(*(e:'reply=%0(l..$RANDOM)$REPLY':))
    array=(${(%)array})

    Just as a side note, this works because of that %0... construct,
which is a prompt escape sequence, namely a conditional that says "if
at least 0 characters have been printed, print nothing, else print
the $RANDOM expansion". Obviously the condition is always true, so the
random number is never printed but it is used for sorting.

    Well, the problem is that the above doesn't work if you have to
use more than one pattern, because the '(e' construct will affect
only the last element. Since I don't know how many elements will be
present, I cannot use an '(e' construct on each element. Moreover it
would be very messy.

    What I want to do is to generate (in an array) a list of files
sorted randomly, given some globbing patterns. Since the list can be
quite large, I think that doing the glob on the command line is not a
good idea, so I would call the function like:

    shuffle dir1/* dir2/* ...

    The globbing will be internal, so 'shuffle' is really an alias to
"noglob 'shuffle'".

    I've tried to use '$~' in the solution above (the '%0...' one),
but it doesn't work because although files in dir1 and files in dir2
are sorted randomly, dir1 files always appear before dir2 files. It
seems that the random number doesn't affect the sorting of pathnames
:?

    Any simple way of using the above solution for this new problem
or should I try a new solution? Any simple way of doing the random
sort on a group of patterns?

    Thanks a lot in advance :)

    Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736 | http://www.dervishd.net
http://www.pleyades.net & http://www.gotesdelluna.net
It's my PC and I'll cry if I want to...



* Re: Sorting file names randomly
  2005-07-23 19:42 Sorting file names randomly DervishD
@ 2005-07-23 21:26 ` DervishD
  2005-07-24  6:44   ` Bart Schaefer
  0 siblings, 1 reply; 16+ messages in thread
From: DervishD @ 2005-07-23 21:26 UTC (permalink / raw)
  To: Zsh Users

    Hi all :) and sorry for self-replying...

 * DervishD <zsh@dervishd.net> dixit:
>     A time ago (zsh-workers/19128) Bart explained me how to sort
> randomly a group of files. Namely, the solution is this:
> 
>     array=(*(e:'reply=%0(l..$RANDOM)$REPLY':))
>     array=(${(%)array})
[...]
>     Well, the problem is that the above doesn't work if you have to
> use more than one pattern, because the '(e' construct will affect
> only the last element. Since I don't know how many elements will be
> present, I cannot use an '(e' construct on each element. Moreover it
> would be very messy.

    I've written a not very good solution that at least works:

function shuffle () {

    emulate -L zsh 
    
    setopt nullglob globdots rcexpandparam
    
    RANDOM=`date +%s`
    [[ $# -eq 0 ]] && set '*'

    reply=($~*(e:'REPLY="${(l.5..0.)RANDOM} $REPLY"':))
    reply=(${(o)reply})
    reply=(${reply/#????? /})

    print -l $reply

    return 0
}
alias shuffle="noglob 'shuffle'"

    The function returns the list in the 'reply' array parameter, and
prints it on stdout.

    If anybody can make it better/shorter, suggestions are welcome ;)
The three array assignments can probably be shortened.
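
    A minimal sketch of the same decorate, sort, strip idea as the
three assignments above. It is not the function from this message: it
sticks to syntax that zsh shares with bash, uses the external sort and
cut commands instead of the (o) flag and the pattern substitution, and
the name shuffle_dsu is made up for illustration. It assumes that no
file name contains a newline.

```shell
# Decorate each name with a fixed-width random key, sort on the key,
# then strip the key again.  Hypothetical helper, not from the thread.
shuffle_dsu() {
  local name key decorated=''
  for name in "$@"; do
    # RANDOM is read here, in the current shell, rather than inside a
    # pipeline or $(...), where a zsh subshell would start from the
    # parent's unchanged pseudo-random sequence.
    key=$(( RANDOM + 100000 ))      # always six digits: 100000..132767
    decorated="${decorated}${key} ${name}
"
  done
  # Six key digits plus one space are removed by cutting from column 8.
  printf '%s' "$decorated" | sort | cut -c8-
}
```

    Called as "shuffle_dsu dir1/* dir2/*", it prints the names one per
line. Adding 100000 plays the role of the zero padding: every key has
the same width, so the textual sort agrees with the numeric order.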

    Raúl Núñez de Arenas Coronado


* Re: Sorting file names randomly
  2005-07-23 21:26 ` DervishD
@ 2005-07-24  6:44   ` Bart Schaefer
  2005-07-24  7:39     ` DervishD
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Bart Schaefer @ 2005-07-24  6:44 UTC (permalink / raw)
  To: Zsh Users

On Jul 23,  9:42pm, DervishD wrote:
}
}     shuffle dir1/* dir2/* ...

There's no reason to noglob and alias this.  The space required to expand
the glob on the command line is no worse than what you're doing inside
the function anyway, and there aren't argument-size limits on calls to
shell functions, only on external commands.

}     I've tried to use '$~' in the solution above (the '%0...' one),
} but it doesn't work because although files in dir1 and files in dir2
} are sorted randomly, dir1 files appear always before dir2 files. It
} seems that the random number doesn't affect the sorting of pathnames

Right, (e:..:) is applied to the base file name, within each directory,
not to the entire string being globbed.

}     Any simple way of using the above solution for this new problem

Not really; glob qualifiers aren't going to do it for you.

} Any simple way of doing the random sort on a group of patterns?

You'll have to first expand them and then sort the resulting array.

On Jul 23, 11:26pm, DervishD wrote:
}
}     The function returns the list in the 'reply' array parameter, and
} prints it on stdout.
} 
}     If anybody can make it better/shorter, suggestions are welcome ;)

The following won't work in versions of zsh that lack the += assignment:

    function shuffle {
      emulate -L zsh
      integer i
      reply=()
      # set -- $~*   # uncomment to use with noglob alias
      for ((i=1; i <= $#; ++i)) { reply[i*RANDOM/32768+1]+=($argv[i]) }
      shift reply
      print -l $reply
    }

The use of array[index]+=(list) means we can insert stuff into the middle
of the array without replacing the stuff that's there.  This has the side
effect that array[1] is always empty (because we always append things
after it), which is why the shift is needed.

So this is a true shuffle; for each "card" $argv[i], we insert it into
the reply "deck" at a random position among the previous i-1 cards.
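
The card-and-deck description can also be sketched with plain array
slicing.  This is a rewrite under assumptions, not the function above:
it avoids the array[index]+=(list) form so that it also runs in shells
that lack it, and the names insertion_shuffle, deck and slot are
hypothetical.

```shell
# Insertion shuffle: card number `count` goes into one of the `count`
# slots around the cards already in the deck (slot 0 is the front).
insertion_shuffle() {
  (( $# )) || return 0              # nothing to shuffle, print nothing
  local card slot count=0
  local -a deck
  deck=()
  for card in "$@"; do
    count=$(( count + 1 ))
    # RANDOM is 0..32767, so this is an integer in 0..count-1.
    slot=$(( count * RANDOM / 32768 ))
    # Rebuild the deck: first `slot` cards, the new card, then the rest.
    deck=( "${deck[@]:0:$slot}" "$card" "${deck[@]:$slot}" )
  done
  printf '%s\n' "${deck[@]}"
}
```

Unlike the += version, this copies the whole deck on every insertion,
so it trades the small memory footprint for portability.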

A more efficient way might be this:

    function shuffle {
      emulate -L zsh
      declare -A h
      local +h -Z 5 RANDOM=$SECONDS
      integer i
      # set -- $~*   # uncomment to use with noglob alias
      for ((i=1; i <= $#; ++i)) { h[$i.$RANDOM]=$argv[i] }
      reply=( $h )
      print -l $reply
    }

This creates random but unique hash keys and then retrieves the shuffled
values in one assignment; we don't care that the order of hash values is
indeterminate, because we want it to be random!  The local RANDOM is
there to force it to be zero-padded to 5 places, so all the hash keys
are the same length; probably not essential.

(Incidentally, I didn't test this, but I'll bet that "seeding" a local
RANDOM like that ruins the repeatable sequence of the global RANDOM.)

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com

Zsh: http://www.zsh.org | PHPerl Project: http://phperl.sourceforge.net   



* Re: Sorting file names randomly
  2005-07-24  6:44   ` Bart Schaefer
@ 2005-07-24  7:39     ` DervishD
  2005-07-24  8:37     ` DervishD
  2007-11-19  4:21     ` Clint Adams
  2 siblings, 0 replies; 16+ messages in thread
From: DervishD @ 2005-07-24  7:39 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

    Hi Bart :)

 * Bart Schaefer <schaefer@brasslantern.com> dixit:
> On Jul 23,  9:42pm, DervishD wrote:
> }     shuffle dir1/* dir2/* ...
> There's no reason to noglob and alias this.  The space required to
> expand the glob on the command line is no worse than what you're
> doing inside the function anyway, and there aren't argument-size
> limits on calls to shell functions, only on external commands.

    I thought that command line size limits applied to shell
functions too and I wanted to avoid exceeding them. The expansion
inside the shell function is done in an assignment, not a call, so I
assumed that size constraints didn't apply.

    Thanks for the explanation, because I find it very useful to be
able to do the glob on the command line, just in case I want to make
sure about what is being generated :)

> }     Any simple way of using the above solution for this new problem
> Not really; glob qualifiers aren't going to do it for you.

    OK. Thanks.
 
> } Any simple way of doing the random sort on a group of patterns?
> You'll have to first expand them and then sort the resulting array.

    That's, more or less, what I'm doing now in my function.
 
> }     The function returns the list in the 'reply' array parameter, and
> } prints it on stdout.
> }     If anybody can make it better/shorter, suggestions are welcome ;)
> The following won't work in versions of zsh that lack the += assignment:

    Mine has, and I don't want this function to be portable, my only
aim is to make it work in my box ;)
 
>     function shuffle {
>       emulate -L zsh
>       integer i
>       reply=()
>       # set -- $~*   # uncomment to use with noglob alias
>       for ((i=1; i <= $#; ++i)) { reply[i*RANDOM/32768+1]+=($argv[i]) }
>       shift reply
>       print -l $reply
>     }

    Why is it better than my function? Apart from the fact that I
don't fully understand it ;) I don't see any advantage. What am I
missing here? Maybe this is faster than doing the '(e' thing?
 
> So this is a true shuffle; for each "card" $argv[i], we insert it into
> the reply "deck" at a random position among the previous i-1 cards.

    Yes, I now see how it works.
 
> A more efficient way might be this:
> 
>     function shuffle {
>       emulate -L zsh
>       declare -A h
>       local +h -Z 5 RANDOM=$SECONDS
>       integer i
>       # set -- $~*   # uncomment to use with noglob alias
>       for ((i=1; i <= $#; ++i)) { h[$i.$RANDOM]=$argv[i] }
>       reply=( $h )
>       print -l $reply
>     }
> 
> This creates random but unique hash keys and then retrieves the shuffled
> values in one assignment; we don't care that the order of hash values is
> indeterminate, because we want it to be random!  The local RANDOM is
> there to force it to be zero-padded to 5 places, so all the hash keys
> are the same length; probably not essential.

    I understand this one better ;) That's another solution I thought
of, but I assumed that if I used associative arrays, the order of
elements would be the order in which they were inserted (which it is
not, as I've discovered right now). So my only 'safe bet' was to use
a normal array with $RANDOM as the index, but that had another
problem: if I shuffle three or four lines, I will have an array of up
to 2^15 items, mostly empty.

    Anyway, the ordering of elements in an associative array is not
very random if $RANDOM is not included in the key, and I don't
understand it :?? How are associative array elements ordered?
Randomly? First-added is first?

> (Incidentally, I didn't test this, but I'll bet that "seeding" a local
> RANDOM like that ruins the repeatable sequence of the global RANDOM.)

    Probably, but I don't care.

    Thanks a lot for your help :))) Since speed is not a problem,
I'll try the first solution, although mine is fast enough for my
needs (and in a month I will probably still remember how it works ;)).
Thanks again.

    Raúl Núñez de Arenas Coronado


* Re: Sorting file names randomly
  2005-07-24  6:44   ` Bart Schaefer
  2005-07-24  7:39     ` DervishD
@ 2005-07-24  8:37     ` DervishD
  2005-07-24  8:40       ` DervishD
  2007-11-19  4:21     ` Clint Adams
  2 siblings, 1 reply; 16+ messages in thread
From: DervishD @ 2005-07-24  8:37 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

    Hi Bart :)

 * Bart Schaefer <schaefer@brasslantern.com> dixit:
> On Jul 23,  9:42pm, DervishD wrote:
> }
> }     shuffle dir1/* dir2/* ...
> There's no reason to noglob and alias this.  The space required to
> expand the glob on the command line is no worse than what you're
> doing inside the function anyway, and there aren't argument-size
> limits on calls to shell functions, only on external commands.

    How about this?:

function shuffle () {

    emulate -L zsh 
    
    setopt nullglob globdots rcexpandparam
    
    RANDOM=`date +%s`
    [[ $# -eq 0 ]] && set '*'

    reply=()
    reply=($*)
    reply=($reply(e:'REPLY="${(l.5..0.)RANDOM} $REPLY"':))
    reply=(${(o)reply})
    reply=(${reply/#????? /})

    print -l $reply

    return 0
}

    It does the globbing outside, and shuffles correctly. Any way of
making the 'reply' assignments shorter? Should I go definitely for a
'for' loop and an associative array?

    Thanks a lot :)

    Raúl Núñez de Arenas Coronado


* Re: Sorting file names randomly
  2005-07-24  8:37     ` DervishD
@ 2005-07-24  8:40       ` DervishD
  2005-07-24 10:32         ` Bart Schaefer
  0 siblings, 1 reply; 16+ messages in thread
From: DervishD @ 2005-07-24  8:40 UTC (permalink / raw)
  To: Bart Schaefer, Zsh Users

    Hi :)

 * DervishD <zsh@dervishd.net> dixit:
>     How about this?:
> 
> function shuffle () {
> 
>     emulate -L zsh 
>     
>     setopt nullglob globdots rcexpandparam
>     
>     RANDOM=`date +%s`
>     [[ $# -eq 0 ]] && set '*'
> 
>     reply=()

    Of course this is not needed, I left it in place from a
modification I did to test.

>     reply=($*)
>     reply=($reply(e:'REPLY="${(l.5..0.)RANDOM} $REPLY"':))
>     reply=(${(o)reply})

    How could I avoid doing this? I cannot put the 'o' in the
assignment above this one because it doesn't work (it seems to sort
*before* applying the 'e' glob qualifier).

>     reply=(${reply/#????? /})
> 
>     print -l $reply
> 
>     return 0
> }

    Raúl Núñez de Arenas Coronado


* Re: Sorting file names randomly
  2005-07-24  8:40       ` DervishD
@ 2005-07-24 10:32         ` Bart Schaefer
  2005-07-25  6:47           ` Bart Schaefer
                             ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Bart Schaefer @ 2005-07-24 10:32 UTC (permalink / raw)
  To: Zsh Users

On Jul 24,  9:39am, DervishD wrote:
}
} >       for ((i=1; i <= $#; ++i)) { reply[i*RANDOM/32768+1]+=($argv[i]) }
} 
}     Why is it better than my function?

It's shorter (which is one of the things you asked for), and it only
does array processing rather than building up and tearing down strings.

} Maybe this is faster than doing the '(e' thing?

More on this below.

} >       for ((i=1; i <= $#; ++i)) { h[$i.$RANDOM]=$argv[i] }
} 
}     I understand this one better ;) That's another solution I thought
} of, but I assumed that if I used associative arrays, the order of
} elements would be the order in which they were inserted (which is
} not, I've discovered right now).
} 
}     Anyway, the ordering of elements in an associative array is not
} very random if $RANDOM is not included in the key, and I don't
} understand it :?? How are associative arrays elements sorted?

Are you familiar with the concept of hash tables?  That's how nearly
all languages that have associative arrays implement them, and in
many cases (e.g. Perl) they're even called "hashes" by the language.

You can think of the elements as being ordered by a set of numbers
computed by applying a simple algorithm to the ascii values of the
characters in the key strings.  It's actually a bit more complex than
that, but close enough.
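
As a toy illustration of that idea only: the function below is an
invented hash, not the one zsh uses, and the name toy_hash, the
multiplier 31 and the table size 1024 are all assumptions.  It shows
that the number derived from a key depends on its characters, not on
when the key was inserted.

```shell
# Toy string hash: fold the character codes of the key into one number.
# NOT zsh's real hash function; purely illustrative.
toy_hash() {
  local key="$1" h=0 i=0 code
  while (( i < ${#key} )); do
    # A leading quote makes printf print the character's numeric code.
    code=$(printf '%d' "'${key:$i:1}")
    h=$(( (h * 31 + code) % 1024 ))
    i=$(( i + 1 ))
  done
  printf '%s\n' "$h"
}
```

With this toy function, "toy_hash a" gives 97 and "toy_hash ab" gives
33, so a table walked in hash order would list "ab" before "a" no
matter which one was stored first.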

On Jul 24, 10:37am, DervishD wrote:
}
}     How about this?:

Well ...

} function shuffle () {
} 
}     setopt nullglob globdots rcexpandparam
}     
}     reply=()
}     reply=($*)

Don't you mean $~* there?  Otherwise you have the problem with
multiple directories that you alluded to once before.

}     reply=($reply(e:'REPLY="${(l.5..0.)RANDOM} $REPLY"':))

This is wasteful in a number of ways.

First, the (l.5..0.) is just left-zero-padding $RANDOM, so rather than
force the shell to parse that and work out what to do once for every
file name, it would be better to declare "local +h -Z 5 RANDOM" as I
did.  (Just remember to seed RANDOM when making it local.)

Second, by using a glob qualifier, you're forcing the shell to stat()
every file name a second time, after it has already been done once when
reply=($~*) is assigned [assuming $~* is what you meant].

Third, you're doing string concatenation, adding six bytes for each
file name.  If you're worried about exceeding argument limits, you
ought to be worried about how much extra memory that eats.

Fourth, you've eventually got to do this ...

}     reply=(${reply/#????? /})

... which has to copy every string in order to pattern-match it and
chop it up before assigning it back again, so you're roughly doubling
the memory needed right there, possibly as much as tripling it if I
recall correctly how array assignments are performed.

My hash solution isn't very much less memory intensive (if you skipped
the final assignment to the reply array and just printed the values it
would be better); but the += version is about as small a footprint as
you're going to get, because inserting array slices only copies the new
elements being inserted (everything else is moving of pointers to the
existing elements).

}     print -l $reply
} 
}     return 0

Unless you expect "print" to fail, the "return 0" is redundant.

}     It does the globbing outside, and shuffles correctly. Any way of
} making the 'reply' assignments shorter?

You can eliminate the glob qualifier by using the "eval" trick I posted
on a different thread (placement of quotes and spaces is important):

eval 'reply=(' '$RANDOM '${(q)reply} ')'

However, you can't do fewer than three assignments.

On Jul 24, 10:40am, DervishD wrote:
} 
} >     reply=($*)
} >     reply=($reply(e:'REPLY="${(l.5..0.)RANDOM} $REPLY"':))
} >     reply=(${(o)reply})
} 
}     How could I avoid doing this? I cannot put the 'o' in the
} assignment above this one because it doesn't work, it seems to sort
} *before* applying the 'e' glob modifier).

Obviously the glob applies after any sorting in that second assignment.
The glob qualifier is being applied to the strings that *result* from
the parameter expansion; that's why you need the rcexpandparam option.



* Re: Sorting file names randomly
  2005-07-24 10:32         ` Bart Schaefer
@ 2005-07-25  6:47           ` Bart Schaefer
  2005-07-25 13:15           ` DervishD
  2005-07-25 13:27           ` DervishD
  2 siblings, 0 replies; 16+ messages in thread
From: Bart Schaefer @ 2005-07-25  6:47 UTC (permalink / raw)
  To: Zsh Users

On Jul 24, 10:32am, Bart Schaefer wrote:
}
} }     reply=()
} }     reply=($*)
} 
} Don't you mean $~* there?  Otherwise you have the problem with
} multiple directories that you alluded to once before.

I later realized that I was misled by the

    [[ $# -eq 0 ]] && set '*'

into thinking you were still working with the noglob alias around
the function.  If the glob is already expanded before the function
call, please ignore what I said about $~*.

In either case, though, the reply=() line is unnecessary.



* Re: Sorting file names randomly
  2005-07-24 10:32         ` Bart Schaefer
  2005-07-25  6:47           ` Bart Schaefer
@ 2005-07-25 13:15           ` DervishD
  2005-07-25 13:27           ` DervishD
  2 siblings, 0 replies; 16+ messages in thread
From: DervishD @ 2005-07-25 13:15 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

    Hi Bart :)

 * Bart Schaefer <schaefer@brasslantern.com> dixit:
> On Jul 24,  9:39am, DervishD wrote:
> } >       for ((i=1; i <= $#; ++i)) { reply[i*RANDOM/32768+1]+=($argv[i]) }
> }     Why is it better than my function?
> It's shorter (which is one of the things you asked for), and it only
> does array processing rather than building up and tearing down strings.

    Which is much slower.
 
> }     Anyway, the ordering of elements in an associative array is not
> } very random if $RANDOM is not included in the key, and I don't
> } understand it :?? How are associative arrays elements sorted?
> Are you familiar with the concept of hash tables?  That's how nearly
> all languages that have associative arrays, implement them, and in
> many cases (e.g. Perl) they're even called "hashes" by the language.

    I haven't taken a look at the zsh sources (well, I've done it at
some points, but never a general look), so I didn't assume you were
using hash tables for associative arrays. Thanks for the explanation :)

> } function shuffle () {
> } 
> }     setopt nullglob globdots rcexpandparam
> }     
> }     reply=()
> }     reply=($*)
> Don't you mean $~* there?  Otherwise you have the problem with
> multiple directories that you alluded to once before.

    No, I've got rid of the noglob thing, thanks to your idea :)
 
> }     reply=($reply(e:'REPLY="${(l.5..0.)RANDOM} $REPLY"':))
> This is wasteful in a number of ways.

    OK, let's see :)
 
> First, the (l.5..0.) is just left-zero-padding $RANDOM, so rather than
> force the shell parse that and work out what to do once for every file
> name, it would be better to declare "local +h -Z 5 RANDOM" as I did.
> (Just remember to seed RANDOM when making it local.)

    Mmm, I didn't know you could make a predefined shell parameter
(like RANDOM is) 'local', so I didn't think of the -Z thing. But
thanks for illustrating this, because it is VERY useful :)))

    Just to make sure: you can do whatever you want with 'typeset'
on a predefined shell parameter just like you would do with your own
parameters, right? Any important limitation?

> Second, by using a glob qualifier, you're forcing the shell to stat()
> every file name second time, after it has already been done once when
> reply=($~*) is assigned [assuming $~* is what you meant].

    Oh, crap, I didn't think about this either. Obviously a glob
qualifier HAS to stat the file name to see if it is a regular file,
directory, has N links, and whatever other tests you want to carry
out :(

> Third, you're doing string concatenation, adding six bytes for each
> file name.  If you're worried about exceeding argument limits, you
> ought to be worried about how much extra memory that eats.

    Exceeding argument limits is one thing, because no matter how
many resources you have, if the command line size limit is 256k,
that's all you're going to get. OTOH, memory usage is not an issue:
the script does not run at arbitrary times, and if the memory in the
machine is stressed, I would probably not run the script (or shell
function, or whatever).

    That's the reason I was using memory freely inside the shell
function, I was not bothered by resource usage.

> Fourth, you've eventually got to do this ...
> }     reply=(${reply/#????? /})
> ... which has to copy every string in order to pattern-match it and
> chop it up before assigning it back again, so you're roughly
> doubling the memory needed right there, possibly as much as
> tripling it if I recall correctly how array assignments are
> performed.

    Here I assumed that the array was processed one element at a time
so I didn't consider that the memory usage doubled. Cool :)))
 
> My hash solution isn't very much less memory intensive (if you skipped
> the final assignment to the reply array and just printed the values it
> would be better); but the += version is about as small a footprint as
> you're going to get, because inserting array slices only copies the new
> elements being inserted (everything else is moving of pointers to the
> existing elements).

    Cool!. I'm going to use your += solution, thanks a lot :)
 
> }     print -l $reply
> } 
> }     return 0
> Unless you expect "print" to fail, the "return 0" is redundant.

    I know, but I use a template for shell functions and shell
scripts, and it always does an 'emulate -L zsh' at the beginning and
a 'return 0' at the end O:)
 
> } >     reply=($*)
> } >     reply=($reply(e:'REPLY="${(l.5..0.)RANDOM} $REPLY"':))
> } >     reply=(${(o)reply})
> }     How could I avoid doing this? I cannot put the 'o' in the
> } assignment above this one because it doesn't work, it seems to sort
> } *before* applying the 'e' glob modifier).
> Obviously the glob applies after any sorting in that second assignment.

    That wasn't obvious to me. I probably assumed left-to-right
processing, unconsciously.

    Bart, thanks a lot for your examples, but LOTS of thanks for
your explanations. Really, you've taught me a lot about shell
scripting, not only in this message, but over almost four years on
this mailing list. My 'mobs' project wouldn't have been possible
without your help and your kindness when explaining things. I really
owe you a lot.

    Raúl Núñez de Arenas Coronado


* Re: Sorting file names randomly
  2005-07-24 10:32         ` Bart Schaefer
  2005-07-25  6:47           ` Bart Schaefer
  2005-07-25 13:15           ` DervishD
@ 2005-07-25 13:27           ` DervishD
  2005-07-25 17:46             ` Bart Schaefer
  2 siblings, 1 reply; 16+ messages in thread
From: DervishD @ 2005-07-25 13:27 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

    Hi Bart :)

    Just a clarification...

 * Bart Schaefer <schaefer@brasslantern.com> dixit:
> First, the (l.5..0.) is just left-zero-padding $RANDOM, so rather than
> force the shell parse that and work out what to do once for every file
> name, it would be better to declare "local +h -Z 5 RANDOM" as I did.
> (Just remember to seed RANDOM when making it local.)

    When I asked you about special shell parameters in my last
message, I was talking exactly about the '+h' option to typeset. I
just missed the '+h' in the example; I didn't remember about 'hiding'.

    Anyway my question is still valid: can you do any valid
modification (not only -Z, but -L, -R, etc., -i, -E, -F...) to them
or does every special parameter have its own constraints?

    Thanks :)

    Raúl Núñez de Arenas Coronado


* Re: Sorting file names randomly
  2005-07-25 13:27           ` DervishD
@ 2005-07-25 17:46             ` Bart Schaefer
  2005-07-25 18:10               ` DervishD
  0 siblings, 1 reply; 16+ messages in thread
From: Bart Schaefer @ 2005-07-25 17:46 UTC (permalink / raw)
  To: Zsh Users

On Jul 25,  3:15pm, DervishD wrote:
}
} > it would be better to declare "local +h -Z 5 RANDOM"
} > (Just remember to seed RANDOM when making it local.)
} 
}     Just to make sure: you can do whatever thing you want with
} 'typeset' on a predefined shell parameter just like you would do with
} your own parameters, right? Any important limitation?

[From the follow-up message]
}     Anyway my question is still valid: can you do any valid
} modification (not only -Z, but -L, -R, etc., -i, -E, -F...) to them
} or every special parameter has its own constraints?

There are no constraints if you use -h to hide the special meaning.

When using +h, the constraints imposed by the predefined type of each
parameter will apply.  For example, you can't turn RANDOM into a float
or an array, and specifying zero-padding for USERNAME will instead pad
with spaces because you can't change the string to an integer.

} > Fourth, you've eventually got to do this ...
} > }     reply=(${reply/#????? /})
} 
}     Here I assumed that the array was processed one element at a time
} so I didn't consider that the memory usage doubled. Cool :)))

The ${reply/#????? /} part is processed one element at a time, but the
entire right-side of the assignment expression has to be assembled from
the results of the substitution before the actual assignment is done.

}     Bart, thanks a lot for your examples, but LOT'S of thanks for
} your explanations.

Por nada.  Just keep asking interesting questions.



* Re: Sorting file names randomly
  2005-07-25 17:46             ` Bart Schaefer
@ 2005-07-25 18:10               ` DervishD
  0 siblings, 0 replies; 16+ messages in thread
From: DervishD @ 2005-07-25 18:10 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

    Hi Bart :)

 * Bart Schaefer <schaefer@brasslantern.com> dixit:
> When using +h, the constraints imposed by the predefined type of each
> parameter will apply.  For example, you can't turn RANDOM into a float
> or an array, and specifying zero-padding for USERNAME will instead pad
> with spaces because you can't change the string to an integer.

    OK, that makes perfect sense.
 
> } > Fourth, you've eventually got to do this ...
> } > }     reply=(${reply/#????? /})
> }     Here I assumed that the array was processed one element at a time
> } so I didn't consider that the memory usage doubled. Cool :)))
> The ${reply/#????? /} part is processed one element at a time, but the
> entire right-side of the assignment expression has to be assembled from
> the results of the substitution before the actual assignment is done.

    That looks logical, too, because zsh is not smart enough (yet) to
know that the source and destination is the same array.
 
> }     Bart, thanks a lot for your examples, but LOT'S of thanks for
> } your explanations.
> Por nada.  Just keep asking interesting questions.

    The correct sentence for "you're welcome" is "de nada", BTW, but
thanks for answering in Spanish :) And since I'm always getting into
problems I cannot solve by myself, I'm sure that I'll find my way
into many 'interesting' (read: buggering) problems ;) Thanks a lot,
really.

    Raúl Núñez de Arenas Coronado


* Re: Sorting file names randomly
  2005-07-24  6:44   ` Bart Schaefer
  2005-07-24  7:39     ` DervishD
  2005-07-24  8:37     ` DervishD
@ 2007-11-19  4:21     ` Clint Adams
  2007-11-19  8:57       ` Bart Schaefer
  2 siblings, 1 reply; 16+ messages in thread
From: Clint Adams @ 2007-11-19  4:21 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

On Sun, Jul 24, 2005 at 06:44:15AM +0000, Bart Schaefer wrote:
> A more efficient way might be this:
> 
>     function shuffle {
>       emulate -L zsh
>       declare -A h
>       local +h -Z 5 RANDOM=$SECONDS
>       integer i
>       # set -- $~*   # uncomment to use with noglob alias
>       for ((i=1; i <= $#; ++i)) { h[$i.$RANDOM]=$argv[i] }
>       reply=( $h )
>       print -l $reply
>     }

Is there any chance that prepending "$i." and hashing it will
decrease the randomness of the shuffle significantly?



* Re: Sorting file names randomly
  2007-11-19  4:21     ` Clint Adams
@ 2007-11-19  8:57       ` Bart Schaefer
  2007-11-19  9:08         ` Bart Schaefer
  2007-11-19 11:44         ` Clint Adams
  0 siblings, 2 replies; 16+ messages in thread
From: Bart Schaefer @ 2007-11-19  8:57 UTC (permalink / raw)
  To: Zsh Users

On Nov 18, 11:21pm, Clint Adams wrote:
} Subject: Re: Sorting file names randomly
}
} On Sun, Jul 24, 2005 at 06:44:15AM +0000, Bart Schaefer wrote:
} >       for ((i=1; i <= $#; ++i)) { h[$i.$RANDOM]=$argv[i] }
} 
} Is there any chance that prepending "$i." and hashing it will
} decrease the randomness of the shuffle significantly?

I no longer remember why I didn't just use h[$RANDOM] -- it may have
been a typo.  Looking back at the part of my message that you trimmed,
I said:

: The local RANDOM is there to force it to be zero-padded to 5 places,
: so all the hash keys are the same length; probably not essential.

But $i is not padded, so if that's prepended the hash keys aren't all
the same length any more, which is why I wonder whether it's meant to
be there at all.

However, I suspect the randomness might be reduced for large numbers
of arguments whether or not $i is prepended, because within each hash
bucket the values are in a list in the order they were added to that
bucket.



* Re: Sorting file names randomly
  2007-11-19  8:57       ` Bart Schaefer
@ 2007-11-19  9:08         ` Bart Schaefer
  2007-11-19 11:44         ` Clint Adams
  1 sibling, 0 replies; 16+ messages in thread
From: Bart Schaefer @ 2007-11-19  9:08 UTC (permalink / raw)
  To: Zsh Users

On Nov 19, 12:57am, Bart Schaefer wrote:
}
} } Is there any chance that prepending "$i." and hashing it will
} } decrease the randomness of the shuffle significantly?
} 
} I no longer remember why I didn't just use h[$RANDOM]

Oh, duh ... it's to assure that the keys are unique.  That's what I get
for trying to answer email last thing before I go to bed.



* Re: Sorting file names randomly
  2007-11-19  8:57       ` Bart Schaefer
  2007-11-19  9:08         ` Bart Schaefer
@ 2007-11-19 11:44         ` Clint Adams
  1 sibling, 0 replies; 16+ messages in thread
From: Clint Adams @ 2007-11-19 11:44 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

On Mon, Nov 19, 2007 at 12:57:18AM -0800, Bart Schaefer wrote:
> However, I suspect the randomness might be reduced for large numbers
> of arguments whether or not $i is prepended, because within each hash
> bucket the values are in a list in the order they were added to that
> bucket.

I'm dealing with 50,000 elements right now, so hash collisions do not
seem to be an issue.  (Simulating both hasher() and the Jenkins
One-at-a-time hash I get $#elements unique buckets).

I should probably actually try to measure the randomness of these runs
or give up.
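
One crude way to start measuring, sketched under assumptions: shuffle
the same small list many times and count how often each item lands in
the first position.  For a fair shuffle the counts should come out
roughly equal.  The names shuffle_into_deck, first_slot_counts and deck
are hypothetical, and the shuffle used here is a plain insertion
shuffle standing in for whichever function is actually under test.

```shell
# Insertion shuffle that leaves its result in the global array `deck`,
# so that every RANDOM draw happens in the current shell.
shuffle_into_deck() {
  local card slot count=0
  deck=()
  for card in "$@"; do
    count=$(( count + 1 ))
    slot=$(( count * RANDOM / 32768 ))      # integer in 0..count-1
    deck=( "${deck[@]:0:$slot}" "$card" "${deck[@]:$slot}" )
  done
}

# first_slot_counts RUNS ITEM...
# Shuffle ITEMs RUNS times and tally which item came out first.
first_slot_counts() {
  local runs="$1" n=0 firsts=''
  shift
  while (( n < runs )); do
    shuffle_into_deck "$@"
    firsts="${firsts}${deck[@]:0:1}
"
    n=$(( n + 1 ))
  done
  # The only pipeline runs after all the random draws are finished.
  printf '%s' "$firsts" | sort | uniq -c
}
```

For example, "first_slot_counts 3000 a b c" prints one count per item.
This looks at a single position only, so it can reveal a gross bias
but cannot show that a shuffle is uniform.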

