zsh-users
* Re: Emulating 'locate'
       [not found] ` <1031002023639.ZM22046@candle.brasslantern.com>
@ 2003-10-02  8:03   ` DervishD
  2003-10-02 14:29     ` Bart Schaefer
  2003-10-03 16:22     ` Lloyd Zusman
  0 siblings, 2 replies; 18+ messages in thread
From: DervishD @ 2003-10-02  8:03 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

    Hi Bart :)

 * Bart Schaefer <schaefer@brasslantern.com> dixit:
> [We should just go off and have our own little mailing list.]

    Well... As I said a few weeks ago, I'm in the process of learning
Zsh, and sometimes I practice what the manual says by doing things like
these. Sometimes I succeed, sometimes I don't. I know I ask many
questions, and maybe you are the only one able to answer them, so don't
feel guilty if you want to ignore most of them ;)))

    I know that I'm being a bore, asking zsh questions every day...

> } Well, I suppose that slashes must be
> } matched explicitly (that is what ** is for...)
> No, that is what **/ is for.

    Ok, now I got it. Anyway I used (*/)# too with the same results,
so I thought that ** was magic everywhere in the path.

> I suspect you really did
> 
>     print /**/*ir2*/*/**
>                    ^^^ Note only one star here
> when you meant
> 
>     print /**/*ir2*/**/*

    I tried both, but with the last one ir2 was not found at the end
of the path :(((
 
> But that's still not sufficient, because it requires that *ir2* be only
> an intervening directory and not the last file or directory in the path.

    Exactly ;)))

> For that you have to use brace expansion, because you can't mix **/ and
> any other form of alternation:

    Didn't know about the mixing...

>     locate() { print -l /**/*${^*}*{,/**/*} }

    Ok, it works like a charm... Thanks a lot, as always :)
 
> (You really ought to be sending these questions to -users, not -workers.)

    Sorry, I thought I was doing that already, but the mutt aliases I
use for zsh lists were all the same. I did the cut'n'paste but not
the modification, and all aliases go to zsh-workers. Sorry, this
answer goes to zsh-users. Thanks for pointing that out.

    Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736
http://www.pleyades.net & http://raul.pleyades.net/


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Emulating 'locate'
  2003-10-02  8:03   ` Emulating 'locate' DervishD
@ 2003-10-02 14:29     ` Bart Schaefer
  2003-10-02 15:53       ` DervishD
  2003-10-03 16:22     ` Lloyd Zusman
  1 sibling, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2003-10-02 14:29 UTC (permalink / raw)
  To: Zsh Users

On Oct 2, 10:03am, DervishD wrote:
}
} > For that you have to use brace expansion, because you can't mix **/ and
} > any other form of alternation:
} 
}     Didn't know about the mixing...

To be precise, you can't express "this file pattern OR this hierarchy"
with (pat1|pat2) syntax.  The alternate patterns can't contain slashes.
That leaves using brace expansion to express "this file pattern" and
"this hierarchy" as separate patterns.

Which reminds me ... instead of this:

} >     locate() { print -l /**/*${^*}*{,/**/*} }

You might want:

    locate() {
	setopt localoptions nullglob nocshnullglob
	print -l /**/*${^*}*{,/**/*}
    }

Do you see why?



* Re: Emulating 'locate'
  2003-10-02 14:29     ` Bart Schaefer
@ 2003-10-02 15:53       ` DervishD
  2003-10-02 17:08         ` Oliver Kiddle
  0 siblings, 1 reply; 18+ messages in thread
From: DervishD @ 2003-10-02 15:53 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

    Hi Bart :)

 * Bart Schaefer <schaefer@brasslantern.com> dixit:
> } >     locate() { print -l /**/*${^*}*{,/**/*} }
> You might want:
>     locate() {
>        setopt localoptions nullglob nocshnullglob
>        print -l /**/*${^*}*{,/**/*}
>     }
> Do you see why?

    Well, the localoptions is for making option changes local, so
options are restored ;) The nullglob and nocshnullglob are for making
sure that zsh won't barf if no matches are found. In fact, something
like this would be better for users:

    locate () {
        setopt localoptions nullglob nocshnullglob rcexpandparam
        local -a matches

        matches=(/**/*${*}*{,/**/*})

        if [[ -z "$matches" ]]
        then
            print "Sorry, no matches found for '$*'"
        else
            print -l $matches
        fi

    }

    Thanks for your help :))

    Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736
http://www.pleyades.net & http://raul.pleyades.net/



* Re: Emulating 'locate'
  2003-10-02 15:53       ` DervishD
@ 2003-10-02 17:08         ` Oliver Kiddle
  2003-10-02 19:27           ` DervishD
  0 siblings, 1 reply; 18+ messages in thread
From: Oliver Kiddle @ 2003-10-02 17:08 UTC (permalink / raw)
  To: DervishD; +Cc: Zsh Users

DervishD wrote:

>     Well, the localoptions is for making option changes local, so
> options are restored ;) The nullglob and nocshnullglob are for making
> sure that zsh won't barf if no matches are found. In fact, something
> like this would be better for users:
> 
>     locate () {
>         setopt localoptions nullglob nocshnullglob rcexpandparam
>         local -a matches
> 
>         matches=(/**/*${*}*{,/**/*})
> 
>         if [[ -z "$matches" ]]
>         then
>             print "Sorry, no matches found for '$*'"
>         else
>             print -l $matches
>         fi

What? So instead of zsh barfing on no matches, your script goes and
does the barfing for it: if I'm not missing the plot entirely here
you've basically just undone the effect of the nullglob options
manually.

You'd only want to do this if this locate() function is really part of
something bigger and the error message is indicating something really
is wrong. In which case you might want to send it to stderr with >&2.

There are good reasons why commands like locate and grep don't print
silly messages when nothing is found. The one useful improvement you
might make would be to return 1 if no matches are found. The zsh "no
matches found" error is a different case because the command isn't even
run: something that wouldn't otherwise be obvious.

Oliver
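
The suggestion above (stay silent and signal failure through the exit
status) can be sketched as follows. This is a minimal illustration, not
code from the thread: the name quiet_locate and its two-argument form
are made up here, and it uses find rather than the zsh glob so that it
runs unchanged in zsh or bash. find exits 0 even when nothing matches,
so the output is captured and tested.

```shell
# Hypothetical helper (not from the thread); runs in zsh or bash.
# quiet_locate DIR WORD: print paths under DIR whose name contains WORD.
# Prints nothing and returns 1 when there are no matches.
quiet_locate() {
    local dir="$1" pat="$2" out
    # find exits 0 even with no matches, so capture and test the output.
    out=$(find "$dir" -name "*${pat}*" -print 2>/dev/null)
    [ -n "$out" ] || return 1
    printf '%s\n' "$out"
}
```

The caller then decides whether a message is wanted, for example
`quiet_locate /usr/share zsh || print -u2 "nothing found"` in zsh.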



* Re: Emulating 'locate'
  2003-10-02 17:08         ` Oliver Kiddle
@ 2003-10-02 19:27           ` DervishD
  0 siblings, 0 replies; 18+ messages in thread
From: DervishD @ 2003-10-02 19:27 UTC (permalink / raw)
  To: Oliver Kiddle; +Cc: Zsh Users

    Hi Oliver :))

 * Oliver Kiddle <okiddle@yahoo.co.uk> dixit:
> >     locate () {
[...]
> What? So instead of zsh barfing on no matches, your script goes and
> does the barfing for it: if I'm not missing the plot entirely here
> you've basically just undone the effect of the nullglob options
> manually.

    Not exactly. I've replaced a message like 'zsh: no matches found:
/**/*whatever*...' with a simple 'Sorry, I didn't find whatever'. This
*is* intended, because I was writing a locate emulation for end users
(just for fun, I must admit, but for end users...). I didn't have in
mind using this function in pipelines or complex commands.

> You'd only want to do this if this locate() function is really part of
> something bigger and the error message is indicating something really
> is wrong. In which case you might want to send it to stderr with >&2.

    You are completely right.
 
> There are good reasons why commands like locate and grep don't print
> silly messages when nothing is found.

    I can think of one right now: if you use it in a command
substitution, you really don't want your file list, for example,
being 'grep: arse! cannot find your effin regex'... Yes, obviously
the least I could do with the function above is redirect the
message to stderr, my mistake O:))

    Anyway I don't use 'locate' in pipelines or command substitution,
so I didn't think about this situation. Thanks for pointing it out.

> The one useful improvement you
> might make would be to return 1 if no matches are found.

    Yes, this would be a good idea, thanks a lot :)) And I must admit
- painfully ;)) - that it would be even better than a message.

    Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736
http://www.pleyades.net & http://raul.pleyades.net/



* Re: Emulating 'locate'
  2003-10-02  8:03   ` Emulating 'locate' DervishD
  2003-10-02 14:29     ` Bart Schaefer
@ 2003-10-03 16:22     ` Lloyd Zusman
  2003-10-04 10:48       ` DervishD
  1 sibling, 1 reply; 18+ messages in thread
From: Lloyd Zusman @ 2003-10-03 16:22 UTC (permalink / raw)
  To: zsh-users

DervishD <raul@pleyades.net> writes:

>     Hi Bart :)
>
>  * Bart Schaefer <schaefer@brasslantern.com> dixit:
>> [We should just go off and have our own little mailing list.]
>
>     Well... As I said a few weeks ago, I'm in the process of learning
> Zsh, and sometimes I practice what the manual says by doing things like
> these. Sometimes I succeed, sometimes I don't. I know I ask many
> questions, and maybe you are the only one able to answer them, so don't
> feel guilty if you want to ignore most of them ;)))
>
> [ ... ]
>
>
>>     locate() { print -l /**/*${^*}*{,/**/*} }
>
>     Ok, it works like a charm... Thanks a lot, as always :)

I might have missed something about this in the first part of the thread
a couple weeks ago (those messages have already expired on my system),
but in case it wasn't mentioned before, I want to point out that this
function is _extremely_ slow in comparison to the standard 'locate'
command.  It traverses through every accessible item on every accessible
file system in order to check for a match.  On my server, it's literally
thousands of times slower than using the standard 'locate'.

I'm not sure how it compares to this:

  locate() { find / -name "*${^*}*" -print }

... but it certainly has more or less the same order of magnitude
of slowness.

Figuring this out is a very good learning experience for zsh.  However,
I would not recommend installing this function for everyday use on a
reasonably sized system.


> [ ... ]

-- 
 Lloyd Zusman
 ljz@asfast.com



* Re: Emulating 'locate'
  2003-10-03 16:22     ` Lloyd Zusman
@ 2003-10-04 10:48       ` DervishD
  2003-10-04 13:48         ` Lloyd Zusman
  0 siblings, 1 reply; 18+ messages in thread
From: DervishD @ 2003-10-04 10:48 UTC (permalink / raw)
  To: Lloyd Zusman; +Cc: zsh-users

    Hi Lloyd :)

 * Lloyd Zusman <ljz@asfast.com> dixit:
> >>     locate() { print -l /**/*${^*}*{,/**/*} }
> >     Ok, it works like a charm... Thanks a lot, as always :)
> I might have missed something about this in the first part of the thread
> a couple weeks ago (those messages have already expired on my system),
> but in case it wasn't mentioned before, I want to point out that this
> function is _extremely_ slow in comparison to the standard 'locate'
> command.  It traverses through every accessible item on every accessible
> file system in order to check for a match.  On my server, it's literally
> thousands of times slower than using the standard 'locate'.

    Obviously: locate uses a database of names for doing the
'location'. Moreover, I don't know exactly if locate is faster than
doing a grep in the same database (uncompressed, of course... The
locate database is front-compressed, see find manual for details).

    The 'locate' command doesn't do any magic for being fast: the
price it pays is the need of a database, that may be outdated (so you
will miss files, or find nonexistent ones...). If you want reliable
results you have two options:

    - Use the zsh version, or a version with 'find'.
    - Update the database regularly. Very regularly, in fact. If files
are created and destroyed frequently, you will have to update the
database continuously... On the average system, anyway, this is not an
issue, especially if you look for files that reside on 'stable' parts
of the system.
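
The trade-off described above (a fast lookup against a list of names
that can go stale) can be illustrated with a toy stand-in for the locate
database. This is only a sketch under stated assumptions: mkdb and
dblocate are names invented here, the "database" is an uncompressed text
file with one path per line, and the lookup is a fixed-string grep,
which is not how the real locate stores or searches its
front-compressed database.

```shell
# Hypothetical helpers (not from the thread, not part of locate);
# they run in zsh or bash. Paths containing newlines are not handled.

# mkdb DIR DBFILE: record every path under DIR, one per line.
mkdb() {
    find "$1" -print > "$2" 2>/dev/null
}

# dblocate DBFILE STRING: list recorded paths containing STRING.
# It never touches the filesystem, so results are only as fresh as
# the last mkdb run. grep returns 1 when nothing matches.
dblocate() {
    grep -F -- "$2" "$1"
}
```

A file created after the last mkdb run is missed, and a file deleted
since then is still reported, which are exactly the two failure modes
mentioned above.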
 
> I'm not sure how it compares to this:
>   locate() { find / -name "*${^*}*" -print }

    This is faster, IMHO, because AFAIK find uses a non-recursive
algorithm to recurse the hierarchy. Although I'm not sure about that
glob pattern you use, since it will be interpreted by find, not the
shell :?? The manual says you can use a shell pattern, but I'm not
sure about who interprets it. If it is find that interprets it, then
${^*} won't work as expected. Using more elaborate patterns is an
advantage of using the zsh version.

> Figuring this out is a very good learning experience for zsh. 
> However, I would not recommend installing this function for
> everyday use on a reasonably sized system.

    Of course ;))) But on small systems or when searching on a
limited set of directories, the zsh version, although slower, permits
more elaborate searches, IMHO. And it doesn't find false positives
(nonexistent files) or miss files ;) But you're right, this is more
a learning experience than a function of real use. For it to be
useful, it must be rewritten to use a database, or something like
that...

    Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736
http://www.pleyades.net & http://raul.pleyades.net/



* Re: Emulating 'locate'
  2003-10-04 10:48       ` DervishD
@ 2003-10-04 13:48         ` Lloyd Zusman
  2003-10-04 15:12           ` DervishD
  2003-10-04 16:37           ` Bart Schaefer
  0 siblings, 2 replies; 18+ messages in thread
From: Lloyd Zusman @ 2003-10-04 13:48 UTC (permalink / raw)
  To: zsh-users

DervishD <raul@pleyades.net> writes:

>     Hi Lloyd :)
>
>  * Lloyd Zusman <ljz@asfast.com> dixit:
>> >>     locate() { print -l /**/*${^*}*{,/**/*} }
>> >     Ok, it works like a charm... Thanks a lot, as always :)
>> I might have missed something about this in the first part of the thread
>> a couple weeks ago (those messages have already expired on my system),
>> but in case it wasn't mentioned before, I want to point out that this
>> function is _extremely_ slow in comparison to the standard 'locate'
>> command.  [ ... ]
>
>     Obviously: locate uses a database of names for doing the
> 'location'. Moreover, I don't know exactly if locate is faster than
> doing a grep in the same database (uncompressed, of course... The
> locate database is front-compressed, see find manual for details).
>
>     The 'locate' command doesn't do any magic for being fast: the
> price it pays is the need of a database, that may be outdated (so you
> will miss files, or find nonexistent ones...). If you want reliable
> results you have two options:
>
>     - Use the zsh version, or a version with 'find'.
>     - Update the database regularly. Very regularly, in fact. If files
> are created and destroyed frequently, you will have to update the
> database continuously... On the average system, anyway, this is not an
> issue, especially if you look for files that reside on 'stable' parts
> of the system.

Well, I generally use the 'locate' command when I want to do a global
search over my entire system.  I always am aware that it might be
out-dated, and I go back to 'find' when I want to do a search that is
up-to-the-moment accurate.  However, in that case, I target it to a
specific directory tree, and rarely, if ever recurse down from the root
directory unless I want to take a long coffee break waiting for results,
and I don't mind users screaming at me for slowing down the system.

Your locate function would be even better than it already is if you
could point it at a directory instead of having it always start at root.
That would be an interesting continuation of this exercise!


>> I'm not sure how it compares to this:
>>   locate() { find / -name "*${^*}*" -print }
>
>     This is faster, IMHO, because AFAIK find uses a non-recursive
> algorithm to recurse the hierarchy. Although I'm not sure about that
> glob pattern you use, since it will be interpreted by find, not the
> shell :?? The manual says you can use a shell pattern, but I'm not
> sure about who interprets it. If it is find that interprets it, then
> ${^*} won't work as expected. Using more elaborate patterns is an
> advantage of using the zsh version.

zsh interprets the ${^*} part and intersperses it between the other two
asterisks when the shell function is being invoked, and 'find'
interprets the result.  I think I should have left out the ^, however,
or probably only used ${1}.

I just ran a timing test, and unfortunately, 'find' fares better than
your locate function, which I named 'xlocate' on my system.  Here are
the results:

  find / -name specific-file -print   # 15 min 19 sec elapsed
  xlocate specific-file               # 28 min 40 sec elapsed

Of course, your function provides zsh's much richer set of matching
capabilities.


>> Figuring this out is a very good learning experience for zsh. 
>> However, I would not recommend installing this function for
>> everyday use on a reasonably sized system.
>
>     Of course ;))) But on small systems or when searching on a
> limited set of directories, the zsh version, although slower, permits
> more elaborate searches, IMHO. And it doesn't find false positives
> (nonexistent files) or miss files ;) But you're right, this is more
> a learning experience than a function of real use. For it to be
> useful, it must be rewritten to use a database, or something like
> that...

Well, I think that there is a way to make it quite good for everyday use
without having to go so far as to create a database: just come up with a
way to target the search from a specific directory instead of always
having to start from root.  If your shell function could take an
additional first argument, namely the directory under which to start
searching, it would be great, IMHO.  For example:

  # look under my HOME directory and find all
  # files whose names match the x*.c pattern
  locate ~ 'x*.c'

  # I know that 'lost-file-name' is located under
  # /usr/share, but I can't for the life of me
  # remember where it is
  locate /usr/share lost-file-name

  # Give me a list of every GIF, JPEG, and PNG
  # on my entire system.  I don't mind taking
  # a coffee break while waiting for the results
  locate / '(#i)*.{gif,jp{,e}g,png}'

Here's my first try at it (I call it 'xlocate' so as not to conflict
with the 'locate' command on my system):

  xlocate() {
    setopt nullglob extendedglob
    eval print -l ${argv[1]%/}'/**/'${^argv[2,-1]}'{,/**/*}'
  }

I removed the asterisks before and after the ${^argv[2,-1]} so I don't
lose the ability to do the following:

  xlocate ~ '*.c'   # only matches *.c files under HOME
  xlocate ~ c       # only matches files named 'c' under HOME


>     Raúl Núñez de Arenas Coronado

-- 
 Lloyd Zusman
 ljz@asfast.com



* Re: Emulating 'locate'
  2003-10-04 13:48         ` Lloyd Zusman
@ 2003-10-04 15:12           ` DervishD
  2003-10-04 17:05             ` Lloyd Zusman
  2003-10-04 16:37           ` Bart Schaefer
  1 sibling, 1 reply; 18+ messages in thread
From: DervishD @ 2003-10-04 15:12 UTC (permalink / raw)
  To: Lloyd Zusman; +Cc: zsh-users

    Hi Lloyd :)

 * Lloyd Zusman <ljz@asfast.com> dixit:
> >     - Update the database regularly. Very regularly, in fact. If files
> > are created and destroyed frequently, you will have to update the
> > database continuously... On the average system, anyway, this is not an
> > issue, especially if you look for files that reside on 'stable' parts
> > of the system.
> Well, I generally use the 'locate' command when I want to do a global
> search over my entire system.  I always am aware that it might be
> out-dated, and I go back to 'find' when I want to do a search that is
> up-to-the-moment accurate.

    That's a good deal, because both searches will be fast, and the
second one, the 'find' one, will be issued over a limited set of
directories. In fact I do the same, although instead of using 'find'
I do 'print -l **...', because I type it fast and the speed
difference when dealing with small hierarchies is negligible ;)

> However, in that case, I target it to a specific directory tree,
> and rarely, if ever recurse down from the root directory unless I
> want to take a long coffee break waiting for results, and I don't
> mind users screaming at me for slowing down the system.

    ;)))))))))) I see you are not a BOFH ;))) Confess it: you like
your users XDDD

> Your locate function would be even better than it already is if you
> could point it at a directory instead of having it always start at root.
> That would be an interesting continuation of this exercise!

    But that's easy. Right now I think of a solution: adding a flag
to start the search from the root, for example, or something like
that. Or easier, even, if the search term starts with a dot, then
strip that dot and do the search 'locally'. The flag is cleaner,
anyway.

> >> I'm not sure how it compares to this:
> >>   locate() { find / -name "*${^*}*" -print }
> >     This is faster, IMHO, because AFAIK find uses a non-recursive
> > algorithm to recurse the hierarchy. Although I'm not sure about that
> > glob pattern you use, since it will be interpreted by find, not the
> > shell :??
> zsh interprets the ${^*} part and intersperses it between the other two
> asterisks when the shell function is being invoked, and 'find'
> interprets the result.  I think I should have left out the ^, however,
> or probably only used ${1}.

    My fault: since the asterisk won't expand when quoted, as in the
example, my brain took a trip to a fantastic land and thought
that the asterisk in the braces wouldn't be expanded either... Anyway,
it shouldn't work because, as you suggest, you should have used just
${1}. The ^ is necessary to correctly expand multiple patterns, one
per positional parameter. No matter, really, because you can use the
shell expansion to generate many '-name' options, one per positional
parameter.

> I just ran a timing test, and unfortunately, 'find' fares better than
> your locate function, which I named 'xlocate' on my system.  Here are
> the results:
> 
>   find / -name specific-file -print   # 15 min 19 sec elapsed
>   xlocate specific-file               # 28 min 40 sec elapsed

    Ooops... Nearly doubles the time... As I said, 'find' uses a
non-recursive approach for finding files, and that is a good point.
In fact, the standard way of finding files is 'find' because it is
faster ;))) I should look at the sources for making sure it is not
recursive...

    BTW, you seem to have a *really* big set of files...
 
> > For it to be
> > useful, it must be rewritten to use a database, or something like
> > that...
> Well, I think that there is a way to make it quite good for everyday use
> without having to go so far as to create a database: just come up with a
> way to target the search from a specific directory instead of always
> having to start from root.

    Well, that's a solution, too. Usually I make my searches from my
root directory, which is why I use locate. On my box files are
created and/or deleted sparingly, so I just update the database once
a week.

> If your shell function could take an
> additional first argument, namely the directory under which to start
> searching, it would be great, IMHO.  For example:

    Thanks for the code :))

>   xlocate() {
>     setopt nullglob extendedglob
>     eval print -l ${argv[1]%/}'/**/'${^argv[2,-1]}'{,/**/*}'
>   }

    Nice! :)))
 
> I removed the asterisks before and after the ${^argv[2,-1]} so I don't
> lose the ability to do the following:
> 
>   xlocate ~ '*.c'   # only matches *.c files under HOME
>   xlocate ~ c       # only matches files named 'c' under HOME

    That's a thing I did yesterday, because I usually pass patterns
to my locate, and if I specify a filename I prefer to use that
filename as is. Thanks again for your help and suggestions :)

    Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736
http://www.pleyades.net & http://raul.pleyades.net/



* Re: Emulating 'locate'
  2003-10-04 13:48         ` Lloyd Zusman
  2003-10-04 15:12           ` DervishD
@ 2003-10-04 16:37           ` Bart Schaefer
  2003-10-04 19:33             ` Lloyd Zusman
  1 sibling, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2003-10-04 16:37 UTC (permalink / raw)
  To: zsh-users

On Oct 4,  9:48am, Lloyd Zusman wrote:
}
} >> I'm not sure how it compares to this:
} >>   locate() { find / -name "*${^*}*" -print }
} 
} zsh interprets the ${^*} part and intersperses it between the other two
} asterisks when the shell function is being invoked, and 'find'
} interprets the result.  I think I should have left out the ^, however,
} or probably only used ${1}.

Actually the caret is meaningless for $* when in double quotes.  What
you were thinking of was probably "*${^@}*".

    set "*${^@}*"
    eval find / '\(' -name "'${(j:' -o -name ':)@}'" '\)' -print
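
The same find command line can also be built without eval by collecting
the arguments in an array. A minimal sketch, assuming a shell with
arrays (zsh or bash): find_any is a name made up for this illustration,
and unlike the snippet above it takes the start directory as its first
argument instead of always using /.

```shell
# Hypothetical helper (not from the thread); runs in zsh or bash.
# find_any DIR WORD...: list paths under DIR whose last component
# contains any of the WORDs.
find_any() {
    local dir="$1" word
    local -a args
    shift
    [ "$#" -gt 0 ] || return 2      # need at least one WORD
    args=()
    for word in "$@"; do
        # Join successive tests with -o (logical OR).
        [ "${#args[@]}" -gt 0 ] && args+=(-o)
        args+=(-name "*${word}*")
    done
    find "$dir" \( "${args[@]}" \) -print
}
```

Because each pattern stays a single quoted array element, words that
contain spaces or quote characters reach find intact, which is the part
the eval version has to handle with extra quoting.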

} I just ran a timing test, and unfortunately, 'find' fares better than
} your locate function, which I named 'xlocate' on my system.  Here are
} the results:
} 
}   find / -name specific-file -print   # 15 min 19 sec elapsed
}   xlocate specific-file               # 28 min 40 sec elapsed

I think that's expected rather than unfortunate.  For one thing, that
find command will only print paths that end in names matching the
pattern, whereas xlocate descends and prints entire trees below any
directory matching the pattern.  (I'm not even sure how to express the
latter in find without resorting to -exec of another find.)  However,
zsh also does a lot of stat() calls during the glob to avoid following
symlinks and perform MARK_DIRS and so on, and it buffers up all the
results and does a duplicate-eliminating sort pass as well (in case of
a path like x/y/z/foo/a/b/c/foo/p/d/q when globbing for **/foo/*).
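
The nested-find approach mentioned in passing could look roughly like
this. It is a sketch, not something proposed in the thread: tree_locate
is an invented name, the inner find is started once per match (so it is
slow on large trees), and sort -u stands in for the duplicate
elimination that the glob performs when matching directories are nested.

```shell
# Hypothetical helper (not from the thread); runs in zsh or bash.
# tree_locate DIR WORD: list every path under DIR whose name contains
# WORD, plus everything beneath any matching directory.
tree_locate() {
    # The outer find selects matching names; the inner find prints each
    # match and, for directories, its whole subtree. Nested matches are
    # reported more than once, hence sort -u.
    find "$1" -name "*$2*" -exec find {} -print \; | sort -u
}
```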

} Well, I think that there is a way to make it quite good for everyday use
} without having to go so far as to create a database: just come up with a
} way to target the search from a specific directory [...]
} 
} Here's my first try at it (I call it 'xlocate' so as not to conflict
} with the 'locate' command on my system):
} 
}   xlocate() {
}     setopt nullglob extendedglob
}     eval print -l ${argv[1]%/}'/**/'${^argv[2,-1]}'{,/**/*}'
}   }

You should use localoptions there, and you can avoid the eval:

    xlocate() {
	setopt localoptions nullglob extendedglob
	print -l ${~argv[1]%/}/**/${~^argv[2,-1]}{,/**/*}
    }

And if you add this alias (which must come after the function def'n):

    alias xlocate='noglob xlocate'

Then you can use glob patterns without quoting, as in

    xlocate ~ *.c

} I removed the asterisks before and after the ${^argv[2,-1]} so I don't
} lose the ability to do the following:
} 
}   xlocate ~ '*.c'   # only matches *.c files under HOME
}   xlocate ~ c       # only matches files named 'c' under HOME

You could have the best of both worlds with a simple addition:

    xlocate() {
	setopt localoptions nullglob extendedglob
	# If there's only one argument, behave more like 'locate'
	((ARGC == 1)) && set / "*${(q)1}*"
	print -l ${~argv[1]%/}/**/${~^argv[2,-1]}{,/**/*}
    }

Or maybe this is better:

    xlocate() {
	setopt localoptions nullglob extendedglob
	# If the first argument is not a directory, behave like locate
	[ -d $~1 ] || set / "*${^@}*"
	print -l ${~argv[1]%/}/**/${~^argv[2,-1]}{,/**/*}
    }

(I used [ ] there for a reason: [[ ]] doesn't glob the argument of -d.)



* Re: Emulating 'locate'
  2003-10-04 15:12           ` DervishD
@ 2003-10-04 17:05             ` Lloyd Zusman
  2003-10-04 21:35               ` DervishD
  0 siblings, 1 reply; 18+ messages in thread
From: Lloyd Zusman @ 2003-10-04 17:05 UTC (permalink / raw)
  To: zsh-users

DervishD <raul@pleyades.net> writes:

>     Hi Lloyd :)
>
>  * Lloyd Zusman <ljz@asfast.com> dixit:
>>
>>
>> However, in that case, I target it to a specific directory tree,
>> and rarely, if ever recurse down from the root directory unless I
>> want to take a long coffee break waiting for results, and I don't
>> mind users screaming at me for slowing down the system.
>
>     ;)))))))))) I see you are not a BOFH ;))) Confess it: you like
> your users XDDD

Of course.  That's nothing to be ashamed of. :)


> [ ... ]
>
>>   find / -name specific-file -print   # 15 min 19 sec elapsed
>>   xlocate specific-file               # 28 min 40 sec elapsed
>
> [ ... ]
>
>     BTW, you seem to have a *really* big set of files...

Well, I manage a small system with around 150-160 users total, with
maybe 15-20 percent of them active at the moment.  Only a handful are
shell users; the rest are mostly email users; there are a small number
of web users, as well.  It doesn't take much to generate lots of files,
even in a modestly sized system such as this one.


>     Thanks for the code :))
>
>>   xlocate() {
>>     setopt nullglob extendedglob
>>     eval print -l ${argv[1]%/}'/**/'${^argv[2,-1]}'{,/**/*}'
>>   }
>
>     Nice! :)))

Well, I created a full-blown version with user help, meaningful error
messages, etc., and I put it into /etc/zshrc so it's available to all
the shell users on my system.  I call it 'zfind' ("zsh find"), and the
code is below.


> [ ... ]
>
> [ ... ] Thanks again for your help and suggestions :)
>
> [ ... ]
>
>     Raúl Núñez de Arenas Coronado

My pleasure.

Here's the code to my full-blown zfind function, right out of my
/etc/zshrc:

zfind() {

  local usage moreusage match oiplus1 verbose=1

  usage="\nusage: $0 [ -qhH ] dir pattern ...

  -h  =>  print a short version of help to stderr

  -H  =>  print a longer version of help to stdout (so you
          can easily pipe it through your favorite pager)

  -q  =>  suppress all error messages except for help and
          the message for illegal flags on the command line\n"

  moreusage="
  This command recursively searches under the directory tree 
  specified by 'dir' for any filesystem items whose names match 
  each of the 'pattern' items that are specified.

  Its usage approximates that of the 'find' command with the '-name'
  option.

  Example:

    $0 ~ '(#i)*.{gif,png,jp{,e}g}'

    Recursively lists all items under your HOME directory whose
    names have the suffix '.gif', '.png', '.jpg', or '.jpeg'.  
    In this case (due to the '(#i)' prefix), matches are done in 
    a case-insensitive manner.

  This command makes use of zsh's extended pattern matching.
  To get more information about this, do a 'man zshexpn' and 
  look under the FILENAME GENERATION section.\n"

  while getopts qhH arg
  do
    case "${arg}" in
    q)
       verbose=
       ;;
    h)
       print -u2 "${usage}"
       return 1
       ;;
    H)
       print "${usage}${moreusage}"
       return 1
       ;;
    ?)
       print -u2 "for help, invoke \"$0 -h\" or \"$0 -H\""
       return 1
       ;;
    esac
  done

  (( $# <= $OPTIND )) && {
    print -u2 "${usage}"
    return 1
  }

  setopt nullglob extendedglob

  [[ -d ${argv[$OPTIND]} ]] || {
    [[ -n "${verbose}" ]] && {
      print -u2 "$0: directory not found: ${argv[$OPTIND]}"
    }
    return 1
  }

  (( oiplus1 = $OPTIND + 1 ))
  eval match='('${argv[$OPTIND]%/}'/**/'${^argv[$oiplus1,-1]}'{,/**/*})'

  if [[ -z "${match}" ]]
  then
    [[ -n "${verbose}" ]] && {
      print -u2 "$0: not found within \"${argv[$OPTIND]}\" tree: " \
                "${^argv[$oiplus1,-1]}"
    }
    return 1
  else
    print -l ${match}
  fi
}



--
 Lloyd Zusman
 ljz@asfast.com



* Re: Emulating 'locate'
  2003-10-04 16:37           ` Bart Schaefer
@ 2003-10-04 19:33             ` Lloyd Zusman
  2003-10-04 21:29               ` DervishD
  2003-10-04 22:40               ` Bart Schaefer
  0 siblings, 2 replies; 18+ messages in thread
From: Lloyd Zusman @ 2003-10-04 19:33 UTC (permalink / raw)
  To: zsh-users

Bart Schaefer <schaefer@brasslantern.com> writes:

> On Oct 4,  9:48am, Lloyd Zusman wrote:
> }
> } >> I'm not sure how it compares to this:
> } >>   locate() { find / -name "*${^*}*" -print }
> } 
> } zsh interprets the ${^*} part and intersperses it between the other two
> } asterisks when the shell function is being invoked, and 'find'
> } interprets the result.  I think I should have left out the ^, however,
> } or probably only used ${1}.
>
> Actually the caret is meaningless for $* when in double quotes.  What
> you were thinking of was probably "*${^@}*".
>
>     set "*${^@}*"
>     eval find / '\(' -name "'${(j:' -o -name ':)@}'" '\)' -print

Well, to be honest, I wasn't thinking at all. :)

I just copied part of the argument specification from the extended-glob
version of the shell function to my quick, off the cuff version using
'find'.

Your example indeed does what I intended ... if I were to have been
thinking. :)
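
For reference, the same OR-ed list of -name tests can also be collected in
an array, which avoids the eval and the quoting that goes with it.  A
minimal sketch, assuming a find that supports \( ... \) grouping; the name
find_any and its DIR argument are hypothetical and not from this thread:

```zsh
# Hypothetical helper: find_any DIR WORD...
# Prints every entry under DIR whose name contains any of the WORDs.
find_any() {
  local dir="$1" word
  shift
  (( $# )) || return 1         # at least one word is required
  local -a tests
  tests=()
  for word in "$@"; do
    # Put -o between the tests, but not in front of the first one.
    (( ${#tests[@]} )) && tests+=(-o)
    tests+=(-name "*${word}*")
  done
  find "$dir" \( "${tests[@]}" \) -print
}
```

Called as "find_any / foo bar", this runs find with the tests
-name '*foo*' -o -name '*bar*' grouped inside one pair of parentheses.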


> } [ ... ]
> } 
> }   find / -name specific-file -print   # 15 min 19 sec elapsed
> }   xlocate specific-file               # 28 min 40 sec elapsed
>
> I think that's expected rather than unfortunate.  For one thing, that
> find command will only print paths that end in names matching the
> pattern, whereas xlocate descends and prints entire trees below any
> directory matching the pattern.  (I'm not even sure how to express the
> latter in find without resorting to -exec of another find.)  However,
> zsh also does a lot of stat() calls during the glob to avoid following
> symlinks and perform MARK_DIRS and so on, and it buffers up all the
> results and does a duplicate-eliminating sort pass as well (in case of
> a path like x/y/z/foo/a/b/c/foo/p/d/q when globbing for **/foo/*).

'find' has to do the same stat() calls as well, as it also has to
identify directories, avoid traversing symlinks, etc.  I didn't take
into consideration the fact that the zsh version keeps traversing even
when it finds a match.  However, in my timing example, I was searching
for a single file that happens to reside in the leaf of a directory
tree.  For the purpose of this test, I made sure that there was only one
instance of a file with this basename.  Therefore, 'find' and zsh both
traversed my entire file system, and both considered the same number of
items.
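
On expressing the "print whole trees below any matching directory"
behaviour in find: one possibility, offered only as a sketch, is the -path
test, assuming a find that has it (GNU and BSD find do).  In a -path pattern
"*" also matches "/", so '*WORD*/*' matches any path in which some ancestor
component contains WORD.  The name find_tree is hypothetical:

```zsh
# Hypothetical helper: find_tree DIR WORD
find_tree() {
  # -name  : entries whose own name contains WORD
  # -path  : anything below a directory whose name contains WORD
  find "$1" \( -name "*$2*" -o -path "*$2*/*" \) -print
}
```

One caveat: -path looks at the whole path, including DIR itself, so if DIR
contains WORD then everything below it matches.  The glob version treats
the starting directory literally and does not have that behaviour.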

Actually, this brings up a question: I presume that if I want the zsh
version to only look for matches on the basename of paths being tested,
much like 'find', all that's needed would be to leave off the trailing
{,/**/*} ... correct?

I re-ran my earlier xlocate timing with a version that didn't have this
trailing {,/**/*}, and I did another xlocate timing with a version that
didn't surround the trailing ${^*} with asterisks.  And for
completeness, I did another 'find' run, this time with asterisks
surrounding the file name.  Here are the results, along with a rehash of
the earlier findings, which are numbered 1 and 2, below (all of the
'print -l' versions are within the 'xlocate' function, with 'specific-file'
passed as an argument):

1. find / -name specific-file -print              15 min 19 sec elapsed
2. print -l /**/*${^*}*{,/**/*}                   28 min 40 sec elapsed
3. print -l /**/*${^*}*                           13 min 58 sec elapsed
4. print -l /**/${^*}                             14 min 09 sec elapsed
5. find / -name '*specific-file*' -print          14 min 10 sec elapsed

I did numbers 1 and 2 at roughly the same time, and numbers 3, 4, and 5
at around the same time, a couple hours later.  My system was less
loaded during the 3/4/5 tests than during the earlier ones, which
explains the somewhat lower values for all three of those elapsed times.

This is a rather non-scientific test, as you can't generalize from a
sample size of 1.  Nonetheless, the trends that are shown seem
reasonable: the zsh version is considerably slower with the trailing
{,/**/*}, and similar matching within 'find' and zsh turn out to take
similar amounts of time.

Based on this, it seems that zsh and 'find' are both maximally optimized
with regard to recursive searching ... or at least they're both optimized
equally well. :)   Therefore, I would no longer advise against using zsh
for these kinds of tasks.  And given that zsh's globbing is much more
sophisticated than find's, I would now lean towards using zsh in these
cases ... as long as you are careful about choosing matching constructs
that suit (and do not exceed) the task at hand.
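
To make the difference between variants 2 and 3 concrete, here is a small
zsh-only sketch.  The function names are hypothetical; both take a starting
directory and a word, and neither is the xlocate from this thread:

```zsh
# Hypothetical helpers: names_only DIR WORD / with_subtrees DIR WORD

# Like variant 3: only entries whose own name contains WORD.
names_only() {
  setopt localoptions nullglob
  print -l -- $1/**/*$2*
}

# Like variant 2: the same entries, plus everything beneath any
# directory whose name contains WORD.
with_subtrees() {
  setopt localoptions nullglob
  print -l -- $1/**/*$2*{,/**/*}
}
```

Given a tree containing foo.txt and src/foo/sub/file, names_only lists
foo.txt and src/foo, while with_subtrees additionally lists src/foo/sub and
src/foo/sub/file.  The second glob is the extra traversal that the longer
timing for variant 2 reflects.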


> } [ ... ]
> }
> }   xlocate() {
> }     setopt nullglob extendedglob
> }     eval print -l ${argv[1]%/}'/**/'${^argv[2,-1]}'{,/**/*}'
> }   }
>
> You should use localoptions there, and you can avoid the eval:
>
>     xlocate() {
> 	setopt localoptions nullglob extendedglob
> 	print -l ${~argv[1]%/}/**/${~^argv[2,-1]}{,/**/*}
>     }
>
> And if you add this alias (which must come after the function def'n):
>
>     alias xlocate='noglob xlocate'

Well, using this alias causes the argv indices to be off by one in the
shell function: $0 becomes 'noglob', argv[1] becomes 'xlocate', etc.
The way I handle that case in my previously posted version (with the
help text, error checking, etc.) is to put the following near the top of
the shell function, and to use ${prog} everywhere I was previously using
$0.  In the shorter version of xlocate, above, a similar thing would
also have to be done with the argv indices.

  if [[ $0 = noglob ]]
  then
    prog=${argv[1]}
    (( OPTIND = $OPTIND + 1 ))
  else
    prog=$0
  fi


> [ ... ]

-- 
 Lloyd Zusman
 ljz@asfast.com



* Re: Emulating 'locate'
  2003-10-04 19:33             ` Lloyd Zusman
@ 2003-10-04 21:29               ` DervishD
  2003-10-04 22:40               ` Bart Schaefer
  1 sibling, 0 replies; 18+ messages in thread
From: DervishD @ 2003-10-04 21:29 UTC (permalink / raw)
  To: Lloyd Zusman; +Cc: zsh-users

    Hi Lloyd :)

 * Lloyd Zusman <ljz@asfast.com> dixit:
> And given that zsh's globbing is much more
> sophisticated than find's, I would now lean towards using zsh in these
> cases ... as long as you are careful about choosing matching constructs
> that suit (and do not exceed) the task at hand.

    That's a pretty hard job... But I must admit I'm not a globbing
guru... I end up using asterisks everywhere... ;)

    Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736
http://www.pleyades.net & http://raul.pleyades.net/



* Re: Emulating 'locate'
  2003-10-04 17:05             ` Lloyd Zusman
@ 2003-10-04 21:35               ` DervishD
  0 siblings, 0 replies; 18+ messages in thread
From: DervishD @ 2003-10-04 21:35 UTC (permalink / raw)
  To: Lloyd Zusman; +Cc: zsh-users

    Hi Lloyd :)

 * Lloyd Zusman <ljz@asfast.com> dixit:
> >     ;)))))))))) I see you are not a BOFH ;))) Confess it: you like
> > your users XDDD
> Of course.  That's nothing to be ashamed of. :)

    True :)) It's quite uncommon, anyway, but it's nice to see a
sysadmin with a soft spot for his users ;)) The last time I suffered a
sysadmin, back at the university, I ended up thinking that summoning
Nyarlathotep to eat him was a pretty good idea. Fortunately I'm
not good at globbing nor at casting spells. And you don't want a
fiasco when summoning an elder one XDDDD

> >     BTW, you seem to have a *really* big set of files...
> Well, I manage a small system with around 150-160 users total

    Pretty good...

> >     Thanks for the code :))
> Well, I created a full-blown version with user help, meaningful error
> messages, etc., and I put it into /etc/zshrc so it's available to all
> the shell users on my system.  I call it 'zfind' ("zsh find"), and the
> code is below.

    Thanks a lot, truly!!!! I can't believe it: you don't abuse your
users, you comment your code... Who are you and what have you done
with the *real* sysadmin? XDDD

    Thanks for the function, I'll stea^H^H^H^Hinspire from it }:)
Seriously, I'll adopt and adapt it for my system, thanks ;)
 
    Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736
http://www.pleyades.net & http://raul.pleyades.net/



* Re: Emulating 'locate'
  2003-10-04 19:33             ` Lloyd Zusman
  2003-10-04 21:29               ` DervishD
@ 2003-10-04 22:40               ` Bart Schaefer
  2003-10-04 23:18                 ` Lloyd Zusman
  1 sibling, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2003-10-04 22:40 UTC (permalink / raw)
  To: zsh-users

On Oct 4,  3:33pm, Lloyd Zusman wrote:
}
} Based on this, it seems that zsh and 'find' are both maximally optimized
} with regard to recursive searching ... or at least they're both optimized
} equally well. :)

For certain searches, "find -depth" might actually be faster.  Zsh always
does breadth-first globbing, even when asked to sort the final results
depth-first.

} >     alias xlocate='noglob xlocate'
} 
} Well, using this alias causes the argv indices to be off by one in the
} shell function: $0 becomes 'noglob', argv[1] becomes 'xlocate', etc.

If you're seeing that, then you've accidentally created a function named
"noglob" that has the same body as "xlocate".  Try this:

	alias foo='bar foo'
	foo() { echo $0 }
	functions bar
	functions foo

Note that "foo()" is considered to be "in the command position" and thus
the alias expands and you get

	bar foo () { echo $0 }

which defines two functions, "bar" and "foo" with identical bodies.  I'd
wager that you created the alias, then changed the definition of xlocate,
and ended up with a function named "noglob".

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com

Zsh: http://www.zsh.org | PHPerl Project: http://phperl.sourceforge.net   



* Re: Emulating 'locate'
  2003-10-04 22:40               ` Bart Schaefer
@ 2003-10-04 23:18                 ` Lloyd Zusman
  2003-10-05 15:57                   ` Bart Schaefer
  0 siblings, 1 reply; 18+ messages in thread
From: Lloyd Zusman @ 2003-10-04 23:18 UTC (permalink / raw)
  To: zsh-users

Bart Schaefer <schaefer@brasslantern.com> writes:

> On Oct 4,  3:33pm, Lloyd Zusman wrote:
> }
> } [ ... ]
> }
> For certain searches, "find -depth" might actually be faster.  Zsh
> always does breadth-first globbing, even when asked to sort the final
> results depth-first.

I guess that comes into play when I want to find something that happens
to be buried deep inside of a directory tree, where the parent directories
have lots of files.


> } [ ... ]
> } 
> } Well, using this alias causes the argv indices to be off by one in the
> } shell function: $0 becomes 'noglob', argv[1] becomes 'xlocate', etc.
>
> If you're seeing that, then you've accidentally created a function named
> "noglob" that has the same body as "xlocate".  Try this:
>
> 	alias foo='bar foo'
> 	foo() { echo $0 }
> 	functions bar
> 	functions foo
>
> Note that "foo()" is considered to be "in the command position" and thus
> the alias expands and you get
>
> 	bar foo () { echo $0 }
>
> which defines two functions, "bar" and "foo" with identical bodies.  I'd
> wager that you created the alias, then changed the definition of xlocate,
> and ended up with a function named "noglob".

Yep.  That's exactly what happened.  Thank you.  I kept re-invoking
". /etc/zshrc" to test some changes to my function as I was developing
it.  The alias command "alias xlocate='noglob xlocate'" was then in
effect the next time I sourced /etc/zshrc.

Therefore, prior to the function definition, I now do this:

  { unalias xlocate; unfunction xlocate } 2>/dev/null

But besides that, would another way to prevent this problem be to always
define functions with "function foo" instead of "foo()"?


-- 
 Lloyd Zusman
 ljz@asfast.com



* Re: Emulating 'locate'
  2003-10-04 23:18                 ` Lloyd Zusman
@ 2003-10-05 15:57                   ` Bart Schaefer
  2003-10-06 13:37                     ` Lloyd Zusman
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Schaefer @ 2003-10-05 15:57 UTC (permalink / raw)
  To: zsh-users

On Oct 4,  7:18pm, Lloyd Zusman wrote:
}
} Therefore, prior to the function definition, I now do this:
} 
}   { unalias xlocate; unfunction xlocate } 2>/dev/null
} 
} But besides that, would another way to prevent this problem be to always
} define functions with "function foo" instead of "foo()"?

Yes, the latter will work.  Note, however, that in ksh the meanings of
"foo() {...}" and "function foo {...}" are not quite equivalent, and
the zsh/ksh/bash developers have been discussing standardization of
some of those kinds of details, so there's a very small chance that in
the future you won't always be able to use them interchangeably.
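
A minimal zsh sketch of the "function foo" form in this situation.  The
body of xlocate here is a made-up stand-in that just reports how it was
called:

```zsh
alias xlocate='noglob xlocate'

# "xlocate" follows the reserved word "function", so it is not in command
# position and the alias is not expanded while the definition is read.
function xlocate {
  print -r -- "$0: $*"
}
```

With this in place the definition can come after the alias, only one
function is created, and a call such as "xlocate *.c" reaches it with the
pattern unexpanded.  As far as I know, later zsh releases refuse the
"foo()" form with an error when foo is an alias, unless the ALIAS_FUNC_DEF
option is set; that postdates this thread.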



* Re: Emulating 'locate'
  2003-10-05 15:57                   ` Bart Schaefer
@ 2003-10-06 13:37                     ` Lloyd Zusman
  0 siblings, 0 replies; 18+ messages in thread
From: Lloyd Zusman @ 2003-10-06 13:37 UTC (permalink / raw)
  To: zsh-users

Bart Schaefer <schaefer@brasslantern.com> writes:

> On Oct 4,  7:18pm, Lloyd Zusman wrote:
> }
> } Therefore, prior to the function definition, I now do this:
> } 
> }   { unalias xlocate; unfunction xlocate } 2>/dev/null
> } 
> } But besides that, would another way to prevent this problem be to always
> } define functions with "function foo" instead of "foo()"?
>
> Yes, the latter will work.  Note, however, that in ksh the meanings of
> "foo() {...}" and "function foo {...}" are not quite equivalent, and
> the zsh/ksh/bash developers have been discussing standardization of
> some of those kinds of details, so there's a very small chance that in
> the future you won't always be able to use them interchangeably.

Well, then I'll just play it safe and stick with unalias/unfunction
before my function definitions.

Thanks.


-- 
 Lloyd Zusman
 ljz@asfast.com



Thread overview: 18+ messages
     [not found] <20031001221753.GA23189@DervishD>
     [not found] ` <1031002023639.ZM22046@candle.brasslantern.com>
2003-10-02  8:03   ` Emulating 'locate' DervishD
2003-10-02 14:29     ` Bart Schaefer
2003-10-02 15:53       ` DervishD
2003-10-02 17:08         ` Oliver Kiddle
2003-10-02 19:27           ` DervishD
2003-10-03 16:22     ` Lloyd Zusman
2003-10-04 10:48       ` DervishD
2003-10-04 13:48         ` Lloyd Zusman
2003-10-04 15:12           ` DervishD
2003-10-04 17:05             ` Lloyd Zusman
2003-10-04 21:35               ` DervishD
2003-10-04 16:37           ` Bart Schaefer
2003-10-04 19:33             ` Lloyd Zusman
2003-10-04 21:29               ` DervishD
2003-10-04 22:40               ` Bart Schaefer
2003-10-04 23:18                 ` Lloyd Zusman
2003-10-05 15:57                   ` Bart Schaefer
2003-10-06 13:37                     ` Lloyd Zusman
