* iterating through a hierarchy with a filter
From: Alexy Khrabrov @ 2008-04-10 2:56 UTC
To: zsh-users
Greetings -- I have a series of XML files scattered around a
hierarchy. I also have a filter script which reads stdin and writes
stdout in a standard Unix way.
I need to create a mirror hierarchy where files are results of
applying the filter to the originals, replacing the original
extension, say .xml, with the result extension, say .txt. Except for
the extension change, the hierarchy should be preserved.
Ideally, my script would not know anything about this whole process,
so I can take any filter and use it for my transforms. The hierarchy
for the transformed tree must be created anew so I can easily throw it
away (i.e. we do not output the results next to the originals). The
original and result extensions should be provided as a parameter to
the process.
Which zshfoo can I use for it above and beyond find with exec helper
script?
Cheers,
Alexy
* Re: iterating through a hierarchy with a filter
From: Peter Stephenson @ 2008-04-10 8:51 UTC
To: zsh-users
Alexy Khrabrov wrote:
> Greetings -- I have a series of XML files scattered around a
> hierarchy. I also have a filter script which reads stdin and writes
> stdout in a standard Unix way.
>
> I need to create a mirror hierarchy where files are results of
> applying the filter to the originals, replacing the original
> extension, say .xml, with the result extension, say .txt. Except for
> the extension change, the hierarchy should be preserved.
>
> Ideally, my script would not know anything about this whole process,
> so I can take any filter and use it for my transforms. The hierarchy
> for the transformed tree must be created anew so I can easily throw it
> away (i.e. we do not output the results next to the originals). The
> original and result extensions should be provided as a parameter to
> the process.
>
> Which zshfoo can I use for it above and beyond find with exec helper
> script?
zsh doesn't have any specific code for descending hierarchies, so you
would have to do that by trickery. If it's shallow enough that globbing
the whole thing in one go will work, you can do things along the lines
of (untested):
  for file1 in source/**/*.xml; do
    file2=dest/${${file1##source/}:r}.txt
    destdir=${file2:h}
    [[ -d $destdir ]] || mkdir -p $destdir
    filter <$file1 >$file2
  done
If that doesn't work even with a few small tweaks, you'll probably have
to tell us why before we can advise better.
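To make the path arithmetic concrete, here is a minimal sketch of the same mapping written with portable parameter expansion, with the two extensions passed in as arguments as the original post asked. The helper name map_path and its argument order are assumptions made up for this illustration.

```sh
# map_path SRC_ROOT DST_ROOT SRC_EXT DST_EXT FILE   (hypothetical helper)
# Prints the destination path for FILE, e.g. source/a/b.xml -> dest/a/b.txt
map_path() {
  rel=${5#"$1"/}        # strip the source root:      a/b.xml
  rel=${rel%."$3"}      # strip the source extension: a/b
  printf '%s\n' "$2/$rel.$4"
}

map_path source dest xml txt source/a/b.xml
```

In the zsh loop above, the :r modifier does the extension stripping and :h yields the directory part of the result, which is what gets handed to mkdir -p.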
--
Peter Stephenson <pws@csr.com> Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK Tel: +44 (0)1223 692070
* Re: iterating through a hierarchy with a filter
From: Alexy Khrabrov @ 2008-04-10 9:03 UTC
To: Peter Stephenson; +Cc: zsh-users
On Apr 10, 2008, at 1:51 AM, Peter Stephenson wrote:
> [...]
>
> zsh doesn't have any specific code for descending hierarchies, so you
> would have to do that by trickery. If it's shallow enough that
> globbing
> the whole thing in one go will work, you can do things along the lines
> of (untested):
>
> for file1 in source/**/*.xml; do
> file2=dest/${${file1##source/}:r}.txt
> destdir=${file2:h}
> [[ -d $destdir ]] || mkdir -p $destdir
> filter <$file1 >$file2
> done
>
> If that doesn't work even with a few small tweaks, you'll probably
> have
> to tell us why before we can advise better.
Well, my hierarchy is a million small files. So I doubt globbing will
work -- should I try? :)
Cheers,
Alexy
* Re: iterating through a hierarchy with a filter
From: Peter Stephenson @ 2008-04-10 9:31 UTC
To: zsh-users
On Thu, 10 Apr 2008 02:03:21 -0700
Alexy Khrabrov <deliverable@gmail.com> wrote:
> On Apr 10, 2008, at 1:51 AM, Peter Stephenson wrote:
> >
> > for file1 in source/**/*.xml; do
> > file2=dest/${${file1##source/}:r}.txt
> > destdir=${file2:h}
> > [[ -d $destdir ]] || mkdir -p $destdir
> > filter <$file1 >$file2
> > done
> >
> > If that doesn't work even with a few small tweaks, you'll probably
> > have to tell us why before we can advise better.
>
> Well, my hierarchy is a million small files. So I doubt globbing will
> work -- should I try? :)
You might get lucky, but it sounds like you need to do it the hard way.
You can use that code as the core, except you can move the directory
handling out of the way and end up with something like (again this is
untested):
  handledir() {
    local sdir=$1 ddir=$2
    local dir file tail
    [[ -d $ddir ]] || mkdir -p $ddir
    for dir in $sdir/*(/N); do
      handledir $dir $ddir/${dir:t}
    done
    for file in $sdir/*.xml(N); do
      filter <$file >$ddir/{$file:t}
    done
  }
  handledir sourcedir destdir
The only special zsh features are the glob qualifiers. (/) restricts the
matches to directories, and (N) forces the expression to expand to nothing
at all if the pattern matches no files.
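For comparison, here is a sketch of the same recursion in portable sh rather than zsh. There is no (N) qualifier there: an unmatched pattern is left as literal text, so each loop needs an explicit existence test instead. The name handledir_sh, the fixed .xml/.txt extensions and the tr stand-in for the filter are all assumptions of this example.

```sh
filter() { tr a-z A-Z; }     # stand-in for the real filter command

# handledir_sh SRCDIR DSTDIR  (hypothetical name)
handledir_sh() {
  mkdir -p -- "$2" || return
  for f in "$1"/*.xml; do
    [ -f "$f" ] || continue                    # pattern matched nothing
    name=${f##*/}
    filter < "$f" > "$2/${name%.xml}.txt"
  done
  for d in "$1"/*; do
    [ -d "$d" ] && [ ! -L "$d" ] || continue   # real directories only, like (/)
    handledir_sh "$d" "$2/${d##*/}"
  done
}

# Try it on a scratch tree.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/a/b"
printf 'hello\n' > "$tmp/src/a/b/one.xml"
handledir_sh "$tmp/src" "$tmp/dst"
cat "$tmp/dst/a/b/one.txt"
```

The files are handled before the subdirectories so that the loop variables, which are global in plain sh, are no longer needed by the time a recursive call overwrites them.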
--
Peter Stephenson <pws@csr.com> Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK Tel: +44 (0)1223 692070
* Re: iterating through a hierarchy with a filter
From: Peter Stephenson @ 2008-04-10 9:34 UTC
Cc: zsh-users
Peter Stephenson wrote:
> filter <$file >$ddir/{$file:t}
                       ^^^^^^^^^
That should be ${file:t:r}.txt
--
Peter Stephenson <pws@csr.com> Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK Tel: +44 (0)1223 692070
* Re: iterating through a hierarchy with a filter
From: Thor Andreassen @ 2008-04-10 10:52 UTC
To: zsh-users
On Thu, Apr 10, 2008 at 02:03:21AM -0700, Alexy Khrabrov wrote:
>
> On Apr 10, 2008, at 1:51 AM, Peter Stephenson wrote:
[...]
> >for file1 in source/**/*.xml; do
> > file2=dest/${${file1##source/}:r}.txt
> > destdir=${file2:h}
> > [[ -d $destdir ]] || mkdir -p $destdir
> > filter <$file1 >$file2
> >done
[...]
> Well, my hierarchy is a million small files. So I doubt globbing will
> work -- should I try? :)
Doing it as a stream should work, e.g.:
  find source/ -iname '*.xml' | while read file1; do
    file2=dest/${${file1##source/}:r}.txt
    destdir=${file2:h}
    [[ -d $destdir ]] || mkdir -p $destdir
    filter < $file1 > $file2
  done
--
Thor
* Re: iterating through a hierarchy with a filter
From: Thor Andreassen @ 2008-04-10 10:56 UTC
To: zsh-users
On Thu, Apr 10, 2008 at 12:52:45PM +0200, Thor Andreassen wrote:
[...]
> find source/ -iname '*.xml' | while read file1; do
Improvement, use:
find source/ -type f -iname '*.xml'
to make sure you only filter real files.
--
Thor
* Re: iterating through a hierarchy with a filter
From: Stephane Chazelas @ 2008-04-10 11:42 UTC
To: zsh-users
On Thu, Apr 10, 2008 at 12:52:45PM +0200, Thor Andreassen wrote:
[...]
> find source/ -iname '*.xml' | while read file1; do
That should be while IFS= read -r file1
but that assumes that the file names don't contain newline
characters. -iname is a GNU extension; it's neither POSIX nor
Unix. Since you're already using GNU extensions, you could also use
-print0:
find source -iname '*.xml' -print0 | while IFS= read -rd$'\0' file1...
as $'\0' won't be found in a file name.
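To see what IFS= and -r each protect against, here is a small sketch that feeds one awkward, made-up file name through both forms of read. The variable names are arbitrary.

```sh
name='  odd\name.xml  '     # leading/trailing blanks and a backslash

# Plain read: blanks are trimmed and the backslash is treated as an escape.
plain=$(printf '%s\n' "$name" | { read line; printf '[%s]' "$line"; })

# IFS= keeps the blanks, -r keeps the backslash.
safe=$(printf '%s\n' "$name" | { IFS= read -r line; printf '[%s]' "$line"; })

printf 'read:         %s\n' "$plain"
printf 'IFS= read -r: %s\n' "$safe"
```

Only the second form hands the loop body the name that find actually printed; the first would make the later redirections point at a file that does not exist.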
> file2=dest/${${file1##source/}:r}.txt
> destdir=${file2:h}
> [[ -d $destdir ]] || mkdir -p $destdir
> filter < $file1 > $file2
> done
[...]
This can be done in zsh with:
for file1 in **/*.(#i)xml(.NDoN); do
That builds the whole list first, but you can also do:
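  process() {
    # With the +cmd glob qualifier, zsh passes the candidate file name in $REPLY
    local file1=$REPLY file2 destdir
    file2=dest/${${file1#source/}:r}.txt
    destdir=${file2:h}
    [[ -d $destdir ]] || mkdir -p -- $destdir
    filter < $file1 > $file2
    return 1   # reject every match so that no list is built up
  }
  : **/*.(#i)xml(.NDoN+process)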
POSIXly, you could do:
  find source -type f -name '*.[xX][mM][lL]' -exec sh -c '
    for file1 do
      file2=${file1%.*}.txt
      file2=dest/${file2#source/}
      destdir=${file2%/*}
      [ -d "$destdir" ] || mkdir -p -- "$destdir"
      filter < "$file1" > "$file2"
    done' inline {} +
--
Stéphane
* Re: iterating through a hierarchy with a filter
From: Alexy Khrabrov @ 2008-04-14 21:39 UTC
To: zsh-users
I have the filter given as a parameter to my script, invoked as
suggested,
$filter < $file1 > $file2
If I give a single existing script as a parameter, it works fine. If,
however, I give it
walk 'iconv -f utf8 -t cp1251' srcdir tgtdir ...
-- I get "command not found" for 'iconv -f utf8 -t cp1251' at the line
above. Since the walk script starts with
#!/bin/zsh
filter=$1
I wonder what kind of quoting happens and how to "dequote" it so the
command line will look indeed like
iconv -f utf8 -t cp1251 < $file1 > $file2
E.g., doing filter="$1" doesn't change it.
Cheers,
Alexy
* Re: iterating through a hierarchy with a filter
From: Vincent Lefevre @ 2008-04-14 23:33 UTC
To: zsh-users
On 2008-04-14 14:39:25 -0700, Alexy Khrabrov wrote:
> I have the filter given as a parameter to my script, invoked as
> suggested,
>
> $filter < $file1 > $file2
>
> If I give a single existing script as a parameter, it works fine. If,
> however, I give it
>
> walk 'iconv -f utf8 -t cp1251' srcdir tgtdir ...
>
> -- I get "command not found" for 'iconv -f utf8 -t cp1251' at the line
> above. Since the walk script starts with
>
> #!/bin/zsh
> filter=$1
You need to do sh word-splitting on $1 and make $filter an array:
filter=(${=1})
Alternatively, you can enable sh word-splitting globally with
setopt shwordsplit.
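The reason for the error is that zsh, unlike sh, does not split an unquoted parameter on whitespace by default, so the whole string is looked up as a single command name, spaces included. The sketch below is sh, not zsh, and only shows the sh behaviour for contrast: there the unquoted expansion is split into words, and quoting it reproduces the "command not found" failure. The tr command is a stand-in for the iconv call.

```sh
filter='tr a-z A-Z'          # stand-in for 'iconv -f utf8 -t cp1251'

# Unquoted: sh splits the value into three words: tr, a-z, A-Z.
split_out=$(printf 'abc\n' | $filter)

# Quoted: one word, so the shell looks for a command named "tr a-z A-Z".
quoted_status=0
printf 'abc\n' | "$filter" 2>/dev/null || quoted_status=$?

printf '%s %s\n' "$split_out" "$quoted_status"
```

In zsh the unquoted form behaves like the quoted one here, which is why the split has to be requested explicitly with ${=1}.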
--
Vincent Lefèvre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)
* Re: iterating through a hierarchy with a filter
From: Peter Stephenson @ 2008-04-15 9:51 UTC
To: zsh-users
On Tue, 15 Apr 2008 01:33:53 +0200
Vincent Lefevre <vincent@vinc17.org> wrote:
> On 2008-04-14 14:39:25 -0700, Alexy Khrabrov wrote:
> > If I give a single existing script as a parameter, it works fine. If,
> > however, I give it
> >
> > walk 'iconv -f utf8 -t cp1251' srcdir tgtdir ...
> >
> > -- I get "command not found" for 'iconv -f utf8 -t cp1251' at the line
> > above.
>
> You need to do sh word-splitting on $1 and make $filter an array:
>
> filter=(${=1})
>
> Alternatively, you can enable sh word-splitting globally.
That should work fine in this case.
More generally, I would be inclined to take the attitude that the argument
to your script is a complete command line in itself. In that case, the
logical thing to do is to "eval" the variable that contains it. That means
that you can put anything there you would in a normal zsh command line. If
you want the command to be executed "at arm's length", put the eval inside
parentheses:
(eval $1) <input >output
I have a vague memory that the shell is smart enough only to fork once
if there's a single external command in the eval, but the mists of time may
be confusing me.
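A minimal sketch of the eval approach, assuming the filter arrives as one command-line string. The wrapper name run_filter is hypothetical and the pipeline passed to it is only an example. Because the string is evaluated as shell code, it can contain pipes and quoting that word-splitting alone would not handle.

```sh
# run_filter CMDLINE  (hypothetical helper): stdin -> CMDLINE -> stdout
run_filter() {
  ( eval "$1" )     # the subshell keeps assignments, cd etc. away from the caller
}

printf 'hello world\n' | run_filter "tr a-z A-Z | sed 's/ /_/'"
```

Inside the walk script the call would then look like run_filter "$1" < $file1 > $file2, with the caveat that whatever is passed in runs with the script's full privileges.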
--
Peter Stephenson <pws@csr.com> Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK Tel: +44 (0)1223 692070