From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID:
Date: Thu, 1 Jun 2006 17:06:28 +0100
From: rog@vitanuova.com
To: 9fans@cse.psu.edu
Subject: Re: [9fans] csv files -> embarrasing
In-Reply-To: <447EF168.10209@comtv.ru>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: 58d093d2-ead1-11e9-9d60-3106f5b1d025

> sed 's/"([^"]*)"/''\1''/g; s/,/ /g' $* |
> while (s=`{read}) {
> 	echo 's=('$"s')' | rc
> 	echo $s(1) $s(3) $s(4)
> }

unfortunately this doesn't work, for quite a few reasons.

1) the sed script doesn't deal with quoted double-quotes.
2) nor does it deal with single quotes.
3) the `{read} idiom tokenizes s, ignoring multiple spaces, so that
information is lost when putting them back together with $"s
4) the environment might be shared, but rc caches environment
variables, so the value of $s when passed to the second echo is
the same as that before the rc invocation.

i've also encountered newlines in values in csv files, which won't
help matters.

assuming that there aren't any newlines, an approximation to a
solution might be:

	sed -e 's/''/''''/g' -e 's/"(([^"]|"")*)"/''\1''/g' -e 's/""/"/g' -e 's/.*/s=(&)/' |
	ifs='' while(e=`{read}){
		eval $e
		echo $s(1) $s(3) $s(4)
	}

i really don't like using eval in this way though.
if you get it wrong, you've got a nasty loophole.
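
to see what the four sed substitutions above do to a line before it
reaches eval, here is a rough sketch that mirrors them in python
rather than rc. the function name csv_line_to_rc is made up for
illustration. python's re module backtracks where plan 9 regexps
match leftmost-longest, so odd inputs may come out differently. like
the sed pipeline, it assumes no newlines inside values, and it only
rewrites the quoting: it does not split on commas and evaluates
nothing.

```python
import re

def csv_line_to_rc(line: str) -> str:
    """Mirror the four sed substitutions on one CSV line (hypothetical helper)."""
    # 1) s/'/''/g : double every single quote, rc's escape inside '...'
    line = line.replace("'", "''")
    # 2) s/"(([^"]|"")*)"/'\1'/g : turn each "..." field, which may
    #    contain "" pairs, into a single-quoted '...' field
    line = re.sub(r'"((?:[^"]|"")*)"', r"'\1'", line)
    # 3) s/""/"/g : collapse the remaining "" pairs into one "
    line = line.replace('""', '"')
    # 4) s/.*/s=(&)/ : wrap the whole line as an rc list assignment
    return "s=(" + line + ")"

if __name__ == "__main__":
    for sample in ['"it\'s",b', '"a ""b"" c",x']:
        print(csv_line_to_rc(sample))
```

the first sample comes out as s=('it''s',b) and the second as
s=('a "b" c',x). whatever string is produced here is what eval would
execute, which is why getting the quoting wrong leaves a loophole.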