From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <e0aa13905a775d320025eaa772092b6a@vitanuova.com>
Date: Thu,  1 Jun 2006 17:06:28 +0100
From: rog@vitanuova.com
To: 9fans@cse.psu.edu
Subject: Re: [9fans] csv files -> embarrasing
In-Reply-To: <447EF168.10209@comtv.ru>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: 58d093d2-ead1-11e9-9d60-3106f5b1d025

> sed 's/"([^"]*)"/''\1''/g; s/,/ /g' $* |
> 	while (s=`{read}) {
> 		echo 's=('$"s')' | rc
> 		echo $s(1) $s(3) $s(4)
>	}

unfortunately this doesn't work, for quite a few reasons.
1) the sed script doesn't deal with quoted double-quotes.
2) nor does it deal with single quotes.
3) the `{read} idiom tokenizes s, ignoring multiple spaces, so
that information is lost when putting them back together with $"s.
4) the environment might be shared, but rc caches environment
variables, so the value of $s when passed to the second
echo is the same as that before the rc invocation.
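
to make the first two points concrete, here is a small python sketch
that mimics the two sed substitutions from the quoted script. python's
re module is only standing in for plan 9 sed, and the function name and
sample lines are made up for illustration, so treat it as a rough
illustration rather than a faithful emulation.

```python
import re

def naive_rewrite(line):
    # stand-in for the first sed command: s/"([^"]*)"/'\1'/g
    line = re.sub(r'"([^"]*)"', r"'\1'", line)
    # stand-in for the second sed command: s/,/ /g
    return line.replace(",", " ")

# a csv field containing quoted double-quotes: say "hi"
quoted = naive_rewrite('a,"say ""hi""",c')
# an unquoted csv field containing a single quote
single = naive_rewrite("a,it's,c")

print(quoted)
print(single)
```

the first line comes out as a 'say ''hi''' c, which rc would most
likely read as say 'hi', so the double quotes in the data have quietly
turned into single quotes. the second comes out as a it's c, leaving a
lone ' that opens a quoted string which is never closed.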

i've also encountered newlines in values in csv files,
which won't help matters.
assuming that there aren't any newlines, an approximation
to a solution might be:

sed -e 's/''/''''/g' -e 's/"(([^"]|"")*)"/''\1''/g' -e 's/""/"/g' -e 's/.*/s=(&)/' |
	ifs='' while(e=`{read}){
		eval $e
		echo $s(1) $s(3) $s(4)
	}

i really don't like using eval in this way though. if you get the
quoting wrong, you've got a nasty loophole: text from the csv file
ends up being run as rc commands.
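
one way to avoid the eval altogether is to hand the splitting to a real
csv parser, so that field contents are never interpreted as code. a
minimal sketch in python, assuming its standard csv module is an
acceptable stand-in for the rc pipeline; the function name and the
sample data are made up for illustration. it picks fields 1, 3 and 4
like the loop above.

```python
import csv
import io

def pick_fields(text, wanted=(1, 3, 4)):
    """Return the wanted (1-based) fields of each csv record in text."""
    rows = []
    # newline="" leaves line endings alone so the csv module can
    # handle newlines inside quoted fields itself
    for rec in csv.reader(io.StringIO(text, newline="")):
        # skip any wanted field that a short record doesn't have
        rows.append([rec[i - 1] for i in wanted if i <= len(rec)])
    return rows

# field 3 holds a single quote and quoted double-quotes,
# field 4 holds an embedded newline
sample = 'a,b,"it\'s ""quoted""","two\nlines"\n'
for row in pick_fields(sample):
    print(row)
```

since the fields stay ordinary strings the whole way through, there is
nothing to quote and nothing to evaluate, and a newline inside a quoted
value arrives as part of the field rather than splitting the record.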