9fans - fans of the OS Plan 9 from Bell Labs
* Re: [9fans] csv files -> embarrasing
@ 2006-05-01  0:34 erik quanstrom
From: erik quanstrom @ 2006-05-01  0:34 UTC (permalink / raw)
  To: 9fans

looks slightly more painful than the previous case, but i think that
using my original trick, one can continue to add elements to the
lifing pipeline and move character-stuffed " and \" out of the way, too.

and didn't i learn that trick well from parsing data from yahoo categories.

- erik

On Sun Apr 30 05:38:13 CDT 2006, mattmobile@proweb.co.uk wrote:
>  > Is parsing CSV really so difficult?
>
> depends on your CSV
>
> 1, "2", "3""3","4, 4", "5\",5"
>
> which I interpret as (\n separated) :
> 1
> 2
> 3"3
> 4, 4
> 5",5
>
>
> If you've ever worked with Excel CSV you know what CSV hell is like
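
The interpretation quoted above can be checked with a small character-at-a-time splitter. What follows is only a sketch in Python (not rc), under stated assumptions: a doubled quote ("") and a backslash-quote (\") inside a quoted field both stand for one literal quote, spaces before a field are dropped, and a record never spans lines. The name split_csv_line is hypothetical and does not come from the thread.

```python
def split_csv_line(line):
    """Split one CSV line into fields.

    Assumed dialect: inside double quotes, both "" and \\" mean a
    literal quote character; commas inside quotes are ordinary text.
    """
    fields = []
    buf = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        nxt = line[i + 1] if i + 1 < len(line) else ""
        if in_quotes:
            if ch in ('"', "\\") and nxt == '"':
                # stuffed quote: "" or \" becomes one literal quote
                buf.append('"')
                i += 2
                continue
            if ch == '"':
                in_quotes = False   # closing quote
            else:
                buf.append(ch)
        elif ch == '"':
            in_quotes = True        # opening quote
        elif ch == ",":
            fields.append("".join(buf))
            buf = []
        elif ch == " " and not buf:
            pass                    # drop spaces before the field starts
        else:
            buf.append(ch)
        i += 1
    fields.append("".join(buf))
    return fields


example = r'1, "2", "3""3","4, 4", "5\",5"'
for field in split_csv_line(example):
    print(field)
```

On the example line this prints the same five fields as listed in the quoted message. Input that mixes conventions differently (Excel output, for instance) may well need other rules.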


* [9fans] csv files -> embarrasing
@ 2006-04-28 14:36 Steve Simon
From: Steve Simon @ 2006-04-28 14:36 UTC (permalink / raw)
  To: 9fans

OK, I have spent half an hour trying to parse CSV files
and it's getting embarrassing. I could do it in C, but I should
be able to use rc + sed + awk.

The problem is that some fields in my CSV files contain whitespace
and thus have double quotes around them.

I thought rc knew about %q-quoted strings, so I could use it to
do my parsing, but it fails. Can this be done, or is C the answer?
It seems a shame to resort to sledgehammers.

-Steve

cpu% cat file.csv
a,b,"c,d,e",f,g
p,q,r,s,t

cpu%
cpu% cat extract
#!/bin/rc

sed 's/"([^"]*)"/''\1''/g; s/,/ /g' $* |
	while (s=`{read})
		echo $s(1) $s(3) $s(4)


cpu% extract file.csv
a 'c d
p r s
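
The output above is what one would expect from the script as written: after sed, the first line reads a b 'c d e' f g, and `{read} splits it on whitespace without interpreting the single quotes, so $s(3) is 'c and $s(4) is d. For comparison, here is a minimal sketch of the intended extraction in Python (not rc), using the standard csv module, which already treats commas inside double quotes as field text. The function name extract and its columns parameter are hypothetical, and the sample text stands in for file.csv.

```python
import csv
import io


def extract(text, columns=(1, 3, 4)):
    """Return the chosen 1-based columns of each non-empty CSV row."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines, such as the one ending file.csv
        rows.append([row[c - 1] for c in columns])
    return rows


sample = 'a,b,"c,d,e",f,g\np,q,r,s,t\n\n'
for fields in extract(sample):
    print(" ".join(fields))
# prints:
# a c,d,e f
# p r s
```

Note that the third field of the first row comes out as c,d,e, so printing it space-separated is only safe if later stages do not split on commas again.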



