From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <163def7f5ee028605e6ccee38bc7c57f@collyer.net>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] blanks already handled properly?
From: Geoff Collyer
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Date: Thu, 4 Jul 2002 16:29:15 -0700
Topicbox-Message-UUID: c296f650-eaca-11e9-9e20-41e7f4b1d025

> In case this solves the problem, we would only have to search for
> programs that don't handle ' ' within file names and fix them; then
> remove quoting from the output of programs that do not output
> commands; then add quoting to the output of programs that output
> commands.

But this assumes that programs know when they are and aren't
manipulating file names, which is not true in general (see Software
Tools for a fuller exposition of this philosophy).  sort and awk don't
know if field 4 is a file name or not.  Plus, as Charles observed, one
might want to quote fields other than file names that contain blanks.
In some ways, it seems that adopting a non-space delimiter might be
the least painful of the alternatives to deal with file formats.

While I'm here, this is the script I use to print hex values and
glyphs (if we have any) of characters listed in /lib/unicode:

#!/bin/rc
# uniquery pattern... - print hex & glyph of any chars matching pattern
#	in /lib/unicode
sts=''
for (pat) {
	hexes = `{grep $pat /lib/unicode | column 1}
	if (~ $#hexes 0) {
		echo $0: no such unicode chars: $pat >[1=2]
		sts='no such chars'
	}
	if not
		for (hex in $hexes)
			unicode $hex-$hex
}
exit $sts

You can replace "column 1" with "awk '{print $1}'".  This is column:

#!/bin/rc
# column [-F sep] [n...] - print n'th column(s)
rfork e
switch ($1) {
case -F
	sep=-F^$2; shift 2
case -F?*
	sep=-F^`{echo $1 | sed 's/^-F//'}; shift
}
switch ($#*) {
case 0
	* = 1
case *
	if (! ~ $1 [0-9] [0-9][0-9] [0-9][0-9][0-9]) {
		echo usage: $0 '[-F sep] [n...]' >[1=2]
		exit usage
	}
}
arglist=`{echo $* | sed -e 's/[0-9]+/$&,/g' -e 's/,$//'}
exec awk $sep '{print '^$"arglist^'}'

Here's a sample use of uniquery:

: cpu; uniquery space
0008
0020  
00a0  
2002  
2003  
2004  
2005  
2006  
2007  
2008  
2009  
200a  
200b ​
2408 ␈
2420 ␠
3000 　
303f 〿
feff
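
For anyone reading this away from Plan 9, here is a rough POSIX sh
approximation of column.  It is only a sketch, not the rc script
itself: the name column_sketch and the tab-separated sample line are
assumptions made for illustration, and the usage check is left out.
It also illustrates the non-space delimiter idea: with tab as the
separator, a blank inside a field survives intact.

```shell
# column_sketch [-F sep] [n...] - print n'th column(s)
# Hypothetical POSIX sh approximation of the rc script; not the original.
column_sketch() {
	sep=
	case $1 in
	-F)	sep=-F$2; shift 2 ;;	# separator given as its own argument
	-F?*)	sep=$1; shift ;;	# separator glued onto -F
	esac
	if [ $# -eq 0 ]; then
		set -- 1		# default to the first column
	fi
	# turn "1 3" into "$1, $3" for awk's print statement
	arglist=$(printf '%s\n' "$*" | sed -e 's/[0-9][0-9]*/$&,/g' -e 's/,$//')
	# pass the separator as a single word, even if it is a tab
	awk ${sep:+"$sep"} "{print $arglist}"
}

# With tab as the delimiter, the blank in "my file" stays inside field 2.
printf 'f1\tmy file\n' | column_sketch -F "$(printf '\t')" 2
```

With the default separator the same line would split at the blank, so
field 2 would be just "my".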