9fans - fans of the OS Plan 9 from Bell Labs
From: Fco.J.Ballesteros <nemo@plan9.escet.urjc.es>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] blanks already handled properly?
Date: Fri,  5 Jul 2002 09:52:24 +0200
Message-ID: <887f8d1b34047ae49b20087db9b2f055@plan9.escet.urjc.es>

[-- Attachment #1: Type: text/plain, Size: 1509 bytes --]

:  > In case this solves the problem, we would only have to search for
:  > programs that don't handle ' ' within file names and fix them; then
:  > remove quoting from the output of programs that do not output
:  > commands; then add quoting to the output of programs that output
:  > commands.
:  
:  But this assumes that programs know when they are and aren't
:  manipulating file names, which is not true in general (see Software

My argument doesn't imply that all programs know when they are handling
file names...

:  Tools for a fuller exposition of this philosophy).  sort and awk don't
:  know if field 4 is a file name or not.  Plus, as Charles observed, one
:  might want to quote fields other than file names that contain blanks.

...  because in cases like this, I'd say you have to use a sensible
file format so that sort and awk can be used on it later.

Only programs that know that they are producing command lines (which
include file names) would need to quote their output file names.

Agreement on this?
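
To make the distinction concrete, here is a minimal sketch in Python (not
part of the original message). It assumes rc-style quoting: wrap the word
in single quotes and double any embedded single quote. The names
rc_quote, emit_command, and emit_listing are hypothetical, and the set of
special characters is illustrative rather than taken from rc's source.

```python
# Characters assumed to need quoting in an rc command line (illustrative set).
RC_SPECIAL = set(" \t\n#;&|^$=`'{}()<>*?[")


def rc_quote(word: str) -> str:
    """Quote a word rc-style: 'it''s' for it's, '' for the empty string."""
    if word and not (set(word) & RC_SPECIAL):
        return word  # nothing special, leave it bare
    return "'" + word.replace("'", "''") + "'"


def emit_command(cmd: str, files: list[str]) -> str:
    """A program that outputs a command line quotes its file names."""
    return " ".join([cmd] + [rc_quote(f) for f in files])


def emit_listing(files: list[str]) -> str:
    """A program that outputs plain data (one name per line) does not quote."""
    return "\n".join(files)


if __name__ == "__main__":
    names = ["my file", "it's", "plain"]
    print(emit_command("rm", names))
    print(emit_listing(names))
```

Only emit_command knows it is producing something a shell will parse, so
only it pays the cost of quoting; emit_listing leaves names untouched.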

:  In some ways, it seems that adopting a non-space delimiter might be
:  the least painful of the alternatives to deal with file formats.

Exactly.  As I said in my previous mail, if space is causing a problem
within your file format, we could just say that it's not a good file
format and it should be changed (if you want quoting, you can still
include that as part of your format, independently of file names).
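
As an illustration only, the sketch below (Python, not from the original
thread) shows a record format that uses a tab as its field delimiter, so
that a field containing blanks still counts as one field. The tab, the
sample records, and the helper names parse and column are assumptions
made for the example.

```python
SEP = "\t"  # assumed non-space delimiter; any character absent from the data works


def parse(line: str, sep: str = SEP) -> list[str]:
    """Split one record into fields on the delimiter only, never on blanks."""
    return line.rstrip("\n").split(sep)


def column(lines: list[str], n: int, sep: str = SEP) -> list[str]:
    """Return the n'th field (1-based) of every record."""
    return [parse(line, sep)[n - 1] for line in lines]


# Hypothetical records: size, file name, owner.
records = [
    "120\tmy notes.txt\tnemo",
    "45\tREADME\tgeoff",
]

if __name__ == "__main__":
    # Field 2 survives intact even though it contains a blank.
    print(column(records, 2))
    # Sorting on field 1 needs no knowledge of which field is a file name.
    print(sorted(records, key=lambda r: int(parse(r)[0])))
```

With blank-separated fields the same record would split "my notes.txt"
in two, which is the failure the change of delimiter avoids.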

Any other problem with this approach?

[-- Attachment #2: Type: message/rfc822, Size: 3103 bytes --]

From: Geoff Collyer <geoff@collyer.net>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] blanks already handled properly?
Date: Thu, 4 Jul 2002 16:29:15 -0700
Message-ID: <163def7f5ee028605e6ccee38bc7c57f@collyer.net>

> In case this solves the problem, we would only have to search for
> programs that don't handle ' ' within file names and fix them; then
> remove quoting from the output of programs that do not output
> commands; then add quoting to the output of programs that output
> commands.

But this assumes that programs know when they are and aren't
manipulating file names, which is not true in general (see Software
Tools for a fuller exposition of this philosophy).  sort and awk don't
know if field 4 is a file name or not.  Plus, as Charles observed, one
might want to quote fields other than file names that contain blanks.

In some ways, it seems that adopting a non-space delimiter might be
the least painful of the alternatives to deal with file formats.


While I'm here, this is the script I use to print hex values and
glyphs (if we have any) of characters listed in /lib/unicode:

#!/bin/rc
# uniquery pattern... - print hex & glyph of any chars matching pattern
#	in /lib/unicode
sts=''
for (pat) {
	hexes = `{grep $pat /lib/unicode | column 1}
	if (~ $#hexes 0) {
		echo $0: no such unicode chars: $pat >[1=2]
		sts='no such chars'
	}
	if not
		for (hex in $hexes)
			unicode $hex-$hex
}
exit $sts

You can replace "column 1" with "awk '{print $1}'".  This is column:

#!/bin/rc
# column [-F sep] [n...] - print n'th column(s)
rfork e
switch ($1) {
case -F
	sep=-F^$2; shift 2
case -F?*
	sep=-F^`{echo $1 | sed 's/^-F//'}; shift
}
switch ($#*) {
case 0
	* = 1
case *
	if (! ~ $1 [0-9] [0-9][0-9] [0-9][0-9][0-9]) {
		echo usage: $0 '[-F sep] [n...]' >[1=2]
		exit usage
	}
}
arglist=`{echo $* | sed -e 's/[0-9]+/$&,/g' -e 's/,$//'}
exec awk $sep '{print '^$"arglist^'}'
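
For readers who don't read rc and sed fluently, the last two lines of
column turn a list of column numbers into an awk program. The following
Python sketch mirrors that step under stated assumptions: awk_program is
a hypothetical name, and it validates every argument, whereas the script
above only checks the first one.

```python
import re


def awk_program(cols: list[str]) -> str:
    """Build the awk program text that column would pass to awk.

    An empty list defaults to column 1, as in the script's 'case 0' branch.
    """
    if not cols:
        cols = ["1"]
    for c in cols:
        # The script accepts one to three digits; here every argument is checked.
        if not re.fullmatch(r"[0-9]{1,3}", c):
            raise ValueError("usage: column [-F sep] [n...]")
    # Equivalent of sed 's/[0-9]+/$&,/g' followed by dropping the final comma.
    return "{print " + ", ".join("$" + c for c in cols) + "}"


if __name__ == "__main__":
    print(awk_program(["1", "3"]))
    print(awk_program([]))
```

So "column 1 3" ends up running awk with the program {print $1, $3}.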

Here's a sample use of uniquery:

: cpu; uniquery space
0008
0020  
00a0  
2002  
2003  
2004  
2005  
2006  
2007  
2008  
2009  
200a  
200b ​
2408 ␈
2420 ␠
3000  
303f 〿
feff 
