From mboxrd@z Thu Jan 1 00:00:00 1970
Content-Type: text/plain; charset=US-ASCII; format=flowed
Mime-Version: 1.0 (Apple Message framework v481)
From: Michael Baldwin
To: 9fans@cse.psu.edu
Content-Transfer-Encoding: 7bit
In-Reply-To: <20020601165932.L29024@cackle.proxima.alt.za>
Message-Id: <9C52EA7C-7588-11D6-9E8E-000393726A14@orb.sh>
Subject: [9fans] spaces, separators, and utf-8
Date: Sat, 1 Jun 2002 13:54:25 -0400
Topicbox-Message-UUID: a415bc34-eaca-11e9-9e20-41e7f4b1d025

On Saturday, June 1, 2002, at 10:59 , lucio@proxima.alt.za wrote:

> The clincher is that the space is useful both as a separator of
> command line arguments and as a joiner of filename "words".

command line arguments can be trivially quoted. if the issue is
possible shell misinterpretation, then we've got a whole slew of
allowed chars that are "problems": * ? ; < > [ ] { } ` ' " $ & ^ # = \ |.
they are all allowed now, and if i type them at a shell, i've got to
quote them. one shell (mash) used : as a special char, and there were
and are files with colons in them in the plan 9 distribution. so why
does space get all the scorn?

> Seeing as even Michael Baldwin does not suggest using spaces as path
> separators (why not?)

["even Michael Baldwin" -- are you saying i now have a reputation as a
communist?!]

we already have a separator (/), no need for another one. or are you
asking why not space instead? when i call a file "My Great Novel" it
doesn't seem natural to think of space as a path separator. i used
Primos eons ago, and they used ">", which is perhaps arguably slightly
more intuitive than "/" (or ":" or "\"). i can imagine wanting to put a
date in a filename as in 2002/06/01 or use / in other ways in names,
but > is harder to imagine. but in the end, you must have a separator,
so you just pick a char and say it's the separator. it is /. and you
cannot use / otherwise in a path. fine, that's life.
and 9P even works when accessing something that doesn't use / because
the protocol itself doesn't use / in Twalk. so one can even get to
those ugly \ systems from plan 9 (until they do something stupid like
put / in a path element). but space as a path separator? yikes, no.

but speaking to digy's point, i'm glad that control chars are
disallowed. i think it is useful to have a char or two that you know
are outside the possible charset for filenames. i'm thinking of \t and
\n, which can easily be used in text programs to delimit paths if they
don't feel like quoting. and NUL, well let's not get started. does NUL
work *anywhere*? can't use it in C strings, can't use it in acme or rio
or sam, can't use it in old 9P. curiously enough, 9P2000 can actually
transport it. but just say no to NUL.

> The rationale being that long filenames, GUIs and Internationalisation
> are all the _new_ rage and may as well be lumped into a single
> paradigm change.

hmm, i thought that internationalization by using utf-8 everywhere
(including in pathnames) was pioneered by plan 9 itself. and it is a
good idea. mac os x uses utf-8 paths and does ok with utf-8 in terminal
windows and mail; what about the other unixoid systems? but are there
any systems that handle utf-8 as cleanly as plan 9 yet? i don't know of
any. the mac has problems (do ls in a Terminal window, or use TextEdit
or [gasp] vi or emacs), and if there is a convenient input method, i
haven't found it.

now it would be a great thing if this attribute of plan 9 (utf-8
everywhere and it just works, decent C language support, and a
goodly-sized unicode font) were put into the commercial OS's out there.
hey geoff, can you pull that off at apple? certainly you wouldn't be
opposed to *that* crusade?!
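
p.s. to make the Twalk point concrete, here is a rough sketch in Python of how a walk request carries path elements. it assumes the 9P2000 layout as i understand it from the manual: little-endian size[4] type[1] tag[2] fid[4] newfid[4] nwname[2], followed by each name as a 2-byte length plus utf-8 bytes, with 110 as the Twalk type number. the function names pack_twalk and unpack_names are made up for illustration and are not from any real 9P library.

```python
import struct

TWALK = 110  # assumed 9P2000 message type number for Twalk


def pack_string(s: str) -> bytes:
    """Encode one 9P string: 2-byte little-endian length, then UTF-8 bytes."""
    data = s.encode("utf-8")
    return struct.pack("<H", len(data)) + data


def pack_twalk(tag: int, fid: int, newfid: int, names: list) -> bytes:
    """Build a Twalk message; each path element travels as its own counted string."""
    body = struct.pack("<BHIIH", TWALK, tag, fid, newfid, len(names))
    body += b"".join(pack_string(n) for n in names)
    # The leading size field counts the whole message, including itself.
    return struct.pack("<I", 4 + len(body)) + body


def unpack_names(msg: bytes) -> list:
    """Recover the path elements from a message built by pack_twalk."""
    (nwname,) = struct.unpack_from("<H", msg, 15)  # after size, type, tag, fid, newfid
    offset = 17
    names = []
    for _ in range(nwname):
        (length,) = struct.unpack_from("<H", msg, offset)
        offset += 2
        names.append(msg[offset:offset + length].decode("utf-8"))
        offset += length
    return names


if __name__ == "__main__":
    msg = pack_twalk(tag=1, fid=0, newfid=1, names=["usr", "My Great Novel"])
    print(unpack_names(msg))
```

the elements are length-prefixed, so no separator char appears on the wire at all, and a space (or a \) inside an element is just another byte. the same counting is why a NUL could ride along in 9P2000, even if nothing at either end wants it.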