From mboxrd@z Thu Jan 1 00:00:00 1970
To: 9fans@cse.psu.edu
Subject: Re: [9fans] experimental change for devmnt to deal with spaces
From: rog@vitanuova.com
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Message-Id: <20020703174640.EF02919A1C@mail.cse.psu.edu>
Date: Wed, 3 Jul 2002 18:52:25 +0100
Topicbox-Message-UUID: c0fbd46e-eaca-11e9-9e20-41e7f4b1d025

nemo:
> BTW, is the UTF8 of 0x00a0 0xa0? I think so, but I'm not sure.

no rune above 127 is represented in utf8 by the single byte of
the same value. (0xa0 is represented by [0xc2, 0xa0]).
that means your job is a little harder as you have to grow
the messages.

but i really don't think this is in the right place, as it
affects *every* user-level implemented fileserver, and it's
quite possible that a fileserver produces, in the contents of
a file, names that have been created inside it. (e.g. think of
upas/fs, "create mboxname").

however i still think it shouldn't be too bad if done at the
boundaries of the system, where one is less likely to have such
things going on, and where one is notionally crossing a language
boundary anyway (e.g. one doesn't expect a Windows filesystem to
have files containing filenames in plan 9 format).

rob:
> Change space!

actually i thought i was on the side of *not* changing space!

space in plan 9 was always a character that could be used to
separate filenames unconditionally; now it's not, and that
change affects many things.

a few things that are affected:

* filenames of the same length no longer take the same number
  of characters.

* {ls | sort -f} no longer works. (in fact sort probably needs
  to understand quoting for its field separators, as probably
  do several other tools).

* regexps for matching filenames become much more complex, and
  non-determinate, as we don't know which characters were inside
  or outside the quotes. acme right-button semantics can never
  work properly on quoted filenames.

* greater potential for visual misidentification of distinct
  filenames: i think that it's more difficult to pick out the
  filenames in:

    'A few very  ''long file' names 'in the same  ' place

  than in:

    A_few_very__'long_file names in_the_same__ place

  (substituting '_' for whatever the "space" char is), as there's
  an immediate indication of which spaces are separators and
  which are part of the filenames.

* the fact that spaces run together visually provides great
  possibilities for ambiguities in filenames (not that that isn't
  an issue with unicode anyway :-])

* echo no longer works reliably as a convenient way of selecting
  filenames: e.g.

    % echo *.txt
    x.txt y.txt z.txt
    % ls -l
    --rw-rw-r-- M 32 rog rog 1 Jul 3 18:31 x.txt y.txt
    --rw-rw-r-- M 32 rog rog 1 Jul 3 18:31 z.txt

oh well.

  rog.
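
p.s. on the utf8 point at the top, a minimal sketch in python, not
anything taken from the devmnt change itself. it assumes the change
maps ascii space (0x20) to rune 0x00a0 inside names; `utf8_bytes` and
`swap_spaces` are hypothetical helper names made up for illustration.
it shows the two-byte encoding of 0x00a0 and why a message carrying
such a name has to grow by one byte per space.

```python
# Sketch only: assumes spaces in names are rewritten to U+00A0.
# utf8_bytes and swap_spaces are hypothetical helpers, not part of any real API.

NBSP = "\u00a0"  # rune 0x00a0


def utf8_bytes(rune):
    """Return the UTF-8 encoding of a single rune as a list of byte values."""
    return list(rune.encode("utf-8"))


def swap_spaces(name):
    """Replace each ASCII space with U+00A0 and return the UTF-8 bytes."""
    return name.replace(" ", NBSP).encode("utf-8")


if __name__ == "__main__":
    name = "long file name"
    before = name.encode("utf-8")
    after = swap_spaces(name)

    # 0x00a0 takes two bytes in UTF-8, so it cannot be the single byte 0xa0.
    print([hex(b) for b in utf8_bytes(NBSP)])  # ['0xc2', '0xa0']

    # Each space costs one extra byte, so the encoded name gets longer.
    print(len(before), len(after))  # 14 16
```

so a translation of this kind cannot be done in place on the message
buffer: every translated space adds a byte, and the length fields of
the message have to be recomputed to match.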