From mboxrd@z Thu Jan 1 00:00:00 1970
To: 9fans@cse.psu.edu
Subject: Re: [9fans] blanks in file names
From: forsyth@vitanuova.com
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Message-Id: <20020702110848.F30EA19992@mail.cse.psu.edu>
Date: Tue, 2 Jul 2002 12:09:33 +0100
Topicbox-Message-UUID: bf6da6a4-eaca-11e9-9e20-41e7f4b1d025

>>rog@vitanuova.com wrote:
>> the benefits of converting to utf-8 throughout were obvious. does the
>> ability to type ' ' rather than (say) ALT-' ' really confer comparable
>> advantages?

>The benefits of any feature depend on whether you make use of them.
>People living in an ASCII world get no particular benefit from UTF8,
>and people with spaces in file names (e.g. on a Windows filesystem)
>get substantial benefit from having their files handled properly.

no one is suggesting, least of all roger, that nothing be done to allow
access to files between Plan 9 and systems that have spaces in names.
the disagreement is about the scope, and the means.

>>In any case, I agree that blanks are here to stay and I'd like Plan 9
>>to handle then as nicely as it handles .

it's not just spaces. i have had to handle / as well, for instance.
that might not be of interest to some, but it has occurred.
from an end user's point of view, it seems perfectly reasonable to me.

i'd also pick out something from a previous comment:

>>Plan 9 to Unicode and UTF: too hard, too much code to change, too many
>>symmetries broken. But there's no way this problem is as hard as that
>>conversion, and we handled that one just fine. All that's missing is

surely it was much easier to do once the problems of Unicode's original
8-bit representation pre-UTF had been dealt with:

	In August 1992, X-Open circulated a proposal for another
	UTF-like byte encoding of Unicode characters. Their major
	concern was that an embedded character in a file name (in
	particular a slash) could be part of an escape sequence in
	UTF and therefore confuse a traditional file system.

that single change to UTF made it more straightforward to work out
where the potential problems were, not least some older code could
no longer fail (as it would have done with the earlier proposal):

	p = strrchr(filename, '/');

for instance. prior to that, each such instance needed to be tracked
down and examined, assuming (on non-Plan9 systems) that source was
available. of course, larger changes were required to tools that
needed to support Unicode well (regexp, tr, wc, and so on). still,
with many potential possibilities for mechanical confusion eliminated
at a stroke, i'd say it instantly made the idea attractive. there
were not as many problems to worry about.

i do think the support for quoting is useful for many things (roger
added support for quoting to Inferno's String module several years
ago for just that reason), but i'm not sure myself it's the right or
sufficient solution for file names.
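
to make the strrchr point concrete, here is a minimal sketch in C. it assumes nothing beyond the standard library. the names basename_of and check_basename_examples are made up for illustration, and the second byte string is a hypothetical encoding (not a faithful rendering of any earlier UTF proposal) in which a byte of a multi-byte character may equal 0x2F. in UTF-8 every byte of a multi-byte sequence has its high bit set, so a byte-wise search for '/' can only land on a real slash.

```c
#include <assert.h>
#include <string.h>

/* Return the text after the last '/' byte, or the whole path if there is none. */
static const char *
basename_of(const char *path)
{
	const char *p = strrchr(path, '/');
	return p ? p + 1 : path;
}

/* Illustrative checks; the byte strings are assumptions chosen for the example. */
static void
check_basename_examples(void)
{
	/* UTF-8: e-acute is the two bytes 0xC3 0xA9, both >= 0x80 */
	const char utf8[] = "dir/caf\xC3\xA9.txt";
	/* hypothetical encoding: one character stored as the bytes 0xA0 0x2F */
	const char other[] = "dir/\xA0\x2Fx";

	/* the only '/' byte is the real separator, so the split is correct */
	assert(strcmp(basename_of(utf8), "caf\xC3\xA9.txt") == 0);

	/* the search lands on the 0x2F inside the character and splits it */
	assert(strcmp(basename_of(other), "x") == 0);
}
```

calling check_basename_examples() from main runs both checks. the second one passes only because the old byte-wise code gives the wrong answer for that encoding, which is the kind of failure that had to be hunted down instance by instance.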