9fans - fans of the OS Plan 9 from Bell Labs
* Re: [9fans] blanks in file names
@ 2002-07-02 11:09 forsyth
  2002-07-02 11:53 ` matt
  0 siblings, 1 reply; 103+ messages in thread
From: forsyth @ 2002-07-02 11:09 UTC (permalink / raw)
  To: 9fans

>>rog@vitanuova.com wrote:
>> the benefits of converting to utf-8 throughout were obvious.  does the
>> ability to type ' ' rather than (say) ALT-' ' really confer comparable
>> advantages?

>The benefits of any feature depend on whether you make use of them.
>People living in an ASCII world get no particular benefit from UTF8,
>and people with spaces in file names (e.g. on a Windows filesystem)
>get substantial benefit from having their files handled properly.

no one is suggesting, least of all roger, that nothing be done to
allow access to files between Plan 9 and systems that have spaces in names.
the disagreement is about the scope, and the means.

>>In any case, I agree that blanks are here to stay and I'd like Plan 9
>>to handle them as nicely as it handles .

it's not just spaces.  i have had to handle / as well, for instance.
that might not be of interest to some, but it has occurred.
from an end user's point of view, it seems perfectly reasonable to me.

i'd also pick out something from a previous comment:

>>Plan 9 to Unicode and UTF: too hard, too much code to change, too many
>>symmetries broken.  But there's no way this problem is as hard as that
>>conversion, and we handled that one just fine.  All that's missing is

surely it was much easier to do once the problems of Unicode's
original 8-bit representation pre-UTF had been dealt with:

	In August 1992, X-Open circulated a proposal for another UTF-like byte
	encoding of Unicode characters.  Their major concern was that an
	embedded character in a file name (in particular a slash) could be
	part of an escape sequence in UTF and therefore confuse a traditional
	file system.

that single change to UTF made it more straightforward to work out where
the potential problems were, not least because some older code could no longer fail (as it
would have done with the earlier proposal):
	p = strrchr(filename, '/');
for instance.  prior to that, each such instance needed to be tracked down
and examined, assuming (on non-Plan9 systems) that source was available.
of course, larger changes were required to tools that needed to support Unicode
well (regexp, tr, wc, and so on).   still, with many potential possibilities
for mechanical confusion eliminated at a stroke, i'd say it instantly made the
idea attractive.  there were not as many problems to worry about.
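
A minimal sketch of the property described above, in portable C rather than Plan 9 C. It assumes the file name is valid UTF-8, where every byte of a multi-byte sequence has its high bit set, so the byte 0x2F can only be a real '/'. The helper name `basename_utf8` is hypothetical and not taken from any library.

```c
#include <string.h>

/*
 * Return a pointer to the last path component of a UTF-8 file name.
 * Hypothetical helper, for illustration only.  In UTF-8 every byte
 * of a multi-byte sequence is >= 0x80, so strrchr cannot mistake
 * part of a non-ASCII character for the '/' separator.
 */
const char *
basename_utf8(const char *filename)
{
	const char *p = strrchr(filename, '/');

	return p != NULL ? p + 1 : filename;
}
```

With an encoding in which 0x2F may appear inside a multi-byte character, the same call could split a name in the middle of a character, which is the failure the X-Open concern was about.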

i do think the support for quoting is useful for many things (roger added support for quoting to
Inferno's String module several years ago for just that reason), but i'm not sure myself
it's the right or sufficient solution for file names.



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-15  4:03 Geoff Collyer
  2002-07-15 14:53 ` Jack Johnson
  2002-07-21 19:47 ` Dave
  0 siblings, 2 replies; 103+ messages in thread
From: Geoff Collyer @ 2002-07-15  4:03 UTC (permalink / raw)
  To: 9fans

My apologies for dragging this out, but I think this gets to the heart
of the matter.

The cost of strlen doesn't matter, it really doesn't matter.  I can't
recall seeing a real-life program where the cost of strlen was more
than trivia.

A quick comparison on Plan 9 shows that strlen runs at 19 times the
speed of DES, using the same 64-byte string as input each time, which
is almost certainly longer than the average string.  DES is really
slow, by design; it's not just a question of a few more memory
accesses; it's just hard to do the bit-swizzling quickly in software
(though it can be done relatively quickly in hardware).

One of the big achievements of Unix was to get people to stop worrying
about the microseconds and look at the slightly larger picture of what
could be done if your first concern were not microseconds, and that
was a quarter-century ago!  With processor cycles being so cheap and
available now, it's generally not worth worrying about expenses until
they cause a real (measurable, reproducible) problem.  Proposals based
on supposedly greater efficiency for a new open(2) interface are not
worth considering, particularly when open's efficiency isn't currently
a problem, and the new interface is ugly, incompatible with the
existing clean and simple one, and purports to solve non-problems like
allowing NUL and slash in file name components.


One downside of APE is that it's made some people think of Plan 9 as
Just Another Goddamned Unix Implementation.  If you're not interested
in exploring what's new in Plan 9, and are offended that realloc isn't
provably optimal or that GNU configure doesn't just run out of the box,
why are you using Plan 9?  FreeBSD, OpenBSD, NetBSD and Linux are all
available at no monetary cost.



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-14 18:28 rob pike, esq.
  0 siblings, 0 replies; 103+ messages in thread
From: rob pike, esq. @ 2002-07-14 18:28 UTC (permalink / raw)
  To: 9fans

> > strlen() is an expensive operation.
> 
> No, it's not terribly expensive.

The strlen() of this thread is certainly an expensive operation.

-rob



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-11 23:56 okamoto
  0 siblings, 0 replies; 103+ messages in thread
From: okamoto @ 2002-07-11 23:56 UTC (permalink / raw)
  To: 9fans

>The idea was born from the discussion between Dave and I.
>Dave proposed '\' escape.

No, both are completely misunderstanding the point that Rob made.
You can read his and nemo's earlier remarks once more.

Kenji



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-11  8:39 Geoff Collyer
  2002-07-14 18:13 ` Dave
  0 siblings, 1 reply; 103+ messages in thread
From: Geoff Collyer @ 2002-07-11  8:39 UTC (permalink / raw)
  To: 9fans

dave@dave.tj,

Have you actually used Plan 9 or even read its manual?  I ask because
you've made a succession of odd statements:

> strlen() is an expensive operation.

No, it's not terribly expensive.  Even if it were, most strings are
short, so it's not much of an issue.  DES encryption is expensive.

> realloc() sucks in a multithreaded environment.

You don't define ``sucks'', but the whole malloc family use internal
locks on Plan 9 so that they *do* work well (or at least don't corrupt
memory) in multiprocess or multithread programs.

> Also, I'd like to mention again that I'm not asking the kernel to
> allocate memory.

The kernel already allocates memory, including string memory.  What's
the big deal?



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-11  5:50 okamoto
  0 siblings, 0 replies; 103+ messages in thread
From: okamoto @ 2002-07-11  5:50 UTC (permalink / raw)
  To: 9fans

>Eh, I tire of the discussion

Yes, indeed.
We can refer to a statement Rob made some time ago.

Kenji



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-11  1:41 anothy
       [not found] ` <"anothy@cosym.net"@Jul>
  0 siblings, 1 reply; 103+ messages in thread
From: anothy @ 2002-07-11  1:41 UTC (permalink / raw)
  To: 9fans

on the "linked lists for paths instead of strings with /" topic:
while i can see how this could be interesting to pursue, from a
systems architecture research point of view, i'm unclear on how
exactly this helps us solve any of the problems people are
talking about. maybe i'm just missing something; if so, please
clue me in. but here's what i don't see:

first, it seems your suggestion is primarily aimed at allowing
"/" in file names (perhaps some argument about generality or
interoperation with DOS-derived systems and "\" could be made,
as well). okay, i can see the theoretical benefit there (but it
strikes me as a much more dubious gain than spaces in file
names). but what would this system _look_ like, from a user's
point of view? what would the output of `pwd' be? a list of
strings? but how can i store a representation of that? wouldn't
that involve lots of interaction with the shell, or whatever
else was doing the interpretation (like $path does, although
that's used in very few places, and is very function-specific,
so isn't really much of an issue)?

second, i'm just not seeing at all how this helps us address
spaces in file names (and maybe it wasn't meant to; this
thread has wandered a bit). the difficult part of the problem
(perhaps the entirety of the problem) seems to lie in getting
the various tools that need to communicate to understand the
edges of file names: things like "grep `{ls} foo". i fail to
see how any of the kernel changes (or library additions) you
propose even talk to this realm of problem.

third: want newline too?

on another embedded thread, you wrote:
// You can do a lot of things if you're prepared to get
// involved in the functions that your OS should be doing
// automatically.  Try running an FTP mirror to a busy site
// that way, though, and you'll quickly discover...

that you've completely missed what dan was saying? he (and
Nemo before him) was talking about the ability of Plan 9's
dump to back out changes to kernel, library, and utilities
uniformly. i don't have a clue what FTP or HTTP mirrors or
"things your OS should do automatically" have to do with
any of this. are we just having unrelated conversations at
each other?
ア


^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-10 18:27 David Gordon Hogan
  2002-07-10 20:56 ` arisawa
  0 siblings, 1 reply; 103+ messages in thread
From: David Gordon Hogan @ 2002-07-10 18:27 UTC (permalink / raw)
  To: 9fans

Isn't this thread dead yet?



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-10  8:00 Fco.J.Ballesteros
  0 siblings, 0 replies; 103+ messages in thread
From: Fco.J.Ballesteros @ 2002-07-10  8:00 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 277 bytes --]

Really, I make changes now and then; many times after
trying the resulting binaries I change my mind and
use yesterday + cp to restore the source back to its
previous state. Some other times I bind temporary directories
on top of the sources and make the changes there.


[-- Attachment #2: Type: message/rfc822, Size: 2260 bytes --]

From: Dave <dave@dave.tj>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] blanks in file names
Date: Tue, 09 Jul 2002 11:23:25 -0400 (EDT)
Message-ID: <200207091523.g69FNP431828@dave2.dave.tj>

You're not going to do that every time you make a change to the
filesystem.  Besides, that won't undo all the "changes" to all the
new programs that lack complexity because they don't have to reinvent
the wheel.

 - Dave


Fco.J.Ballesteros wrote:
> 
> >> But above all, I will undo the changes made in this respect to my
> >> local system if you guys or the system designers choose a different way.
> ...
> > Undoing kernel-level changes won't be easy, especially when people start
> 
> 9fs dump
> cp blah blah
> 
> Sorry, couldn't resist. I just love this system :-)
> 

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-09  7:54 Fco.J.Ballesteros
  0 siblings, 0 replies; 103+ messages in thread
From: Fco.J.Ballesteros @ 2002-07-09  7:54 UTC (permalink / raw)
  To: 9fans

>> But above all, I will undo the changes made in this respect to my
>> local system if you guys or the system designers choose a different way.
...
> Undoing kernel-level changes won't be easy, especially when people start

9fs dump
cp blah blah

Sorry, couldn't resist. I just love this system :-)



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-09  1:08 okamoto
  0 siblings, 0 replies; 103+ messages in thread
From: okamoto @ 2002-07-09  1:08 UTC (permalink / raw)
  To: 9fans

>if you are writing in english, not necessarily. yes if you are writing in
>japanese. in that case, it should have seventeen (five/seven/five) *onji*
>which is not a syllable, but a sound symbol. 

Aa!  Now I understand the difference between 俳句 and Haiku.
I'll show you a 俳句 example, which is one of the most famous ones
written for this season by 芭蕉 (Bashou), posted before.

古池や蛙とびこむ水の音

Above is written using Hiragana and Kanji, which is the Japanese writing
style inherited from the Heian era and invented by Japanese women..., anyway
it should be read as

ふるいけや かわず とびこむ みずのおと

Now, you count 5+7+5 hiragana, which are the shortest reading units
(syllables?) in Japanese.  Here, I intentionally put a blank between 5+7+5,
but it is not used formally.

Another thing: in 俳句 we need some key word which implies the season
that the author wants to express.   In the above example, it's 蛙 (frog).
Frogs appear to us from early summer.  So, I felt that the English version
of Haiku which someone showed here is not 俳句, but 川柳 (senryu).

Sorry for making noise.

Kenji -- Enjoying UTF-8 capability of Plan 9



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-08 23:19 David Gordon Hogan
  2002-07-08 23:30 ` Dave
  0 siblings, 1 reply; 103+ messages in thread
From: David Gordon Hogan @ 2002-07-08 23:19 UTC (permalink / raw)
  To: 9fans

>> I thought the making of saints wasn't exactly in the kernel's job
>> description.
> 
> Naw, making saints is "canonization".  Making artillery is a perfectly
> good kernel task.

Such kernels may, however, be subject to export restrictions...



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-08 20:22 rob pike, esq.
  2002-07-08 21:21 ` Dave
  0 siblings, 1 reply; 103+ messages in thread
From: rob pike, esq. @ 2002-07-08 20:22 UTC (permalink / raw)
  To: 9fans

> changing open(2), execve(2), etc. is
> a requirement if we want a fundamental solution to the problem. 

Fundamental solutions should be applied only to fundamental problems.
This is not a fundamental problem; far from it.  The means should be
proportionate to the ends.

> > Don't you think that changing open, considering '/' in names, and
> > similar stuff is just too much? 

Yes, it is, because to do so contradicts too many conventions that are
integral to the system.

-rob



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-08 12:18 forsyth
       [not found] ` <"forsyth@caldo.demon.co.uk"@Jul>
  0 siblings, 1 reply; 103+ messages in thread
From: forsyth @ 2002-07-08 12:18 UTC (permalink / raw)
  To: 9fans

>>Finally, it gives us the capability of getting away from even the most
>>elementary requirements in a filesystem (like inodes) at some point in
>>the future without extensive code changes.  Basically, it's all about

Plan 9 hasn't really got assumptions about inodes.  9P doesn't use them.
(makes it tricky to do NFS, as it happens, though that's not Plan 9's fault
this time.)



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-08  8:59 Fco.J.Ballesteros
       [not found] ` <Fco.J.Ballesteros@Jul>
  0 siblings, 1 reply; 103+ messages in thread
From: Fco.J.Ballesteros @ 2002-07-08  8:59 UTC (permalink / raw)
  To: 9fans

I think we've lost focus. I'm scared.

The whole point was to handle characters which are allowed but may
cause problems like ' ' and ''''.  And AFAIK, the problem was that we
get them from outside systems, and that people may be so used to them
that they may even start to use them on native Plan 9 files.

Don't you think that changing open, considering '/' in names, and
similar stuff is just too much?  I'm scared.  That may be interesting,
but it would lead to a very different system.

Moreover, do you think that the system designers would ever consider
'/' as a legitimate character within file names?  Although I don't
know, I'd bet they'll never do that (at least I have to say I would
never do it, sorry).

I think I'm just going to try option bⁱ myself and then send a diff
in case it works out.

But above all, I will undo the changes made in this respect to my
local system if you guys or the system designers choose a different way.
It's a very nice system and I wouldn't like to get N different ones
nor to break it.


-- 
ⁱ Option b was: remove quoting from ls et al, add it to those that
print commands, and fix those that don't consider ' ' in file names.



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-08  0:38 Scott Schwartz
  0 siblings, 0 replies; 103+ messages in thread
From: Scott Schwartz @ 2002-07-08  0:38 UTC (permalink / raw)
  To: 9fans

| If '/' is prohibited as an element of file name and directory name,
| then no change to open is required.
| 
| Let's assume we accept '/' as an element of names,
| then how do you express path in rc?
 
cd (stl list/vector foo)

I think this discussion is verging on reinventing lisp, where you have a
well specified format for textually representing data, standard primitives
for reading and writing it, and a set of datatypes to store it in.



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-07  5:59 Geoff Collyer
  0 siblings, 0 replies; 103+ messages in thread
From: Geoff Collyer @ 2002-07-07  5:59 UTC (permalink / raw)
  To: 9fans

I had thought that one of my sicker ideas was too ridiculous to
suggest, but perhaps not, given the precedent:

A pair of kernel devices (or a single device with an encode or decode
indicator) to translate hURL-encoded file names into plan 9 file
names, and vice versa.  After translation, they are passed to namec.
This lets you use all the glorious botches that have been invented for
hURL (or is it URI?) encoding.

	#:dfile://localhost/my%20file
and
	#:dmy%20file

would map to "my file".  In the presence of a file server that
understands the encoded form, one could do the reverse mapping:
"#:emy file" should probably map to

	my%20file

A further benefit of using the encoded form of names in files is that
programs can then guess fairly reliably which are file names: a field
starting with "file://localhost/" is likely to be a file name.
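
A rough sketch of the "decode" direction only, in portable C. The function name `urldecode` and its error convention are assumptions made for illustration; it does not strip a `file://localhost/` prefix, and it is not part of any proposed kernel device.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit; the caller guarantees isxdigit(c). */
static int
hexval(int c)
{
	if (isdigit(c))
		return c - '0';
	return tolower(c) - 'a' + 10;
}

/*
 * Decode %XX escapes from src into dst (capacity n, including the
 * terminating NUL).  Returns 0 on success, or -1 on a malformed
 * escape, an encoded NUL, or a dst that is too small.
 * Hypothetical helper, for illustration only.
 */
int
urldecode(char *dst, size_t n, const char *src)
{
	size_t i = 0;

	while (*src != '\0') {
		int c = (unsigned char)*src++;

		if (c == '%') {
			if (!isxdigit((unsigned char)src[0]) ||
			    !isxdigit((unsigned char)src[1]))
				return -1;
			c = hexval((unsigned char)src[0]) * 16 +
			    hexval((unsigned char)src[1]);
			src += 2;
			if (c == 0)
				return -1;	/* %00 would truncate the name */
		}
		if (i + 1 >= n)
			return -1;	/* keep room for the NUL */
		dst[i++] = (char)c;
	}
	if (i >= n)
		return -1;
	dst[i] = '\0';
	return 0;
}
```

Decoding `my%20file` this way yields `my file`. Note that `%2F` decodes to a '/', so a decoder used for single path components would still have to decide what to do with it.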


I'm not seriously suggesting using awful web syntax, but perhaps the
general idea suggests a way forward.



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-05 19:21 David Gordon Hogan
  2002-07-05 19:52 ` Jim Choate
  2002-07-05 20:10 ` Mark Bitting
  0 siblings, 2 replies; 103+ messages in thread
From: David Gordon Hogan @ 2002-07-05 19:21 UTC (permalink / raw)
  To: 9fans

>>>> Change space! Change the file delimiter!
>>>> The shell will never recover.  The system will break.
>>>> I will mourn.
>>> 
>>> Isn't a haiku supposed to have 17 syllables?
>> 
>> Some would dispute the imposition of such an absolute rule.
> 
> Is that why your message has only 16 syllables?

Word.



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-05 18:26 Sape Mullender
  0 siblings, 0 replies; 103+ messages in thread
From: Sape Mullender @ 2002-07-05 18:26 UTC (permalink / raw)
  To: 9fans

>>> Change space! Change the file delimiter!
>>> The shell will never recover.  The system will break.
>>> I will mourn.
>> 
>> Isn't a haiku supposed to have 17 syllables?
> 
> Some would dispute the imposition of such an absolute rule.

Is that why your message has only 16 syllables?



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-05 18:23 David Gordon Hogan
  0 siblings, 0 replies; 103+ messages in thread
From: David Gordon Hogan @ 2002-07-05 18:23 UTC (permalink / raw)
  To: 9fans

>> Change space! Change the file delimiter!
>> The shell will never recover.  The system will break.
>> I will mourn.
> 
> Isn't a haiku supposed to have 17 syllables?

Some would dispute the imposition of such an absolute rule.



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-05  1:21 okamoto
  0 siblings, 0 replies; 103+ messages in thread
From: okamoto @ 2002-07-05  1:21 UTC (permalink / raw)
  To: 9fans

古池や蛙飛び込む水の音

furu ike ya kawazu tobikomu mizuno oto

Hmmm, it seems to be longer than 17 characters. :-)
Sorry ;_;

Kenji



^ permalink raw reply	[flat|nested] 103+ messages in thread
[parent not found: <20020703160003.27491.58783.Mailman@psuvax1.cse.psu.edu>]
* Re: [9fans] blanks in file names
@ 2002-07-04 12:26 Fco.J.Ballesteros
  0 siblings, 0 replies; 103+ messages in thread
From: Fco.J.Ballesteros @ 2002-07-04 12:26 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 702 bytes --]

I'm sorry if in any of my previous mails I did sound
like `Rob did it wrong' or something like that.
I didn't mean that at all, and apologize if that
was what I actually wrote.

I think we all now agree that
-> translating in the mount driver is a bad idea.
-> translating in the servers might work, but we're not sure.
-> %q is a good thing, but,
-> ls et al shouldn't use %q (unless other programs change their conventions
  to read file names and digest the quotes properly).

I'm sorry my diff triggered this, but in any case I think we
all are trying to find a way, and more discussions like this will
probably follow.

thank you all; this has been instructive, at least for me.

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-04 12:20 forsyth
  0 siblings, 0 replies; 103+ messages in thread
From: forsyth @ 2002-07-04 12:20 UTC (permalink / raw)
  To: 9fans

>>So we need a uniform set of conventions, and make all the programs
>>obey them. And that's not the case right now. That's all I was saying
>>regarding %q.

yes, but to be fair, i believe that is the `making the effort' (if i remember the phrase correctly)
that rob mentioned in a previous message!



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-04 11:37 Fco.J.Ballesteros
  0 siblings, 0 replies; 103+ messages in thread
From: Fco.J.Ballesteros @ 2002-07-04 11:37 UTC (permalink / raw)
  To: 9fans

:  i don't think %q per se is a bad thing, but i don't think filenames
:  should require it and hence i don't think tools like ls, pwd, etc
:  should output filenames using it.

agree



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-04 11:36 rog
  0 siblings, 0 replies; 103+ messages in thread
From: rog @ 2002-07-04 11:36 UTC (permalink / raw)
  To: 9fans

> Perhaps all this discussion is a symptom that a different convention to
> name files is needed. For example, if file names were always quoted
> (need they to be or not), all programs could rely on a simple set of
> rules to handle file names.
>
> Is there agreement on this? Or is something else I'm also missing?

i don't think %q per se is a bad thing, but i don't think filenames
should require it and hence i don't think tools like ls, pwd, etc
should output filenames using it.

then we're back to where we started (a comfortable place to be IMHO)
except that we have a nice, easily available convention for
serialising arguments containing spaces, and we can talk to alien
filesystems that expect spaces in their names.

  rog.



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-04  9:50 Fco.J.Ballesteros
  0 siblings, 0 replies; 103+ messages in thread
From: Fco.J.Ballesteros @ 2002-07-04  9:50 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 169 bytes --]

So we need a uniform set of conventions, and make all the programs
obey them. And that's not the case right now. That's all I was saying
regarding %q.

thanks



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-04  9:41 forsyth
  0 siblings, 0 replies; 103+ messages in thread
From: forsyth @ 2002-07-04  9:41 UTC (permalink / raw)
  To: 9fans

>>Is there agreement on this? Or is something else I'm also missing?

not exactly.  i think there's an implication though about the uniform
implementation of quoting rules throughout the system so that
	read(filename)
	open(dirname + "/" + filename)
would work because `filename' and `dirname' etc haven't any quotes,
because lines read from standard input (say) have been parsed.
some programs do so already but the conventions vary from place to place.
essentially, strings inside programs are in their parsed form and
strings read and written by a program are expected to be suitably quoted.
thus
	ls | mumble
would work because mumble will apply the standard rules (might be as simple
as calling tokenize consistently) to each line of its input, and thus unquote it
to reveal the original file names.

i say it's not just file names particularly; for instance, input to a program might
be fields
	when where howmuch why who
allowing
	15/9/1660 London 4d 'First Cup of Tea' 'Saml Pepys'
and with suitable changes comm, sort, etc. could be applied in obvious ways.
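
To make the convention concrete, here is a sketch of a field splitter that unquotes rc-style single quotes, written in portable C. It is not the Plan 9 library routine; the name `tokenize_rc` and its exact behaviour are assumptions for illustration.

```c
/*
 * Split s in place into at most max fields separated by blanks or
 * tabs.  Single quotes group characters rc-style; inside a quoted
 * region '' stands for one literal quote.  Returns the number of
 * fields stored in args.  Hypothetical sketch, not the Plan 9
 * library routine.
 */
int
tokenize_rc(char *s, char **args, int max)
{
	int n = 0;

	while (n < max) {
		char *w;
		int quoted = 0;

		while (*s == ' ' || *s == '\t')
			s++;
		if (*s == '\0')
			break;
		args[n++] = w = s;
		for (; *s != '\0'; s++) {
			if (*s == '\'') {
				if (quoted && s[1] == '\'')
					*w++ = *++s;	/* '' becomes ' */
				else
					quoted = !quoted;
			} else if (!quoted && (*s == ' ' || *s == '\t')) {
				break;
			} else {
				*w++ = *s;
			}
		}
		if (*s != '\0')
			s++;	/* step past the separator */
		*w = '\0';	/* terminate the unquoted field */
	}
	return n;
}
```

Applied to the example line above, this gives five fields, the last two being `First Cup of Tea` and `Saml Pepys` with the quotes removed, so the strings inside the program are in their parsed form.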



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-04  8:31 Fco.J.Ballesteros
  0 siblings, 0 replies; 103+ messages in thread
From: Fco.J.Ballesteros @ 2002-07-04  8:31 UTC (permalink / raw)
  To: 9fans

> it would be good to have some simple but consistent conventions for quoting data,
> not particularly file names.  it's true that it would probably affect many commands
> (sort, sed, awk etc) but i think it might be worthwhile.

I'd like the file names to be always the same. What I don't like about
%q is that some times you get /a/b and other times you get '/a/b c'.

This is a serious problem IMHO, since the "'" would make open fail.

I'd like

	read(filename)
	open(filename)

and
	read(filename)
	open(dirname + "/" + filename)


to keep on working, which is no longer the case after %q. (Didn't check
naming in the kernel recently, but I think these calls would fail).
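
The inconsistency can be illustrated with a sketch of conditional quoting in portable C. This is not the implementation of %q; the name `quote_name` and the set of characters that trigger quoting are assumptions, and the real verb may apply different rules.

```c
#include <stddef.h>
#include <string.h>

/*
 * Write name to dst (capacity n, including the NUL), wrapping it in
 * rc-style single quotes only when it is empty or contains a blank,
 * tab, newline or quote.  Embedded quotes are doubled.  Returns 0,
 * or -1 if dst is too small.  Hypothetical sketch of conditional
 * quoting, not the real %q.
 */
int
quote_name(char *dst, size_t n, const char *name)
{
	size_t i = 0;

	if (*name != '\0' && strpbrk(name, " \t\n'") == NULL) {
		if (strlen(name) >= n)
			return -1;
		strcpy(dst, name);	/* nothing to quote */
		return 0;
	}
	if (n < 3)
		return -1;	/* need room for '' and the NUL */
	dst[i++] = '\'';
	for (; *name != '\0'; name++) {
		size_t extra = (*name == '\'') ? 2 : 1;

		/* keep room for the closing quote and the NUL */
		if (i + extra + 2 > n)
			return -1;
		if (*name == '\'')
			dst[i++] = '\'';
		dst[i++] = *name;
	}
	dst[i++] = '\'';
	dst[i] = '\0';
	return 0;
}
```

Here `/a/b` comes out unchanged while `/a/b c` comes out as `'/a/b c'`. A reader that passes the second string straight to open is asking for a file whose name begins and ends with a quote character.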

Perhaps all this discussion is a symptom that a different convention to
name files is needed. For example, if file names were always quoted
(need they to be or not), all programs could rely on a simple set of
rules to handle file names.

Is there agreement on this? Or is something else I'm also missing?



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-04  8:22 forsyth
  0 siblings, 0 replies; 103+ messages in thread
From: forsyth @ 2002-07-04  8:22 UTC (permalink / raw)
  To: 9fans

>>:  should be the character in the name, the quoting approach seems
>>:  the only one that works properly, and we have some work to do
>>:  to work out all the interactions with shell scanning (oh dear!), and changing

>>But it's not just the shell, almost *any* program reading file names would
>>have to deal with quoting. What has been a file name is changing after %q,

you stopped the quote just a little early:
	to work out all the interactions with shell scanning (oh dear!), and changing
 ->	(i suspect) the input conventions of quite a few commands, but still, it's a finite task.
but i ought to have said `the input [and output] conventions ...'

it would be good to have some simple but consistent conventions for quoting data,
not particularly file names.  it's true that it would probably affect many commands
(sort, sed, awk etc) but i think it might be worthwhile.

it's something that XML can't do properly for instance.
just amazing.  all that effort for a revolution, and botched at the start.



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-04  7:53 Fco.J.Ballesteros
  0 siblings, 0 replies; 103+ messages in thread
From: Fco.J.Ballesteros @ 2002-07-04  7:53 UTC (permalink / raw)
  To: 9fans

That's important since I may have names like
'a b' and 'a_b', both legal, and your translation
would get into trouble here.

> Hello,
> 
>>You can't do that because if a file contains a genuine
>>underscore, the outgoing software will think it's a space.
> the outgoing software will also think it's an underscore.
> Is this important?
> Some OSs already map 'Windows' to 'WINDOWS'.
> 
> Kenji Arisawa



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-04  7:47 Fco.J.Ballesteros
  0 siblings, 0 replies; 103+ messages in thread
From: Fco.J.Ballesteros @ 2002-07-04  7:47 UTC (permalink / raw)
  To: 9fans

forsyth:
:  the problem rob alluded to, by analogy with NAT, should not
:  arise within the Plan 9 system.  for instance, if i have a file
:  of file names, can i read it in and be sure to access those names?
:  if space is _ and _ is `boo!' it's anyone's guess.

But space is not _, space is space.
As I said in a previous mail, if you have a file of file names,
you can still read and use it. AFAIK, you can be sure to access those
names.

Nevertheless, I can understand your arguments for doing it at
boundaries and not at the kernel.

:  should be the character in the name, the quoting approach seems
:  the only one that works properly, and we have some work to do
:  to work out all the interactions with shell scanning (oh dear!), and changing

But it's not just the shell, almost *any* program reading file names would
have to deal with quoting. What has been a file name is changing after %q,
before that, a program would expect something like [/]a/b/..., that's no longer
the case.



^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [9fans] blanks in file names
@ 2002-07-04  6:34 forsyth
  2002-07-04  7:39 ` Lucio De Re
  0 siblings, 1 reply; 103+ messages in thread
From: forsyth @ 2002-07-04  6:34 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 3315 bytes --]

the advantage of pushing stuff like this back to the boundaries,
the points at which Plan 9 interacts with other systems and formats
(often using 9P), rather than building it into the heart of Plan 9, is that
at least the Plan 9 system itself can have straightforward internal rules.
at the boundaries you're bound to be aware that there
are differences (in practice).  for instance, you can create
names via dossrv that Windows itself cannot access properly
(left as an exercise for the reader), but aren't prohibited by
the FAT32 specification, or indeed, documented accurately
anywhere.  it's only a little better with 9660srv.

the problem rob alluded to, by analogy with NAT, should not
arise within the Plan 9 system.  for instance, if i have a file
of file names, can i read it in and be sure to access those names?
if space is _ and _ is `boo!' it's anyone's guess.

it's more understandable if, when problems do arise, they appear only at
the boundaries with other systems, where there are necessarily
some differences (remembering of course that any given difference might or
might not be strictly `necessary').

so far, it seems to me:

if Plan 9 allows space directly in the storage names, then space
should be the character in the name, the quoting approach seems
the only one that works properly, and we have some work to do
to work out all the interactions with shell scanning (oh dear!), and changing
(i suspect) the input conventions of quite a few commands, but still, it's a finite task.
the possible scale of that task--even understanding all the implications--
can i think be underestimated, which is why i snarl a bit at
people who are incredibly glib about it [which rob is not].

otherwise, Plan 9 should prohibit space in its file names, as
previously, and the user interface can use something like U+00A0 to
provide the functionality that at least some humans seem to think is
essential.  (computers don't give a dam whether names have spaces or
not.) at the boundaries with Windows or Mac or Unix that character can
be mapped to and fro, but there is already mapping of various sorts
taking place there.  (Windows itself has trouble enough with its
conventions; another little exercise: what is a file name separator
under Windows and how do you use it?) to return to the file name list
example, it's less surprising that Plan 9 can't take a list of files in
Windows style and access them directly.  that wouldn't work whether
space were allowed or not.   Plan 9 already does translate at
the boundaries between UTF-8 and whatever the other system uses.
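
A minimal sketch of that boundary mapping, in Python for brevity. The names `from_foreign` and `to_foreign` are hypothetical, and the sketch assumes the two conditions that make the mapping reversible: Plan 9 names never contain a plain space, and foreign names never contain U+00A0.

```python
FILE_SPACE = "\u00a0"  # U+00A0, the stand-in character suggested above


def from_foreign(name):
    """Incoming: foreign name -> internal name (space becomes U+00A0)."""
    if FILE_SPACE in name:
        # Assumption violated: the mapping would no longer be reversible.
        raise ValueError("foreign name already contains U+00A0")
    return name.replace(" ", FILE_SPACE)


def to_foreign(name):
    """Outgoing: internal name -> foreign name (U+00A0 becomes space)."""
    if " " in name:
        # Assumption violated: internal names are meant to be space-free.
        raise ValueError("internal name contains a plain space")
    return name.replace(FILE_SPACE, " ")


internal = from_foreign("My Documents")
external = to_foreign(internal)
```

Unlike the underscore scheme, nothing legal on either side is shadowed, provided the two assumptions hold; the checks mark the places where a real server would have to decide what to do when they don't.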

the latter approach handles not just space but also some other
characters such as `/' that turn up at the boundaries, but cannot be
handled by Plan 9.  again, some are glib about the need for it,
and i snarl again.  even so, pragmatically there's no need to use one technique
for everything if space is regarded as utterly different from all
other such characters.  space could be used as itself within the system
and we can map `/' as and when we find it.

it's also worth noting that having consistent quoting conventions to allow spaces
in space-separated input is probably worthwhile on its own, so not all
the work in Plan A will vanish.  don't quit that editor just yet.

[-- Attachment #2: Type: message/rfc822, Size: 1658 bytes --]

To: 9fans@cse.psu.edu
Subject: Re: [9fans] blanks in file names
Date: Thu,  4 Jul 2002 07:10:55 +0900
Message-ID: <20020703221118.15154199B7@mail.cse.psu.edu>

Hello,

>You can't do that because if a file contains a genuine
>underscore, the outgoing software will think it's a space.
the outgoing software will also think it's an underscore.
Is this important?
Some OSs already map 'Windows' to 'WINDOWS'.

Kenji Arisawa

* Re: [9fans] blanks in file names
@ 2002-07-03  8:00 Fco.J.Ballesteros
  2002-07-03 12:00 ` Lucio De Re
  0 siblings, 1 reply; 103+ messages in thread
From: Fco.J.Ballesteros @ 2002-07-03  8:00 UTC (permalink / raw)
  To: 9fans

rog@vitanuova.com :
:  /sys/src/cmd/dossrv
:  /sys/src/cmd/9660srv
:  /sys/src/cmd/tapefs
:  /sys/src/cmd/unix/u9fs
:  /sys/src/cmd/ftpfs
:  	convert "space char" to/from external actual space on create,
:  	walk, wstat, stat and directory reads.

One crazy idea I had was to do that translation in the mount driver.
That way the server would be happy to think that it uses space, and
the client plan 9 program would be happy to see 00A0 or whatever
without confusion with the space character.
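
A toy model of the idea, in Python. Nothing here is the real mount driver or 9P: `ToyServer` and `TranslatingMount` are hypothetical classes, and the three methods only stand in for the create, walk and directory-read operations rog listed.

```python
FILE_SPACE = "\u00a0"  # assumed stand-in character, as discussed


class ToyServer:
    """Stand-in for a server such as dossrv; it stores names with real spaces."""

    def __init__(self):
        self.files = {}

    def create(self, name, data=b""):
        self.files[name] = data

    def walk(self, name):
        return name in self.files

    def readdir(self):
        return sorted(self.files)


class TranslatingMount:
    """Hypothetical layer between client and server that rewrites names in transit."""

    def __init__(self, server):
        self.server = server

    def _out(self, name):  # client -> server
        return name.replace(FILE_SPACE, " ")

    def _in(self, name):   # server -> client
        return name.replace(" ", FILE_SPACE)

    def create(self, name, data=b""):
        self.server.create(self._out(name), data)

    def walk(self, name):
        return self.server.walk(self._out(name))

    def readdir(self):
        return [self._in(n) for n in self.server.readdir()]


srv = ToyServer()
srv.create("My Documents")            # created on the foreign side
mnt = TranslatingMount(srv)
mnt.create("new" + FILE_SPACE + "file")  # created by a Plan 9 client
listing = mnt.readdir()
```

The server never sees U+00A0 and the client never sees a plain space, so the individual servers would not each need their own conversion code.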

lucio@proxima.alt.za : 
:  What I'm saying, is that I'd like to target a kernel that is entirely
:  delimiter agnostic and promote each user application in the same
:  direction as a long-term project.  In the interim, constructs that
:  cast delimiters in stone should be removed wherever possible.

IMHO, the problem is mostly the user programs and not the kernel.
AFAIK, the kernel is fine if you don't use / and \0 as delimiters
(which seems reasonable to me, although some guys might want to use it too). 

But the tradition that blanks separate arguments is deeply embedded in
user programs, perhaps most notably the shell.

Assume the kernel has changed to use openv[], what would the shell
do to deal with spaces vs 00A0s ?
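
One possible answer, sketched in Python: the shell does nothing special, because U+00A0 is not one of the blanks its scanner splits on. `split_args` is a hypothetical, much simplified scanner (no quoting, no globbing), written only to show the difference.

```python
BLANKS = " \t\n"       # the separators a traditional scanner splits on
FILE_SPACE = "\u00a0"  # assumed stand-in for a blank inside a file name


def split_args(line):
    """Split a command line on ASCII blanks only."""
    args, cur = [], []
    for c in line:
        if c in BLANKS:
            if cur:
                args.append("".join(cur))
                cur = []
        else:
            cur.append(c)
    if cur:
        args.append("".join(cur))
    return args


with_space = split_args("cp my file /tmp")
with_00a0 = split_args("cp my" + FILE_SPACE + "file /tmp")
```

With a plain space, cp gets three arguments after the command name; with U+00A0 it gets two. The loop is written out by hand because Python's own `str.split()` with no argument treats U+00A0 as whitespace and would split on it, which is a small reminder of how many programs carry their own idea of what a blank is.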




* Re: [9fans] blanks in file names
@ 2002-07-02 18:14 rog
  2002-07-02 23:08 ` Dan Cross
  0 siblings, 1 reply; 103+ messages in thread
From: rog @ 2002-07-02 18:14 UTC (permalink / raw)
  To: 9fans

> 	- change the system to translate blanks in file names to the
> 	  same character.

it might be worth identifying the exact points that would need changing
should such a scheme be chosen.

/sys/src/cmd/dossrv
/sys/src/cmd/9660srv
/sys/src/cmd/tapefs
/sys/src/cmd/unix/u9fs
/sys/src/cmd/ftpfs
	convert "space char" to/from external actual space on create,
	walk, wstat, stat and directory reads.

/sys/src/9/port/latin1.c
	make it possible to have a single character ALT sequence
	(if ALT-space is what's desired).

>	- change the quote library so it does not quote that character.

it doesn't matter if it does...
however various places would need to change back so they didn't quote
chars (e.g.  ls, pwd?)

a suitable character would need to be chosen (e.g.  00A0), and it
would be nice if most fonts displayed it in a reasonably consistent
manner.

perhaps a new function could be added to libc that does the
conventional conversion from ' ' to the new space char so that
programs with GUI entry boxes that know that a string is being typed
in a filename context can trivially do the conversion.

what have i missed?



* [9fans] blanks in file names
@ 2002-07-02  9:53 Fco.J.Ballesteros
  0 siblings, 0 replies; 103+ messages in thread
From: Fco.J.Ballesteros @ 2002-07-02  9:53 UTC (permalink / raw)
  To: 9fans

After reading all the posts about this and thinking about it,
I think it's cleaner to use different characters for space and
for a blank within a file name, as someone said before.

The reason I think so is that it would simplify the library
and at the same time avoid problems with quoting. For example,
although replica uses %q, I ended up deleting a file named 'chk, because
somehow replica got confused. It's true that I could have tried
to avoid the confusion and fix the bug, but I think this suggests that
it's not so easy to quote things.

On the other hand, there are two places where blanks get into file names:
- from foreign systems
- locally created files

In both places it'd be easy for the user to type Alt-spc, perhaps
simpler than it'd be to write 'blah blah'.

Now, does anyone from the Labs think otherwise, and if so,
what's the reason? I'd like to learn from this mistake, if my current
view of the problem can be considered one.

In any case, I agree that blanks are here to stay and I'd like Plan 9
to handle them as nicely as it handles ☺ⁱ⁲.
thanks



ⁱ Why did ☺ get into unicode?
⁲ Why didn't :-( get in then? (Did they read TPOP?).




Thread overview: 103+ messages
2002-07-02 11:09 [9fans] blanks in file names forsyth
2002-07-02 11:53 ` matt
2002-07-02 13:29   ` Boyd Roberts
2002-07-02 14:57     ` FJ Ballesteros
2002-07-02 16:23       ` Lucio De Re
2002-07-03 19:21       ` rob pike, esq.
2002-07-03 14:31         ` FJ Ballesteros
2002-07-02 18:28   ` plan9
2002-07-03 13:54     ` arisawa
2002-07-03 14:24       ` FJ Ballesteros
2002-07-03 19:40       ` rob pike, esq.
2002-07-03 22:10         ` arisawa
2002-07-04  8:30       ` Ralph Corderoy
  -- strict thread matches above, loose matches on Subject: below --
2002-07-15  4:03 Geoff Collyer
2002-07-15 14:53 ` Jack Johnson
2002-07-21 19:52   ` Dave
2002-07-21 19:47 ` Dave
2002-07-14 18:28 rob pike, esq.
2002-07-11 23:56 okamoto
2002-07-11  8:39 Geoff Collyer
2002-07-14 18:13 ` Dave
2002-07-11  5:50 okamoto
2002-07-11  1:41 anothy
     [not found] ` <"anothy@cosym.net"@Jul>
2002-07-11  6:47   ` Dave
2002-07-10 18:27 David Gordon Hogan
2002-07-10 20:56 ` arisawa
2002-07-10  8:00 Fco.J.Ballesteros
2002-07-09  7:54 Fco.J.Ballesteros
2002-07-09  1:08 okamoto
2002-07-08 23:19 David Gordon Hogan
2002-07-08 23:30 ` Dave
2002-07-08 20:22 rob pike, esq.
2002-07-08 21:21 ` Dave
2002-07-08 23:27   ` Dan Cross
2002-07-08 23:30     ` Dan Cross
2002-07-08 12:18 forsyth
     [not found] ` <"forsyth@caldo.demon.co.uk"@Jul>
2002-07-08 20:42   ` Dave
2002-07-08  8:59 Fco.J.Ballesteros
     [not found] ` <Fco.J.Ballesteros@Jul>
2002-07-08 20:18   ` Dave
2002-07-09 15:23   ` Dave
2002-07-10 16:02   ` Dave
2002-07-10 20:59     ` FJ Ballesteros
2002-07-10 21:51       ` Dave
2002-07-10 22:22         ` Dan Cross
2002-07-10 23:01           ` Dave
2002-07-11  2:00             ` Dan Cross
2002-07-11  6:14               ` Dave
2002-07-11  6:38                 ` Lucio De Re
2002-07-14 18:00                   ` Dave
2002-07-11 13:14                 ` arisawa
2002-07-12 12:28                   ` arisawa
2002-07-11 16:23                 ` Dan Cross
2002-07-11 10:43             ` Ish Rattan
2002-07-14 18:49               ` Dave
2002-07-08  0:38 Scott Schwartz
2002-07-07  5:59 Geoff Collyer
2002-07-05 19:21 David Gordon Hogan
2002-07-05 19:52 ` Jim Choate
2002-07-05 20:10 ` Mark Bitting
2002-07-05 18:26 Sape Mullender
2002-07-05 18:23 David Gordon Hogan
2002-07-05  1:21 okamoto
     [not found] <20020703160003.27491.58783.Mailman@psuvax1.cse.psu.edu>
2002-07-04 23:35 ` Andrew Simmons
2002-07-04 22:42   ` Sam
2002-07-04 22:44     ` Sam
2002-07-08 16:14   ` ozan s yigit
2002-07-04 12:26 Fco.J.Ballesteros
2002-07-04 12:20 forsyth
2002-07-04 11:37 Fco.J.Ballesteros
2002-07-04 11:36 rog
2002-07-04  9:50 Fco.J.Ballesteros
2002-07-04  9:41 forsyth
2002-07-04  8:31 Fco.J.Ballesteros
2002-07-04  8:22 forsyth
2002-07-04  7:53 Fco.J.Ballesteros
2002-07-04  7:47 Fco.J.Ballesteros
2002-07-04  6:34 forsyth
2002-07-04  7:39 ` Lucio De Re
2002-07-04  9:32   ` Nikolai SAOUKH
2002-07-03  8:00 Fco.J.Ballesteros
2002-07-03 12:00 ` Lucio De Re
2002-07-03 19:39   ` rob pike, esq.
2002-07-07  4:02     ` Dave
2002-07-07  5:17       ` arisawa
     [not found]         ` <"arisawa@ar.aichi-u.ac.jp"@Jul>
2002-07-07  5:38           ` Dave
2002-07-07  6:04             ` arisawa
2002-07-07  7:16               ` arisawa
2002-07-07 16:11           ` Dave
2002-07-07 16:12           ` Dave
2002-07-10 21:58           ` Dave
2002-07-10 22:38             ` arisawa
2002-07-11  5:10           ` Dave
2002-07-14 18:32           ` Dave
2002-07-14 18:51             ` Jim Choate
2002-07-14 23:27             ` arisawa
2002-07-08  9:48       ` Boyd Roberts
2002-07-08 20:22         ` Dave
2002-07-09  8:24           ` Boyd Roberts
2002-07-09 15:25             ` Dave
2002-07-08 23:05         ` Berry Kercheval
2002-07-02 18:14 rog
2002-07-02 23:08 ` Dan Cross
2002-07-02  9:53 Fco.J.Ballesteros
