9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] kfs un-removable file
@ 2003-10-27 13:07 steve-simon
  2003-10-27 13:35 ` Russ Cox
  0 siblings, 1 reply; 6+ messages in thread
From: steve-simon @ 2003-10-27 13:07 UTC
  To: 9fans

Hi,

Maybe I should just ignore this but I have managed to
create a file that I cannot remove (under kfs).

I have tried 
	disk/kfscmd 'remove /usr/steve/work/pcp/e€€B'
and 
	disk/kfscmd 'clri /usr/steve/work/pcp/e€€B'
but both of them error with "can't walk e€€B"

NB the above were cut and pasted, so the runes are correct.

term% ls -lq
(000000000000f6c5 1 00) --rw-r--r-- M 8 steve steve 0 Sep 18 12:56 e€€B
term% ls | xd -c
0000000   e c2 80 c2 80  B \n
0000007 


Anyone any ideas how I can get rid of this file?

-Steve



* Re: [9fans] kfs un-removable file
  2003-10-27 13:07 [9fans] kfs un-removable file steve-simon
@ 2003-10-27 13:35 ` Russ Cox
  2003-10-27 15:15   ` C H Forsyth
  2003-10-27 16:02   ` rog
  0 siblings, 2 replies; 6+ messages in thread
From: Russ Cox @ 2003-10-27 13:35 UTC
  To: 9fans

> Maybe I should just ignore this but I have managed to
> create a file that I cannot remove (under kfs).
>
> I have tried
> 	disk/kfscmd 'remove /usr/steve/work/pcp/e€€B'
> and
> 	disk/kfscmd 'clri /usr/steve/work/pcp/e€€B'
> but both of them error with "can't walk e€€B"
>
> NB the above were cut and pasted, so the runes are correct.

I doubt that.  C2 80 is UTF for 0x80, the error rune.
When any of the UTF routines process a bad UTF sequence, they
replace it with the error rune.  So what's really happening,
probably, is that kfs is giving you bad data (not UTF) and
ls is coping.
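
A minimal sketch of the substitution described above, in portable C rather
than Plan 9 C. The names decode, encode, and sanitize are hypothetical, and
the code is only loosely modelled on the behaviour described here: every
byte that does not start a well-formed sequence of at most three bytes
becomes the error rune 0x80, which is then written back out as C2 80.

```c
#include <stddef.h>

enum { RUNEERROR = 0x80 };	/* the error rune named in this thread */

/*
 * Decode one sequence from s (n bytes available, n >= 1) into *r.
 * Returns the number of bytes consumed. A malformed or overlong
 * sequence consumes exactly one byte and yields RUNEERROR.
 */
static size_t
decode(const unsigned char *s, size_t n, unsigned *r)
{
	if(s[0] < 0x80){
		*r = s[0];
		return 1;
	}
	if((s[0] & 0xE0) == 0xC0 && n >= 2 && (s[1] & 0xC0) == 0x80){
		*r = (s[0] & 0x1F) << 6 | (s[1] & 0x3F);
		if(*r >= 0x80)
			return 2;
	}else if((s[0] & 0xF0) == 0xE0 && n >= 3
	&& (s[1] & 0xC0) == 0x80 && (s[2] & 0xC0) == 0x80){
		*r = (s[0] & 0x0F) << 12 | (s[1] & 0x3F) << 6 | (s[2] & 0x3F);
		if(*r >= 0x800)
			return 3;
	}
	*r = RUNEERROR;
	return 1;
}

/* Encode rune r (at most 0xFFFF) into out; returns bytes written. */
static size_t
encode(unsigned r, unsigned char *out)
{
	if(r < 0x80){
		out[0] = r;
		return 1;
	}
	if(r < 0x800){
		out[0] = 0xC0 | r >> 6;
		out[1] = 0x80 | (r & 0x3F);
		return 2;
	}
	out[0] = 0xE0 | r >> 12;
	out[1] = 0x80 | ((r >> 6) & 0x3F);
	out[2] = 0x80 | (r & 0x3F);
	return 3;
}

/*
 * Copy in[0..n) to out, replacing bad bytes with the error rune.
 * out must have room for 2*n bytes. Returns the output length.
 */
size_t
sanitize(const unsigned char *in, size_t n, unsigned char *out)
{
	size_t i, o;
	unsigned r;

	i = o = 0;
	while(i < n){
		i += decode(in + i, n - i, &r);
		o += encode(r, out + o);
	}
	return o;
}
```

Under these assumptions the four bytes 65 ff ff 42 come out as
65 c2 80 c2 80 42, the same bytes xd showed. A name that genuinely
contains C2 80 produces identical output, so the listing alone cannot
tell the two cases apart.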

If you write a program to walk through the directory manually
and call remove on the offending file, you should be fine.
Just avoid all the utf routines (like print!).

Of course, kfs might check that names you give it are okay
UTF, but off the top of my head I don't think it does.

Russ



* Re: [9fans] kfs un-removable file
  2003-10-27 13:35 ` Russ Cox
@ 2003-10-27 15:15   ` C H Forsyth
  2003-10-27 16:15     ` ron minnich
  2003-10-27 16:02   ` rog
  1 sibling, 1 reply; 6+ messages in thread
From: C H Forsyth @ 2003-10-27 15:15 UTC
  To: 9fans

>>If you write a program to walk through the directory manually
>>and call remove on the offending file, you should be fine.
>>Just avoid all the utf routines (like print!).

be sure to call it `dsw'




* Re: [9fans] kfs un-removable file
  2003-10-27 13:35 ` Russ Cox
  2003-10-27 15:15   ` C H Forsyth
@ 2003-10-27 16:02   ` rog
  2003-10-27 16:06     ` Scott Schwartz
  1 sibling, 1 reply; 6+ messages in thread
From: rog @ 2003-10-27 16:02 UTC
  To: 9fans

> I doubt that.  C2 80 is UTF for 0x80, the error rune.
> When any of the UTF routines process a bad UTF sequence, they
> replace it with the error rune.  So what's really happening,
> probably, is that kfs is giving you bad data (not UTF) and
> ls is coping.

that's not necessarily the case.

we've got some files on our filesystem (an old style fileserver) that
have C2 80 sequences in them, and they seem to be unremovable.  a
direct stat on the files still gives the c2-80 sequences (as far as i
can see convM2D doesn't do any utf conversions, so it shouldn't be
necessary to look at the raw dir format)

not only are the files non-removable, several of them have a few
duplicates.

after a little experimentation, it seems that the fileserver (and
presumably kfs too) doesn't check utf consistency on input, but does
convert utf chars on output (mind you, it's not obvious from a quick
check in the source).

in fact, in the example i just tried, i did similar to:

	char buf[] = "/tmp/yyXz";
	/* replace the 'X' with a byte that can never appear in valid utf */
	buf[7] = 0xff; create(buf, OWRITE, 0666);
	buf[7] = 0xfd; create(buf, OWRITE, 0666);

both creates succeeded, and i now have two unremovable
files in my /tmp (oops).

cat /tmp | xd -c

shows that the filenames of each are identical (in this
case they're each exactly "yy").

i'd suggest that perhaps it'd be a good idea for the fileserver
to canonicalise names on creation: quite apart from
invalid utf sequences, aren't there several possible
utf sequences that can validly map to the same character?




* Re: [9fans] kfs un-removable file
  2003-10-27 16:02   ` rog
@ 2003-10-27 16:06     ` Scott Schwartz
  0 siblings, 0 replies; 6+ messages in thread
From: Scott Schwartz @ 2003-10-27 16:06 UTC
  To: 9fans

| invalid utf sequences, aren't there several possible
| utf sequences that can validly map to the same character?

I think only the shortest such sequence is supposed to be allowed, so
maybe calling it an error is better than canonicalizing.  On the other
hand, Tcl uses a multibyte encoding of \0 to handle embedded nuls.
That seems like a useful hack.
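
A sketch of the "call it an error" option, in portable C. The function
name validname is hypothetical, and the three-byte limit is an assumption
(runes up to 0xFFFF). It does not check for surrogates. A file server
could apply a test like this to a name at create time and refuse anything
that fails.

```c
#include <stddef.h>

/*
 * Return 1 if s[0..n) is well-formed, shortest-form UTF-8 using
 * sequences of at most three bytes, and 0 otherwise.
 */
int
validname(const unsigned char *s, size_t n)
{
	size_t i;
	unsigned r;

	i = 0;
	while(i < n){
		if(s[i] < 0x80){
			i++;
		}else if((s[i] & 0xE0) == 0xC0){
			if(n - i < 2 || (s[i+1] & 0xC0) != 0x80)
				return 0;
			r = (s[i] & 0x1F) << 6 | (s[i+1] & 0x3F);
			if(r < 0x80)
				return 0;	/* overlong: fits in one byte */
			i += 2;
		}else if((s[i] & 0xF0) == 0xE0){
			if(n - i < 3 || (s[i+1] & 0xC0) != 0x80 || (s[i+2] & 0xC0) != 0x80)
				return 0;
			r = (s[i] & 0x0F) << 12 | (s[i+1] & 0x3F) << 6 | (s[i+2] & 0x3F);
			if(r < 0x800)
				return 0;	/* overlong: fits in two bytes */
			i += 3;
		}else
			return 0;	/* stray continuation byte or unsupported lead byte */
	}
	return 1;
}
```

With this check C2 80 is accepted, while the two-byte encoding of \0
(C0 80) is rejected as overlong, as are the bare bytes 0xff and 0xfd
from the earlier experiment.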




* Re: [9fans] kfs un-removable file
  2003-10-27 15:15   ` C H Forsyth
@ 2003-10-27 16:15     ` ron minnich
  0 siblings, 0 replies; 6+ messages in thread
From: ron minnich @ 2003-10-27 16:15 UTC
  To: 9fans

On Mon, 27 Oct 2003, C H Forsyth wrote:

> >>If you write a program to walk through the directory manually
> >>and call remove on the offending file, you should be fine.
> >>Just avoid all the utf routines (like print!).
>
> be sure to call it `dsw'
>

I think I might build a USB interface to my old PDP 11/45 front panel.
Then, I can have dsw read the switches.

Oh, shucks, no inumbers! But FIDS are still 16 bits, right? no problem!

ron



