9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] long filenames in cwfs
@ 2013-06-23  1:48 arisawa
  2013-06-23  7:45 ` cinap_lenrek
  0 siblings, 1 reply; 11+ messages in thread
From: arisawa @ 2013-06-23  1:48 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Hello,

when importing files from the outside world (mainly OSX or Windows), we are sometimes obliged to handle
long filenames.
fossil handles them without problems.
cwfs has a name length limit that is too small for these long names.
any ideas?

Kenji Arisawa





* Re: [9fans] long filenames in cwfs
  2013-06-23  1:48 [9fans] long filenames in cwfs arisawa
@ 2013-06-23  7:45 ` cinap_lenrek
  2013-06-23 13:36   ` arisawa
  0 siblings, 1 reply; 11+ messages in thread
From: cinap_lenrek @ 2013-06-23  7:45 UTC (permalink / raw)
  To: 9fans

the file name length can be configured at compile time.
look for the NAMELEN enum.

for example, the cwfs64x in 9front uses 144 instead of 56.

you mentioned windows. there's a -o trspaces option in cifsd
that will convert space characters to non-breaking spaces
and back for you.
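
as an illustration only, the translation described above amounts to a reversible character swap. this is a sketch of the idea, not cifsd's code; both function names are made up, and which side sees which form is not stated in this message.

```python
# Illustration only: not cifsd's implementation. Function names are hypothetical.
NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE


def hide_spaces(name: str) -> str:
    """Replace every ordinary space with a non-breaking space."""
    return name.replace(" ", NBSP)


def restore_spaces(name: str) -> str:
    """Undo hide_spaces by turning non-breaking spaces back into spaces."""
    return name.replace(NBSP, " ")


if __name__ == "__main__":
    stored = hide_spaces("My Documents")
    print(repr(stored), repr(restore_spaces(stored)))
```

the round trip is exact only for names that did not already contain a non-breaking space.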

hope this is useful...

--
cinap




* Re: [9fans] long filenames in cwfs
  2013-06-23  7:45 ` cinap_lenrek
@ 2013-06-23 13:36   ` arisawa
  2013-06-23 14:08     ` erik quanstrom
  0 siblings, 1 reply; 11+ messages in thread
From: arisawa @ 2013-06-23 13:36 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Thank you cinap,

I tried to copy all my Dropbox data to cwfs.
only 3 of 40000 files exceeded the 144B name limit.
I would be happy if cwfs64x natively accepted longer names, but the requirement
is almost endless. for example, OSX supports 1024B names.
I wonder if making NAMELEN larger is the only way to handle the problem.

Kenji Arisawa






* Re: [9fans] long filenames in cwfs
  2013-06-23 13:36   ` arisawa
@ 2013-06-23 14:08     ` erik quanstrom
  2013-06-23 21:59       ` Skip Tavakkolian
  2013-06-24 10:57       ` arisawa
  0 siblings, 2 replies; 11+ messages in thread
From: erik quanstrom @ 2013-06-23 14:08 UTC (permalink / raw)
  To: 9fans

On Sun Jun 23 09:38:01 EDT 2013, arisawa@ar.aichi-u.ac.jp wrote:
> Thank you cinap,
>
> I tried to copy all my Dropbox data to cwfs.
> only 3 of 40000 files exceeded the 144B name limit.
> I would be happy if cwfs64x natively accepted longer names, but the requirement
> is almost endless. for example, OSX supports 1024B names.
> I wonder if making NAMELEN larger is the only way to handle the problem.

without a different structure, it is the only way to handle the problem.

a few things to keep in mind about file names.  when file names
appear in 9p messages, they can't be split between messages.  this applies
to walk, create, stat or read (of parent directory).  i think this places
the restriction that maxnamelen <= IOUNIT - 43 bytes.  the distribution
limits IOUNIT through the mnt driver to 8192+24.  (9atom uses
6*8k+24.)
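
to put numbers on that bound, here is a small sketch. it only does the arithmetic from the paragraph above; the 43-byte overhead is the estimate given there ("i think"), not a figure checked against the 9p manual, and the function name is made up.

```python
# Arithmetic only. OVERHEAD is the estimate from the message above,
# not a value verified against the 9P specification.
OVERHEAD = 43


def max_name_len(iounit: int) -> int:
    """Longest name that still fits in one message of the given size."""
    return iounit - OVERHEAD


if __name__ == "__main__":
    for label, iounit in [
        ("8192", 8192),
        ("8192+24 (distribution)", 8192 + 24),
        ("6*8k+24 (9atom)", 6 * 8192 + 24),
    ]:
        print(label, max_name_len(iounit))
```

either way the protocol-level ceiling is in the kilobytes, far above the 144-byte on-disk limit under discussion.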

there are two basic ways to change the format to deal with this:
1.  provide an escape to point to auxiliary storage.  this is kind to
existing storage.
2.  make the name (and thus the directory entry) variable length.

on our fs (which has python and some other nasties), the average
file name length is 11.  making the entries variable length could save 25%
(62 directory entries per buffer).  but it might be annoying to have
to migrate the whole fs.

so since there are so few long names, why not waste a whole block
on them?  if using the "standard" (ish) 8k raw block size (8180 for
data), the expansion of the header could be nil (through creative
encoding) and there would be 3 extra blocks taken for indirect names.
for your case, the cost for 144-byte file names would be that DIRPERBUF
goes from 47 to 31.  so most directories > 31 entries will take
1.5 x (in the big-O sense) their original space even if there are
no long names.
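
the entry counts quoted above can be reproduced with a rough calculation. this is a sketch under assumptions: the real directory entry size is not given in this thread, so the code takes the largest entry size that still fits 47 per 8180-byte block and treats everything except the name as fixed. the one-byte terminator for variable-length names is also an assumption.

```python
BLOCK_DATA = 8180    # usable bytes per 8k block (from the message above)
DIRPERBUF_56 = 47    # entries per block with 56-byte names (from the message)

# Assumption: largest entry size that still gives 47 entries per block.
ENTRY_56 = BLOCK_DATA // DIRPERBUF_56    # 174 bytes
FIXED = ENTRY_56 - 56                    # bytes that are not the name


def entries_per_block(name_bytes: int) -> int:
    """Directory entries per block if each name takes name_bytes bytes."""
    return BLOCK_DATA // (FIXED + name_bytes)


fixed_144 = entries_per_block(144)       # fixed 144-byte name field
variable = entries_per_block(11 + 1)     # average 11-byte name + terminator
growth = DIRPERBUF_56 / fixed_144        # directory growth with 144-byte names
saving = 1 - DIRPERBUF_56 / variable     # space saved by variable-length names

if __name__ == "__main__":
    print(fixed_144, variable, round(growth, 2), round(saving, 2))
```

under these assumptions the results match the 31 and 62 entries per buffer mentioned above, with growth close to 1.5x and a saving a little under 25%.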

- erik




* Re: [9fans] long filenames in cwfs
  2013-06-23 14:08     ` erik quanstrom
@ 2013-06-23 21:59       ` Skip Tavakkolian
  2013-06-24 13:03         ` erik quanstrom
  2013-06-24 10:57       ` arisawa
  1 sibling, 1 reply; 11+ messages in thread
From: Skip Tavakkolian @ 2013-06-23 21:59 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs


with 8K names, using base64, one could encode 6111 bytes of data in the
name. i just did a quick inventory[*] of my $home; 74% of my files have
less than 6112 bytes of data.

[*]
% fn pctfileslessthan6k () {
x=`{du -na $home|wc -l}
y=`{du -na $home|awk '$1 < 6112 {print $0}'|wc -l}
pct=`{echo '2k ' $y $x ' / 100 * p' | dc}
echo $pct
}
% pctfileslessthan6k
74.00
%




* Re: [9fans] long filenames in cwfs
  2013-06-23 14:08     ` erik quanstrom
  2013-06-23 21:59       ` Skip Tavakkolian
@ 2013-06-24 10:57       ` arisawa
  2013-06-24 11:21         ` hiro
  1 sibling, 1 reply; 11+ messages in thread
From: arisawa @ 2013-06-24 10:57 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

that might be a better idea than the one I have been considering.






* Re: [9fans] long filenames in cwfs
  2013-06-24 10:57       ` arisawa
@ 2013-06-24 11:21         ` hiro
  2013-06-24 12:43           ` arisawa
  0 siblings, 1 reply; 11+ messages in thread
From: hiro @ 2013-06-24 11:21 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

why do you have such long file names?
does dropbox put a preview of the file into the filename or what?




* Re: [9fans] long filenames in cwfs
  2013-06-24 11:21         ` hiro
@ 2013-06-24 12:43           ` arisawa
  0 siblings, 0 replies; 11+ messages in thread
From: arisawa @ 2013-06-24 12:43 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Hello,

On 2013/06/24, at 20:21, hiro <23hiro@gmail.com> wrote:

> why do you have such long file names?
> does dropbox put a preview of the file into the filename or what?
> 


the names that exceeded the limit are all < 200B long.

(a) an official document that was attached to a mail

【報告書様式】4-1.教育内容・方法・成果(①教育目標、学位授与方針、教育課程の編成・実施方針)(学部).doc

(b) a catalog in PDF format that was on the net

CsI(Tl)固体シンチレータ 5.5x5.5x5.5mm_ センサ一般 秋月電子通商 電子部品 ネット通販.pdf

(c) a web page saved by google chrome (the file name is the title of the page)

世界初、MOSイメージセンサのダイナミックレンジ拡大技術を開発   プレスリリース   ニュース   パナソニック企業情報   Panasonic.html

they are all japanese names, sorry!
each file name contains a brief abstract of the contents.





* Re: [9fans] long filenames in cwfs
  2013-06-23 21:59       ` Skip Tavakkolian
@ 2013-06-24 13:03         ` erik quanstrom
  0 siblings, 0 replies; 11+ messages in thread
From: erik quanstrom @ 2013-06-24 13:03 UTC (permalink / raw)
  To: 9fans

On Sun Jun 23 18:01:11 EDT 2013, skip.tavakkolian@gmail.com wrote:

> with 8K names, using base64, one could encode 6111 bytes of data in the
> name. i just did a quick inventory[*] of my $home; 74% of my files have
> less than 6112 bytes of data.

i was proposing a pointer, which could be nil.
i would put the pointer where the file name should
be—perhaps by setting the first 8 bytes to non-unicode
magic and the next 8 bytes to a pointer to the
direct name block.
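
a minimal sketch of that escape, to make the layout concrete. everything specific here is an assumption for illustration: the magic value (eight 0xFF bytes, which never occur in valid UTF-8), the little-endian 8-byte block address, the 144-byte field and the function names. it is not cwfs code.

```python
import struct
from typing import Callable, Optional

NAMELEN = 144        # size of the fixed name field (assumed)
MAGIC = b"\xff" * 8  # hypothetical marker; 0xFF is never valid UTF-8


def encode_name_field(name: str, name_block: Optional[int] = None) -> bytes:
    """Build the on-disk name field.

    Short names are stored inline, NUL padded. Long names are replaced by
    MAGIC followed by the address of the block that holds the real name.
    """
    raw = name.encode("utf-8")
    if len(raw) < NAMELEN:                       # leave room for a NUL
        return raw.ljust(NAMELEN, b"\0")
    if name_block is None:
        raise ValueError("long name needs a block address")
    return (MAGIC + struct.pack("<Q", name_block)).ljust(NAMELEN, b"\0")


def decode_name_field(field: bytes, read_block: Callable[[int], bytes]) -> str:
    """Return the name, following the pointer if the field is an escape."""
    if field[:8] == MAGIC:
        (addr,) = struct.unpack("<Q", field[8:16])
        field = read_block(addr)
    return field.split(b"\0", 1)[0].decode("utf-8")


if __name__ == "__main__":
    blocks = {7: ("x" * 200).encode("utf-8") + b"\0"}  # stand-in for the disk
    field = encode_name_field("x" * 200, 7)
    print(len(field), len(decode_name_field(field, blocks.__getitem__)))
```

existing entries never start with the magic, so they decode unchanged, which is the sense in which this approach is kind to existing storage.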

so why not change nothing?  the first 8180 bytes go
in the first direct block anyway.

btw, i have far fewer large files than you, but still 25%
of the ones bigger than 6112 bytes are smaller than
8180.

; du -a .>[2=]|awk '$1>8180 {s++}END{print s/NR}'
0.00325015
; du -a .>[2=]|awk '$1>6112 {s++}END{print s/NR}'
0.00416286

- erik




* Re: [9fans] long filenames in cwfs
  2013-06-23  2:11 Erik Quanstrom
@ 2013-06-23  6:19 ` arisawa
  0 siblings, 0 replies; 11+ messages in thread
From: arisawa @ 2013-06-23  6:19 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

thank you erik.

I forgot the existence of lnfs.
I will try your lnfs.






* Re: [9fans] long filenames in cwfs
@ 2013-06-23  2:11 Erik Quanstrom
  2013-06-23  6:19 ` arisawa
  0 siblings, 1 reply; 11+ messages in thread
From: Erik Quanstrom @ 2013-06-23  2:11 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Lnfs works fine.  Since file systems vary in maximum name length, I modified it to take a max length parameter.  See the atom man pages.
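
For readers who have not used it, the general idea behind this kind of layer is to store a long name under a short, fixed-length stand-in derived from a hash, and to keep a table that maps the stand-in back to the real name. The sketch below illustrates that idea only. The choice of MD5 with base32, the 27-byte default and the function names are assumptions, not details taken from this message or checked against the lnfs source.

```python
import base64
import hashlib
from typing import Dict


def short_name(long_name: str, max_len: int = 27) -> str:
    """Return the name to use on the underlying file system.

    Names that fit within max_len bytes pass through unchanged; longer
    ones are replaced by the unpadded base32 form of their MD5 digest.
    """
    raw = long_name.encode("utf-8")
    if len(raw) <= max_len:
        return long_name
    digest = hashlib.md5(raw).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=")


def remember(table: Dict[str, str], long_name: str, max_len: int = 27) -> str:
    """Record the mapping so the long name can be shown again later."""
    short = short_name(long_name, max_len)
    table[short] = long_name
    return short


if __name__ == "__main__":
    table: Dict[str, str] = {}
    stored = remember(table, "a very long file name " * 10 + ".html")
    print(stored, len(stored), table[stored] != stored)
```

A 16-byte digest always encodes to 26 base32 characters, so the stand-in fits even a very small name limit.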

- erik



