From: Skip Tavakkolian
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Date: Sun, 23 Jun 2013 14:59:52 -0700
Subject: Re: [9fans] long filenames in cwfs

with 8K names, using base64, one could encode 6111 bytes of data in the name. i just did a quick inventory[*] of my $home; 74% of my files have less than 6112 bytes of data.

[*]
% fn pctfileslessthan6k () {
x=`{du -na $home|wc -l}
y=`{du -na $home|awk '$1 < 6112 {print $0}'|wc -l}
pct=`{echo '2k ' $y $x ' / 100 * p' | dc}
echo $pct
}
% pctfileslessthan6k
74.00
%
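
a note on the 6111 figure: base64 packs 3 bytes into every 4 name
characters, and presumably the 43 bytes of 9p overhead erik mentions
below come off the 8192 first; dc agrees:

% echo '8192 43 - 4 / 3 * p' | dc
6111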

On Sun, Jun 23, 2013 at 7:08 AM, erik quanstrom <quanstro@quanstro.net> wrote:
On Sun Jun 23 09:38:01 EDT 2013, arisawa@ar.aichi-u.ac.jp wrote:
> Thank you cinap,
>
> I tried to copy all my Dropbox data to cwfs.
> the number of files that exceeded the 144B name limit was only 3 in 40000 files.
> I will be happy if cwfs64x natively accepts longer names, but the requirement
> is almost endless. for example, OS X supports 1024B names.
> I wonder if making NAMELEN larger is the only way to handle the problem.

without a different structure, it is the only way to handle the problem.

a few things to keep in mind about file names.  file names when they
appear in 9p messages can't be split between messages.  this applies
to walk, create, stat or read (of the parent directory).  i think this places
the restriction that maxnamelen <= IOUNIT - 43 bytes.  the distribution
limits IOUNIT through the mnt driver to 8192+24.  (9atom uses 6*8k+24.)
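
taking that formula at face value, the name-length bound works out to

% echo '8192 24 + 43 - p' | dc
8173
% echo '6 8192 * 24 + 43 - p' | dc
49133

for the stock mnt limit and the 9atom one respectively.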

there are two basic ways to change the format to deal with this:
1.  provide an escape to point to auxiliary storage.  this is kind to
existing storage.
2.  make the name (and thus the directory entry) variable length.

on our fs (which has python and some other nasties), the average
file name length is 11.  making the blocks variable length could save 25%
(62 directory entries per buffer).  but it might be annoying to have
to migrate the whole fs.
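
a rough check on that 25%, borrowing the current 47 entries per buffer
from the next paragraph and assuming directory space scales inversely
with entries per buffer:

% echo '2k 1 47 62 / - p' | dc
.25

i.e. about a quarter of the directory blocks saved.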

so since there are so few long names, why not waste a whole block
on them?  if using the "standard" (ish) 8k raw block size (8180 for
data), the expansion of the header could be nil (through creative
encoding) and there would be 3 extra blocks taken for indirect names.
for your case, the cost for 144-byte file names would be that DIRPERBUF
goes from 47 to 31.  so most directories with > 31 entries will take
1.5x (in the big-O sense) their original space even if there are
no long names.
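
the 1.5x follows directly from those DIRPERBUF numbers, and the implied
entry sizes drop out of the 8180 data bytes quoted above:

% echo '2k 47 31 / p' | dc
1.51
% echo '8180 47 / p' | dc
174
% echo '8180 31 / p' | dc
263

so roughly 263 bytes per entry with 144-byte names against roughly 174
now, and a full directory needs about one and a half times the blocks.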

- erik

