On 18 February 2013 19:23, andrey mirtchovski <mirtchovski@gmail.com> wrote:
> sorry, not all source files, just the 'import' section.

In nemo's example, it seemed to have wandered off into go/src/cmd, reading 4k from everything there each time, which began to add up,
if I've read the iostats correctly.

As to caching, the kernel does cache (/sys/src/9/port/cache.c), but not as much as needed for that example.
Streaming probably isn't interesting if you're caching anyway, unless the files are huge.
The directory structure and metadata also matter. With floren's simpler example, I got an elapsed time of 0.30s when I tried it:

subito% time go clean github.com/floren/ellipsoid
0.03u 0.01s 0.30r go clean github.com/floren/ellipsoid

With a caching file system (fscfs) in the way, the first run takes longer than that:

subito% time go clean github.com/floren/ellipsoid
0.02u 0.04s 0.44r go clean github.com/floren/ellipsoid

subito% cat /n/cached/cfsctl
        Client                          Server
   #calls     Δ  ms/call    Δ      #calls     Δ  ms/call    Δ
      1       1   0.670   0.670       1       1   0.660   0.660 Tversion
      1       1   0.365   0.365       1       1   0.343   0.343 Tauth
      1       1   0.495   0.495       1       1   0.484   0.484 Tattach
    433     433   0.188   0.188     352     352   0.219   0.219 Twalk
    333     333   0.223   0.223     332     332   0.115   0.115 Topen
    359     359   0.202   0.202     359     359   0.189   0.189 Tread
      3       3   0.242   0.242       3       3   0.236   0.236 Twrite
    427     427   0.009   0.009       0       0                 Tclunk
     93      93   0.122   0.122      93      93   0.116   0.116 Tstat
     36      36 ndirread
    323     323 ndelegateread
      0       0 ninsert
      0       0 ndelete
      0       0 nupdate
 833411  833411 bytesread
    113     113 byteswritten
 777224  777224 bytesfromserver
  56187   56187 bytesfromdirs
      0       0 bytesfromcache
 777050  777050 bytestocache

The cache doesn't save any reading, because there is no redundancy, but it cuts
the number of walk requests (because it's caching directory structure and there is
overlap there).
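
As a rough check on that, here is a small Go sketch that works out how many client requests never reached the server, using the Twalk and Tread counts from the table above. The function name savedPercent is my own, not anything in fscfs.

```go
package main

import "fmt"

// savedPercent reports the share of client requests that the cache
// answered itself, i.e. that never reached the server.
// (Hypothetical helper, named here for illustration only.)
func savedPercent(client, server int) float64 {
	if client == 0 {
		return 0
	}
	return 100 * float64(client-server) / float64(client)
}

func main() {
	// Counts copied from the first-run cfsctl listing above.
	fmt.Printf("Twalk: %.1f%% answered by the cache\n", savedPercent(433, 352))
	fmt.Printf("Tread: %.1f%% answered by the cache\n", savedPercent(359, 359))
}
```

That gives about 18.7% for walks and 0.0% for reads on the first run, which matches the description: some overlap in directory structure, no redundancy in the data.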

The next time:

subito% time go clean github.com/floren/ellipsoid
0.01u 0.01s 0.25r go clean github.com/floren/ellipsoid

It has cut the calls through to the server dramatically, at the risk
of getting out-of-date information, which could be a nuisance on a shared project.
(Linux vfs doesn't have to worry about that because it's all local, and dangerous anyway.)
The only opens and reads that do go through are for directory contents,
because fscfs doesn't cache them (I didn't need that on Blue Gene).

subito% cat /n/cached/cfsctl
        Client                          Server
   #calls     Δ  ms/call    Δ      #calls     Δ  ms/call    Δ
      1       0   0.670               1       0   0.660         Tversion
      1       0   0.365               1       0   0.343         Tauth
      1       0   0.495               1       0   0.484         Tattach
    864     431   0.098   0.007     352       0   0.219         Twalk
    666     333   0.119   0.016     349      17   0.115   0.112 Topen
    717     358   0.153   0.105     603     244   0.172   0.147 Tread
      3       0   0.242               3       0   0.236         Twrite
    854     427   0.008   0.008       0       0                 Tclunk
    186      93   0.122   0.122     186      93   0.116   0.116 Tstat
     72      36 ndirread
    531     208 ndelegateread
      0       0 ninsert
      0       0 ndelete
      0       0 nupdate
1666648  833237 bytesread
    113       0 byteswritten
 777224       0 bytesfromserver
 112374   56187 bytesfromdirs
 777050  777050 bytesfromcache
 777050       0 bytestocache
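
To put numbers on the second run, this Go sketch splits the Δ bytesread figure from the listing above into the part served from the cache and the part that came from directory reads. The helper name share is mine; the figures are just the Δ column values.

```go
package main

import "fmt"

// share returns part as a percentage of total.
// (Hypothetical helper, named here for illustration only.)
func share(part, total int64) float64 {
	if total == 0 {
		return 0
	}
	return 100 * float64(part) / float64(total)
}

func main() {
	// Δ values copied from the second cfsctl listing above.
	var (
		bytesRead int64 = 833237 // bytesread
		fromCache int64 = 777050 // bytesfromcache
		fromDirs  int64 = 56187  // bytesfromdirs
	)

	fmt.Printf("from cache: %.1f%%\n", share(fromCache, bytesRead))
	fmt.Printf("from dirs:  %.1f%%\n", share(fromDirs, bytesRead))
	// The two sources account for every byte read on this run.
	fmt.Println("adds up:", fromCache+fromDirs == bytesRead)
}
```

It comes out at roughly 93.3% from the cache and 6.7% from directory reads, with nothing left over for file data fetched from the server (Δ bytesfromserver is 0).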

This is with IL/IP, which is ever-so-slightly faster than TCP/IP (I tried them both) because there's less overhead,
especially on the reads that get through.

I don't think streaming will help you, really. The main saving comes from not fetching the data at all,
and that only applies if you're working with the same contents a lot.
You would also need to cache directory contents and other metadata, with all that entails.
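
To illustrate part of what "all that entails" means, here is a minimal Go sketch of a content cache that only answers when the caller's idea of the file's version still matches. This is not how fscfs works; the types and the vers field (a stand-in for something like a 9P qid version) are assumptions made for the example. Note that finding out the current version normally costs a round trip of its own, which is the trade-off against serving possibly stale data.

```go
package main

import "fmt"

// entry is one cached file: its contents plus the version they were read at.
type entry struct {
	vers uint32 // hypothetical stand-in for a 9P qid version
	data []byte
}

// cache maps a path to its cached entry. (Illustrative type, not from fscfs.)
type cache map[string]entry

// get returns the cached data only if it was stored at the same version
// the caller currently sees; otherwise it reports a miss.
func (c cache) get(path string, vers uint32) ([]byte, bool) {
	e, ok := c[path]
	if !ok || e.vers != vers {
		return nil, false
	}
	return e.data, true
}

// put records data for path at the given version, replacing any older entry.
func (c cache) put(path string, vers uint32, data []byte) {
	c[path] = entry{vers: vers, data: data}
}

func main() {
	c := cache{}
	c.put("src/cmd/go/main.go", 1, []byte("package main"))

	_, hit := c.get("src/cmd/go/main.go", 1)
	fmt.Println("same version:", hit)

	// Someone else changed the file: the version moved on, so the
	// cached copy must not be used.
	_, hit = c.get("src/cmd/go/main.go", 2)
	fmt.Println("file changed:", hit)
}
```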

The iostats output wasn't particularly strange for that one, so I wonder if there's something about nemo's example
that gets {go clean ...} wandering through all the src/cmd stuff so often.