Filename generation: sorting by inode number

zsh-workers

* Filename generation: sorting by inode number
@ 2015-04-25  0:17 Vincent Lefevre
  2015-04-25  1:01 ` Bart Schaefer
  2015-04-25  5:11 ` Mikael Magnusson
  0 siblings, 2 replies; 7+ messages in thread
From: Vincent Lefevre @ 2015-04-25  0:17 UTC (permalink / raw)
  To: zsh-workers

With the "o" glob qualifier for filename generation, it is not possible
to sort by inode number. Such a feature could be very useful to speed up
file reading on an ext3 file system when the files are not in the cache.

See:
  https://lists.debian.org/debian-user/2015/04/msg01310.html
  http://comments.gmane.org/gmane.mail.mutt.devel/4089

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



* Re: Filename generation: sorting by inode number
  2015-04-25  0:17 Filename generation: sorting by inode number Vincent Lefevre
@ 2015-04-25  1:01 ` Bart Schaefer
  2015-04-28 15:06   ` Vincent Lefevre
  2015-04-25  5:11 ` Mikael Magnusson
  1 sibling, 1 reply; 7+ messages in thread
From: Bart Schaefer @ 2015-04-25  1:01 UTC (permalink / raw)
  To: zsh-workers

On Apr 25,  2:17am, Vincent Lefevre wrote:
} Subject: Filename generation: sorting by inode number
}
} With the "o" glob qualifier for filename generation, it is not possible
} to sort by inode number. Such a feature could be very useful to speed up
} file reading on an ext3 file system when the files are not in the cache.

Hrm.  Does the inode ordering also affect stat() times?

This sounds less like something to expose as a sort ordering for the o/O
qualifiers and more like something to hide under the covers, e.g., so
that "oN" (unsorted) actually produces order by inode, instead of the
apparently pseudo-random order of readdir().

How close does "oc" get you to this?


* Re: Filename generation: sorting by inode number
  2015-04-25  0:17 Filename generation: sorting by inode number Vincent Lefevre
  2015-04-25  1:01 ` Bart Schaefer
@ 2015-04-25  5:11 ` Mikael Magnusson
  2015-04-25 18:14   ` Bart Schaefer
  1 sibling, 1 reply; 7+ messages in thread
From: Mikael Magnusson @ 2015-04-25  5:11 UTC (permalink / raw)
  To: zsh workers

On Sat, Apr 25, 2015 at 2:17 AM, Vincent Lefevre <vincent@vinc17.net> wrote:
> With the "o" glob qualifier for filename generation, it is not possible
> to sort by inode number. Such a feature could be very useful to speed up
> file reading on an ext3 file system when the files are not in the cache.
>
> See:
>   https://lists.debian.org/debian-user/2015/04/msg01310.html
>   http://comments.gmane.org/gmane.mail.mutt.devel/4089

Everything is possible with the "o" glob qualifier by virtue of the
"e" specifier:

  zmodload -aF zsh/stat -b:stat b:zstat
  ls -Udi *(noe:'zstat -L -A REPLY +inode $REPLY:')

You can also stick that in a helper function ins and use *(no+ins).

-- 
Mikael Magnusson



* Re: Filename generation: sorting by inode number
  2015-04-25  5:11 ` Mikael Magnusson
@ 2015-04-25 18:14   ` Bart Schaefer
  2015-04-28 15:19     ` Vincent Lefevre
  0 siblings, 1 reply; 7+ messages in thread
From: Bart Schaefer @ 2015-04-25 18:14 UTC (permalink / raw)
  To: zsh workers

On Apr 25,  7:11am, Mikael Magnusson wrote:
}
} Everything is possible with the "o" glob qualifier by virtue of the
} "e" specifier;
} zmodload -aF zsh/stat -b:stat b:zstat
} ls -Udi *(noe:'zstat -L -A REPLY +inode $REPLY:')

That's why I asked about whether the hash-ordering affected efficiency
of stat().  Although the inode number is available this way, we could
also get it from the dirent structure.



* Re: Filename generation: sorting by inode number
  2015-04-25  1:01 ` Bart Schaefer
@ 2015-04-28 15:06   ` Vincent Lefevre
  0 siblings, 0 replies; 7+ messages in thread
From: Vincent Lefevre @ 2015-04-28 15:06 UTC (permalink / raw)
  To: zsh-workers

On 2015-04-24 18:01:09 -0700, Bart Schaefer wrote:
> On Apr 25,  2:17am, Vincent Lefevre wrote:
> } Subject: Filename generation: sorting by inode number
> }
> } With the "o" glob qualifier for filename generation, it is not possible
> } to sort by inode number. Such a feature could be very useful to speed up
> } file reading on an ext3 file system when the files are not in the cache.
> 
> Hrm.  Does the inode ordering also affect stat() times?

I've just done a test, and stat() in directory order is very fast on
ext3. I think the reason is that inode information is grouped in
specific places on the partition, presumably in a compact way, so
that there are few blocks to read.

> This sounds less like something to expose as a sort ordering for the o/O
> qualifiers and more like something to hide under the covers, e.g., so
> that "oN" (unsorted) actually produces order by inode, instead of the
> apparently pseudo-random order of readdir().

I don't know. Perhaps unsorted (= directory order) may have an
advantage in some contexts (for some file systems).

> How close does "oc" get you to this?

It's difficult to say in practice. But in my case, many files are
often created within the same second (because files are copied in
batches). If the ctime resolution is one second, I fear that this
won't solve the problem.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


* Re: Filename generation: sorting by inode number
  2015-04-25 18:14   ` Bart Schaefer
@ 2015-04-28 15:19     ` Vincent Lefevre
  2015-04-28 16:34       ` Bart Schaefer
  0 siblings, 1 reply; 7+ messages in thread
From: Vincent Lefevre @ 2015-04-28 15:19 UTC (permalink / raw)
  To: zsh-workers

On 2015-04-25 11:14:51 -0700, Bart Schaefer wrote:
> On Apr 25,  7:11am, Mikael Magnusson wrote:
> }
> } Everything is possible with the "o" glob qualifier by virtue of the
> } "e" specifier;
> } zmodload -aF zsh/stat -b:stat b:zstat
> } ls -Udi *(noe:'zstat -L -A REPLY +inode $REPLY:')
> 
> That's why I asked about whether the hash-ordering affected efficiency
> of stat().

Indeed:

$ sudo drop-caches && time grep -q zzz 1000*(oN)
grep -q zzz 1000*(oN)  0.03s user 0.32s system 0% cpu 40.726 total

$ sudo drop-caches && time grep -q zzz 1000*(noe:'zstat -L -A REPLY +inode $REPLY:')
grep -q zzz 1000*(noe:'zstat -L -A REPLY +inode $REPLY:')  0.06s user 0.14s system 12% cpu 1.590 total

but "oc" is slower than the inode sort (confirmed by 3 tests each),
even though the files have been created with no inode change, e.g.:

$ sudo drop-caches && time grep -q zzz 1000*(oc)
grep -q zzz 1000*(oc)  0.01s user 0.19s system 7% cpu 2.589 total

The probable cause is the low ctime resolution (1 second?).

> Although the inode number is available this way, we could
> also get it from the dirent structure.

Yes, this would be more efficient, but perhaps not noticeable.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



* Re: Filename generation: sorting by inode number
  2015-04-28 15:19     ` Vincent Lefevre
@ 2015-04-28 16:34       ` Bart Schaefer
  0 siblings, 0 replies; 7+ messages in thread
From: Bart Schaefer @ 2015-04-28 16:34 UTC (permalink / raw)
  To: zsh-workers

[Sorry for the long context / short reply below]

On Apr 28,  5:19pm, Vincent Lefevre wrote:
}
} $ sudo drop-caches && time grep -q zzz 1000*(oN)
} grep -q zzz 1000*(oN)  0.03s user 0.32s system 0% cpu 40.726 total
} 
} $ sudo drop-caches && time grep -q zzz 1000*(noe:'zstat -L -A REPLY +inode $REPLY:')
} grep -q zzz 1000*(noe:'zstat -L -A REPLY +inode $REPLY:')  0.06s user 0.14s system 12% cpu 1.590 total
} 
} but "oc" is slower (confirmed by 3 tests each), even though the files
} have been created with no inode change, e.g.:
} 
} $ sudo drop-caches && time grep -q zzz 1000*(oc)
} grep -q zzz 1000*(oc)  0.01s user 0.19s system 7% cpu 2.589 total
} 
} The probable cause is the low ctime resolution (1 second?).

Looks like (oc) is significantly faster than (oN), though, even though it
is slower than (o+byinode).  So it may be a reasonable compromise.



