* O_EXEC and O_SEARCH @ 2013-02-22 0:45 Rich Felker 2013-02-23 3:05 ` KOSAKI Motohiro 2013-02-23 4:54 ` KOSAKI Motohiro 0 siblings, 2 replies; 12+ messages in thread From: Rich Felker @ 2013-02-22 0:45 UTC (permalink / raw) To: libc-alpha; +Cc: musl Hi, I'd like to have a conversation with the glibc team about O_EXEC and O_SEARCH in the interest of hopefully developing a unified plan for supporting them on Linux. Presumably the reason glibc still does not have them is that Linux O_PATH does not exactly match their semantics in some cases, and O_PATH is sufficiently broken on many kernel versions to make offering it problematic. In particular, current coreutils break badly on most kernel versions around 2.6.39-3.6 or so if O_SEARCH and O_EXEC are defined as O_PATH. Right now, we're offering O_EXEC and O_SEARCH in musl libc, defining them as O_PATH. As long as recent Linux is used, this gives nearly correct semantics, except that combined with O_NOFOLLOW they do not fail when the final component is a symbolic link. I believe it's possible to work around this issue on sufficiently modern kernels where fstat works on O_PATH file descriptors, but adding the workaround whenever O_PATH|O_NOFOLLOW is in the flags would change the semantics when O_PATH is used by the caller rather than O_EXEC or O_SEARCH, since the value is equal. I'm not sure this is desirable. What should the long-term plan for supporting O_SEARCH and O_EXEC on Linux be? Should we assume Linux is aiming for O_PATH to eventually provide compatible semantics, and thus just define O_SEARCH and O_EXEC as O_PATH? Or is there a need to define a different value (perhaps 3, the unused access mode) for O_SEARCH and O_EXEC and have open/fcntl remap it and handle workarounds for Linux semantics that don't match the POSIX semantics? Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: O_EXEC and O_SEARCH 2013-02-22 0:45 O_EXEC and O_SEARCH Rich Felker @ 2013-02-23 3:05 ` KOSAKI Motohiro 2013-02-23 3:17 ` Rich Felker 2013-02-23 4:54 ` KOSAKI Motohiro 1 sibling, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2013-02-23 3:05 UTC (permalink / raw)
To: Rich Felker; +Cc: libc-alpha, musl

> I'd like to have a conversation with the glibc team about O_EXEC and
> O_SEARCH in the interest of hopefully developing a unified plan for
> supporting them on Linux. Presumably the reason glibc still does not
> have them is that Linux O_PATH does not exactly match their semantics
> in some cases, and O_PATH is sufficiently broken on many kernel
> versions to make offering it problematic. In particular, current
> coreutils break badly on most kernel versions around 2.6.39-3.6 or so
> if O_SEARCH and O_EXEC are defined as O_PATH.

I'm curious: why not implement them in the kernel directly?

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: O_EXEC and O_SEARCH 2013-02-23 3:05 ` KOSAKI Motohiro @ 2013-02-23 3:17 ` Rich Felker 2013-02-23 3:58 ` KOSAKI Motohiro 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2013-02-23 3:17 UTC (permalink / raw) To: KOSAKI Motohiro; +Cc: libc-alpha, musl On Fri, Feb 22, 2013 at 10:05:03PM -0500, KOSAKI Motohiro wrote: > > I'd like to have a conversation with the glibc team about O_EXEC and > > O_SEARCH in the interest of hopefully developing a unified plan for > > supporting them on Linux. Presumably the reason glibc still does not > > have them is that Linux O_PATH does not exactly match their semantics > > in some cases, and O_PATH is sufficiently broken on many kernel > > versions to make offering it problematic. In particular, current > > coreutils break badly on most kernel versions around 2.6.39-3.6 or so > > if O_SEARCH and O_EXEC are defined as O_PATH. > > I'm curious why don't you implement them in kernel directly? See this thread for Linus's opinion on why O_SEARCH was not added: http://comments.gmane.org/gmane.linux.file-systems/33611 O_NODE seems to have been renamed to O_PATH, or perhaps O_PATH was a later independent implementation of the same idea; it's not clear to me which happened. But the idea is that the kernel folks did not want to do O_SEARCH and O_EXEC properly in kernelspace but instead wanted to provide a more general flag that could be used to implement both O_SEARCH and O_EXEC. Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: O_EXEC and O_SEARCH 2013-02-23 3:17 ` Rich Felker @ 2013-02-23 3:58 ` KOSAKI Motohiro 2013-02-23 4:33 ` Rich Felker 0 siblings, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2013-02-23 3:58 UTC (permalink / raw)
To: Rich Felker; +Cc: libc-alpha, musl

On Fri, Feb 22, 2013 at 10:17 PM, Rich Felker <dalias@aerifal.cx> wrote:
> On Fri, Feb 22, 2013 at 10:05:03PM -0500, KOSAKI Motohiro wrote:
>> > I'd like to have a conversation with the glibc team about O_EXEC and
>> > O_SEARCH in the interest of hopefully developing a unified plan for
>> > supporting them on Linux. Presumably the reason glibc still does not
>> > have them is that Linux O_PATH does not exactly match their semantics
>> > in some cases, and O_PATH is sufficiently broken on many kernel
>> > versions to make offering it problematic. In particular, current
>> > coreutils break badly on most kernel versions around 2.6.39-3.6 or so
>> > if O_SEARCH and O_EXEC are defined as O_PATH.
>>
>> I'm curious why don't you implement them in kernel directly?
>
> See this thread for Linus's opinion on why O_SEARCH was not added:
>
> http://comments.gmane.org/gmane.linux.file-systems/33611
>
> O_NODE seems to have been renamed to O_PATH, or perhaps O_PATH was a
> later independent implementation of the same idea; it's not clear to
> me which happened. But the idea is that the kernel folks did not want
> to do O_SEARCH and O_EXEC properly in kernelspace but instead wanted
> to provide a more general flag that could be used to implement both
> O_SEARCH and O_EXEC.

Do you mean the following response?

>I suspect that what we _could_ possibly do is to have something like
>O_NODE, and after that - if the semantics (for directories) match what
>O_SEARCH/O_EXEC wants, we could just do
>
>#define O_SEARCH O_NODE
>
>but my point is that we should _not_ start from O_SEARCH and make that the
>"core" part, since its semantics are badly defined (undefined) to begin
>with.

If so, I don't think "start" means refusing at all.
However, I agree that kernel folks dislike hearing "because it's POSIX".
Without a concrete, good use case, any proposal is likely to get a
negative response.

However, if O_SEARCH is really useful, I think an in-kernel implementation
is better, because all other flags are implemented in the kernel and it
may prevent ugly corner cases.

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: O_EXEC and O_SEARCH 2013-02-23 3:58 ` KOSAKI Motohiro @ 2013-02-23 4:33 ` Rich Felker 2013-02-23 5:01 ` KOSAKI Motohiro 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2013-02-23 4:33 UTC (permalink / raw) To: KOSAKI Motohiro; +Cc: libc-alpha, musl On Fri, Feb 22, 2013 at 10:58:18PM -0500, KOSAKI Motohiro wrote: > On Fri, Feb 22, 2013 at 10:17 PM, Rich Felker <dalias@aerifal.cx> wrote: > > On Fri, Feb 22, 2013 at 10:05:03PM -0500, KOSAKI Motohiro wrote: > >> > I'd like to have a conversation with the glibc team about O_EXEC and > >> > O_SEARCH in the interest of hopefully developing a unified plan for > >> > supporting them on Linux. Presumably the reason glibc still does not > >> > have them is that Linux O_PATH does not exactly match their semantics > >> > in some cases, and O_PATH is sufficiently broken on many kernel > >> > versions to make offering it problematic. In particular, current > >> > coreutils break badly on most kernel versions around 2.6.39-3.6 or so > >> > if O_SEARCH and O_EXEC are defined as O_PATH. > >> > >> I'm curious why don't you implement them in kernel directly? > > > > See this thread for Linus's opinion on why O_SEARCH was not added: > > > > http://comments.gmane.org/gmane.linux.file-systems/33611 > > > > O_NODE seems to have been renamed to O_PATH, or perhaps O_PATH was a > > later independent implementation of the same idea; it's not clear to > > me which happened. But the idea is that the kernel folks did not want > > to do O_SEARCH and O_EXEC properly in kernelspace but instead wanted > > to provide a more general flag that could be used to implement both > > O_SEARCH and O_EXEC. > > Do you mean following response? 
> > >I suspect that what we _could_ possibly do is to have something like > >O_NODE, and after that - if the semantics (for directories) match what > >O_SEARCH/O_EXEC wants, we could just do > > > >#define O_SEARCH O_NODE > > > >but my point is that we should _not_ start from O_SEARCH and make that the > >"core" part, since its semantics are badly defined (undefined) to begin > >with. > > I so, I don't think "start" mean refusing at all. However, I agree > kernel folks dislike > to hear "because it's posix". As far as no concrete good use case, any proposal > may be going to be get negative response. Yes, this is what I was referring to. > However, if O_SEARCH is really useful, I think in kernel Regardless of whether it's useful, it's mandatory, and you can't claim conformance if it's omitted. With that said, I think it is useful with the *at functions. Otherwise, the *at functions have reduced functionality in some sense because you can't use them with directories you don't have read access to. > implementation is better > because all other flags are implemented in kernel and it may prevent to create > ugly corner case. I agree, I really wish the kernel had just defined access mode 3 to be O_SEARCH and O_EXEC. However, getting them to do this seems unlikely. Even if they did do it, glibc and musl both intend to support kernels older than 3.9, so we'd have to have some fallback mechanism for avoiding serious breakage in apps when the kernel does not support them right. Note that this breakage is happening ALREADY in coreutils if O_SEARCH and O_EXEC are defined; the only reason coreutils is working on glibc is that glibc does not define them. 
Anyway, at this point (3.8 kernel, maybe some older ones too), I
believe O_PATH is sufficient to implement O_SEARCH and O_EXEC with one
caveat: If O_NOFOLLOW is also specified, fstat must be used after the
open to determine if the file descriptor obtained refers to a symbolic
link, and if so, it should be closed and failure simulated. I'm not
aware of any other cases where O_PATH would give the wrong behavior.

If we want to offer O_SEARCH and O_EXEC but avoid breakage on broken
kernels (2.6.39 through 3.5 or so), then perhaps fstat should always
be checked whenever O_SEARCH or O_EXEC is used. If it fails (which
means the kernel is broken), the file descriptor could be closed and
open reattempted in O_RDONLY mode.

Alternatively, we could define O_SEARCH and O_EXEC to the value 3, and
always do the following remapping in userspace:

1. Try to open with O_RDONLY. If it succeeds, we're done. This is
REALLY nice because it means O_SEARCH and O_EXEC "just work" even on
ancient or broken kernels as long as the target file is readable.

2. Else, add O_PATH and try again. If it still fails, we have a
pre-2.6.39 kernel and there's nothing we can do, so just report
failure.

3. If open succeeds with O_PATH, then if O_NOFOLLOW is also specified,
check fstat, and close the file and report error if fstat succeeded
and the obtained fd was a symbolic link.

4. If fstat failed, we have a buggy kernel, so either close and report
an error, or just ignore the failure (possibly ignoring the
requirements of O_NOFOLLOW), as there seems to be no way to handle it
correctly on such kernels.

If the kernel developers ever add O_SEARCH/O_EXEC at the kernel level
with our proposed value of 3, a step 0, just passing the value to the
kernel directly and seeing if it works, could also be added.

Rich

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: O_EXEC and O_SEARCH 2013-02-23 4:33 ` Rich Felker @ 2013-02-23 5:01 ` KOSAKI Motohiro 2013-02-23 5:05 ` Rich Felker 0 siblings, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2013-02-23 5:01 UTC (permalink / raw)
To: Rich Felker; +Cc: libc-alpha, musl

> 1. Try to open with O_RDONLY. If it succeeds, we're done. This is
> REALLY nice because it means O_SEARCH and O_EXEC "just work" even on
> ancient or broken kernels as long as the target file is readable.

Hmm.
This algorithm seems slightly strange to me. Why do you want to try
O_RDONLY first? O_RDONLY requires read permission and O_SEARCH, if I
understand correctly, doesn't. I think you should try O_PATH first.

>
> 2. Else, add O_PATH and try again. If it still fails, we have a
> pre-2.6.39 kernel and there's nothing we can do, so just report
> failure.
>
> 3. If open succeeds with O_PATH, then if O_NOFOLLOW is also specified,
> check fstat, and close the file and report error if fstat succeeded
> and the obtained fd was a symbolic link.
>
> 4. If fstat failed, we have a buggy kernel, so either close and report
> an error, or just ignore the failure (possibly ignoring the
> requirements of O_NOFOLLOW), as there seems to be no way to handle it
> correctly on such kernels.
>
> If the kernel developers ever add O_SEARCH/O_EXEC at the kernel level
> with our proposed value of 3, a step 0, just passing the value to the
> kernel directly and seeing if it works, could also be added.
>
> Rich

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: O_EXEC and O_SEARCH 2013-02-23 5:01 ` KOSAKI Motohiro @ 2013-02-23 5:05 ` Rich Felker 2013-02-23 5:21 ` KOSAKI Motohiro 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2013-02-23 5:05 UTC (permalink / raw) To: KOSAKI Motohiro; +Cc: libc-alpha, musl On Sat, Feb 23, 2013 at 12:01:39AM -0500, KOSAKI Motohiro wrote: > > 1. Try to open with O_RDONLY. If it succeeds, we're done. This is > > REALLY nice because it means O_SEARCH and O_EXEC "just work" even on > > ancient or broken kernels as long as the target file is readable. > > Hmm.. > This algorithm seems slightly strange to me. Why do you want to try O_RDONLY at > first? > O_RDONLY require read permission and O_SEARCH, if i understand correctly, > doesn't. > I think you should try O_PATH at first. If the file is readable, O_RDONLY will succeed and provides the necessary semantics for O_EXEC and O_SEARCH. The only time O_EXEC or O_SEARCH needs special support is when the file is not readable; these modes were specifically designed for supporting that case. Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: O_EXEC and O_SEARCH 2013-02-23 5:05 ` Rich Felker @ 2013-02-23 5:21 ` KOSAKI Motohiro 0 siblings, 0 replies; 12+ messages in thread From: KOSAKI Motohiro @ 2013-02-23 5:21 UTC (permalink / raw) To: Rich Felker; +Cc: libc-alpha, musl On Sat, Feb 23, 2013 at 12:05 AM, Rich Felker <dalias@aerifal.cx> wrote: > On Sat, Feb 23, 2013 at 12:01:39AM -0500, KOSAKI Motohiro wrote: >> > 1. Try to open with O_RDONLY. If it succeeds, we're done. This is >> > REALLY nice because it means O_SEARCH and O_EXEC "just work" even on >> > ancient or broken kernels as long as the target file is readable. >> >> Hmm.. >> This algorithm seems slightly strange to me. Why do you want to try O_RDONLY at >> first? >> O_RDONLY require read permission and O_SEARCH, if i understand correctly, >> doesn't. >> I think you should try O_PATH at first. > > If the file is readable, O_RDONLY will succeed and provides the > necessary semantics for O_EXEC and O_SEARCH. The only time O_EXEC or > O_SEARCH needs special support is when the file is not readable; these > modes were specifically designed for supporting that case. Ah, ok. got it. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: O_EXEC and O_SEARCH 2013-02-22 0:45 O_EXEC and O_SEARCH Rich Felker 2013-02-23 3:05 ` KOSAKI Motohiro @ 2013-02-23 4:54 ` KOSAKI Motohiro 2013-02-23 5:03 ` Rich Felker 1 sibling, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2013-02-23 4:54 UTC (permalink / raw)
To: Rich Felker; +Cc: libc-alpha, musl

> Right now, we're offering O_EXEC and O_SEARCH in musl libc, defining
> them as O_PATH. As long as recent Linux is used, this gives nearly
> correct semantics, except that combined with O_NOFOLLOW they do not
> fail when the final component is a symbolic link. I believe it's
> possible to work around this issue on sufficiently modern kernels
> where fstat works on O_PATH file descriptors, but adding the
> workaround whenever O_PATH|O_NOFOLLOW is in the flags would change the
> semantics when O_PATH is used by the caller rather than O_EXEC or
> O_SEARCH, since the value is equal. I'm not sure this is desirable.

I have one more question. If I understand correctly, O_NOFOLLOW is
unspecified in POSIX. Why do you think the current behavior is not
correct?

Also, as far as I can see, the current Linux man pages don't document
the behavior of O_PATH|O_NOFOLLOW. Is this really an intentional result?
How did you confirm that? The current behavior does not seem natural to
me, and I suspect it is not intentional.

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: O_EXEC and O_SEARCH 2013-02-23 4:54 ` KOSAKI Motohiro @ 2013-02-23 5:03 ` Rich Felker 2013-02-23 5:20 ` KOSAKI Motohiro 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2013-02-23 5:03 UTC (permalink / raw) To: KOSAKI Motohiro; +Cc: libc-alpha, musl On Fri, Feb 22, 2013 at 11:54:17PM -0500, KOSAKI Motohiro wrote: > > Right now, we're offering O_EXEC and O_SEARCH in musl libc, defining > > them as O_PATH. As long as recent Linux is used, this gives nearly > > correct semantics, except that combined with O_NOFOLLOW they do not > > fail when the final component is a symbolic link. I believe it's > > possible to work around this issue on sufficiently modern kernels > > where fstat works on O_PATH file descriptors, but adding the > > workaround whenever O_PATH|O_NOFOLLOW is in the flags would change the > > semantics when O_PATH is used by the caller rather than O_EXEC or > > O_SEARCH, since the value is equal. I'm not sure this is desirable. > > I have one more question. If I understand correctly, O_NOFOLLOW is > unspecified in > POSIX. Wrong. > Why do you think the current behavior is not correct? O_NOFOLLOW If path names a symbolic link, fail and set errno to [ELOOP]. See http://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html > And, as far as I observed, current linux man pages don't tell us > O_PATH|O_NOFOLLOW > behavior. Is this really intentional result? How do you confirmed? Yes, it seems intentional. O_PATH without O_NOFOLLOW would resolve the symbolic link and open a file descriptor referring to the target inode. O_PATH|O_NOFOLLOW opens a file descriptor to the symbolic link inode itself. As far as I can see, this behavior is desirable and intentional with O_PATH but wrong for O_SEARCH or O_EXEC. Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: O_EXEC and O_SEARCH 2013-02-23 5:03 ` Rich Felker @ 2013-02-23 5:20 ` KOSAKI Motohiro 2013-02-23 5:28 ` KOSAKI Motohiro 0 siblings, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2013-02-23 5:20 UTC (permalink / raw)
To: Rich Felker; +Cc: libc-alpha, musl

On Sat, Feb 23, 2013 at 12:03 AM, Rich Felker <dalias@aerifal.cx> wrote:
> On Fri, Feb 22, 2013 at 11:54:17PM -0500, KOSAKI Motohiro wrote:
>> > Right now, we're offering O_EXEC and O_SEARCH in musl libc, defining
>> > them as O_PATH. As long as recent Linux is used, this gives nearly
>> > correct semantics, except that combined with O_NOFOLLOW they do not
>> > fail when the final component is a symbolic link. I believe it's
>> > possible to work around this issue on sufficiently modern kernels
>> > where fstat works on O_PATH file descriptors, but adding the
>> > workaround whenever O_PATH|O_NOFOLLOW is in the flags would change the
>> > semantics when O_PATH is used by the caller rather than O_EXEC or
>> > O_SEARCH, since the value is equal. I'm not sure this is desirable.
>>
>> I have one more question. If I understand correctly, O_NOFOLLOW is
>> unspecified in
>> POSIX.
>
> Wrong.
>
>> Why do you think the current behavior is not correct?
>
> O_NOFOLLOW
> If path names a symbolic link, fail and set errno to [ELOOP].
>
> See http://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html

OK. This is a mistake in the Linux man pages:

http://man7.org/linux/man-pages/man2/open.2.html

> O_NOFOLLOW
> If pathname is a symbolic link, then the open fails. This is a
> FreeBSD extension, which was added to Linux in version 2.1.126.
> Symbolic links in earlier components of the pathname will still be
> followed.

>> And, as far as I observed, current linux man pages don't tell us
>> O_PATH|O_NOFOLLOW
>> behavior. Is this really intentional result? How do you confirmed?
>
> Yes, it seems intentional.
> O_PATH without O_NOFOLLOW would resolve the
> symbolic link and open a file descriptor referring to the target
> inode. O_PATH|O_NOFOLLOW opens a file descriptor to the symbolic link
> inode itself. As far as I can see, this behavior is desirable and
> intentional with O_PATH but wrong for O_SEARCH or O_EXEC.

Hmm... Why?
It matches neither the Linux man page nor POSIX.

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: O_EXEC and O_SEARCH 2013-02-23 5:20 ` KOSAKI Motohiro @ 2013-02-23 5:28 ` KOSAKI Motohiro 0 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2013-02-23 5:28 UTC (permalink / raw)
To: Rich Felker; +Cc: libc-alpha, musl

>>> And, as far as I observed, current linux man pages don't tell us
>>> O_PATH|O_NOFOLLOW
>>> behavior. Is this really intentional result? How do you confirmed?
>>
>> Yes, it seems intentional. O_PATH without O_NOFOLLOW would resolve the
>> symbolic link and open a file descriptor referring to the target
>> inode. O_PATH|O_NOFOLLOW opens a file descriptor to the symbolic link
>> inode itself. As far as I can see, this behavior is desirable and
>> intentional with O_PATH but wrong for O_SEARCH or O_EXEC.
>
> Hmm... Why?
> It doesn't match linux man nor posix.

So I suggest that we not guess, and instead discuss this on LKML directly.

^ permalink raw reply [flat|nested] 12+ messages in thread