From mboxrd@z Thu Jan  1 00:00:00 1970
From: Rich Felker
Newsgroups: gmane.comp.lib.glibc.alpha,gmane.linux.kernel.api,gmane.linux.lib.musl.general
Subject: Re: [RFC] Possible new execveat(2) Linux syscall
Date: Sun, 16 Nov 2014 14:52:46 -0500
Message-ID: <20141116195246.GX22465@brightrain.aerifal.cx>
To: David Drysdale
Cc: libc-alpha@sourceware.org, Andrew Morton, Christoph Hellwig, Linux API, Andy Lutomirski, musl@lists.openwall.com

On Fri, Nov 14, 2014 at 02:54:19PM +0000, David Drysdale wrote:
> Hi,
>
> Over at the LKML[1] we've been discussing a possible new syscall, execveat(2),
> and it would be good to hear a glibc perspective about it (and whether there
> are any interface changes that would make it easier to use from userspace).
>
> The syscall prototype is:
>   int execveat(int fd, const char *pathname,
>                char *const argv[], char *const envp[],
>                int flags); /* AT_EMPTY_PATH, AT_SYMLINK_NOFOLLOW */
> and it works similarly to execve(2) except:
>  - the executable to run is identified by the combination of fd+pathname, like
>    other *at(2) syscalls
>  - there's an extra flags field to control behaviour.
> (I've attached a text version of the suggested man page below)
>
> One particular benefit of this is that it allows an fexecve(3) implementation
> that doesn't rely on /proc being accessible, which is useful for sandboxed
> applications.
> (However, that does only work for non-interpreted programs:
> the name passed to a script interpreter is of the form "/dev/fd//"
> or "/dev/fd/", so the executed interpreter will normally still need /proc
> access to load the script file).
>
> How does this sound from a glibc perspective?

I've been following the discussions so far and everything looks mostly
okay. There are still issues to be resolved with the different
semantics between Linux O_PATH and what POSIX requires for O_EXEC (and
O_SEARCH) but as long as the intent is that, once O_EXEC is defined to
save the permissions at the time of open and cause them to be used in
place of the current file permissions at the time of execveat

One major issue, however, is FD_CLOEXEC with scripts. Last I checked,
this didn't work because the file is already closed by the time the
interpreter runs. The intended usage of fexecve is almost certainly to
call it with the file descriptor set close-on-exec; otherwise, there
would be no clean way to close it, since the program being executed
doesn't know that it's being executed via fexecve. So this is a
serious problem that needs to be solved if it hasn't already. I have
some ideas I could offer, but I'm not an expert on the kernel side of
things, so I'm not sure they'd be correct.

Rich

> Thanks,
> David
>
> [1] https://lkml.org/lkml/2014/11/7/512, with earlier discussions at
>     https://lkml.org/lkml/2014/11/6/469, https://lkml.org/lkml/2014/10/22/275
>     and https://lkml.org/lkml/2014/10/17/428
>
> ----
>
> EXECVEAT(2)              Linux Programmer's Manual              EXECVEAT(2)
>
> NAME
>        execveat - execute program relative to a directory file descriptor
>
> SYNOPSIS
>        #include
>
>        int execveat(int fd, const char *pathname,
>                     char *const argv[], char *const envp[],
>                     int flags);
>
> DESCRIPTION
>        The execveat() system call executes the program pointed to by the
>        combination of fd and pathname.
>        The execveat() system call operates in exactly the same way as
>        execve(2), except for the differences described in this manual
>        page.
>
>        If the pathname given in pathname is relative, then it is
>        interpreted relative to the directory referred to by the file
>        descriptor fd (rather than relative to the current working
>        directory of the calling process, as is done by execve(2) for a
>        relative pathname).
>
>        If pathname is relative and fd is the special value AT_FDCWD, then
>        pathname is interpreted relative to the current working directory
>        of the calling process (like execve(2)).
>
>        If pathname is absolute, then fd is ignored.
>
>        If pathname is an empty string and the AT_EMPTY_PATH flag is
>        specified, then the file descriptor fd specifies the file to be
>        executed.
>
>        flags can either be 0, or include the following flags:
>
>        AT_EMPTY_PATH
>               If pathname is an empty string, operate on the file referred
>               to by fd (which may have been obtained using the open(2)
>               O_PATH flag).
>
>        AT_SYMLINK_NOFOLLOW
>               If the file identified by fd and a non-NULL pathname is a
>               symbolic link, then the call fails with the error EINVAL.
>
> RETURN VALUE
>        On success, execveat() does not return. On error -1 is returned,
>        and errno is set appropriately.
>
> ERRORS
>        The same errors that occur for execve(2) can also occur for
>        execveat(). The following additional errors can occur for
>        execveat():
>
>        EBADF  fd is not a valid file descriptor.
>
>        ENOENT The program identified by fd and pathname requires the use
>               of an interpreter program (such as a script starting with
>               "#!") but the file descriptor fd was opened with the
>               O_CLOEXEC flag and so the program file is inaccessible to
>               the launched interpreter.
>
>        EINVAL Invalid flag specified in flags.
>
>        ENOTDIR
>               pathname is relative and fd is a file descriptor referring
>               to a file other than a directory.
>
> VERSIONS
>        execveat() was added to Linux in kernel 3.???.
>
> NOTES
>        In addition to the reasons explained in openat(2), the execveat()
>        system call is also needed to allow fexecve(3) to be implemented on
>        systems that do not have the /proc filesystem mounted.
>
> SEE ALSO
>        execve(2), fexecve(3)
>
> Linux                            2014-04-02                      EXECVEAT(2)
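
For concreteness, here is a minimal sketch of the fexecve(3)-over-execveat
idea from the proposal, together with the close-on-exec usage discussed in
the reply. It is an illustration under assumptions, not how any libc
implements it: it assumes a kernel that provides the syscall and headers
that define SYS_execveat, and the helper names fexecve_sketch and
run_via_fd are made up for this example.

```c
#define _GNU_SOURCE             /* for AT_EMPTY_PATH in <fcntl.h> */
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not a libc API): execute the file open on fd
 * without constructing a /proc/self/fd/N path. Returns only on error. */
static int fexecve_sketch(int fd, char *const argv[], char *const envp[])
{
#ifdef SYS_execveat
    /* Empty pathname plus AT_EMPTY_PATH means "execute fd itself". */
    return (int)syscall(SYS_execveat, fd, "", argv, envp, AT_EMPTY_PATH);
#else
    /* Headers too old to know the syscall number. */
    errno = ENOSYS;
    return -1;
#endif
}

/* Hypothetical demo: run `path` in a child process through a file
 * descriptor and return the child's exit status, or -1 on failure. */
static int run_via_fd(const char *path, char *const argv[],
                      char *const envp[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Close-on-exec keeps the fd from leaking into the new image.
         * That is fine for a binary; for a "#!" script, the draft man
         * page quoted above documents ENOENT, because the interpreter
         * could no longer open the /dev/fd name it is handed. */
        int fd = open(path, O_RDONLY | O_CLOEXEC);
        if (fd >= 0)
            fexecve_sketch(fd, argv, envp);
        _exit(127);             /* open or exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling run_via_fd("/bin/sh", argv, envp) with argv set to
{ "sh", "-c", "exit 7", NULL } and an empty envp should return the shell's
exit status, assuming /bin/sh is a binary rather than a script. Pointing
path at a script instead is a quick way to observe the FD_CLOEXEC problem
raised above.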