Date: Thu, 18 Jan 2024 10:54:44 -0500
From: Rich Felker
To: Leah Neukirchen
Cc: musl@lists.openwall.com, enh
Reply-To: musl@lists.openwall.com
Message-ID: <20240118155443.GW4163@brightrain.aerifal.cx>
References: <20211020181836.GP7074@brightrain.aerifal.cx> <27LHJ7Y2LTSTA.218670SJVIHRW@hera.home.vuxu.org>
In-Reply-To: <27LHJ7Y2LTSTA.218670SJVIHRW@hera.home.vuxu.org>
Subject: Re: [musl] preadv2/pwritev2

On Wed, Jan 17, 2024 at 03:08:29PM +0100, Leah Neukirchen wrote:
> Rich Felker wrote:
> > On Tue, Oct 19, 2021 at 07:24:26PM -0700, enh wrote:
> > > i've recently added preadv2(2) and pwritev2(2) wrappers to bionic, since we
> > > had our first real prospective user come along, and they're mildly annoying
> > > to use via syscall(3). unfortunately, this particular user also wants to be
> > > able to compile for the host, and our glibc is years out of date, and our
> > > current plan is to move to musl for the host[1].
> > >
> > > anyway ... musl doesn't have preadv2/pwritev2.
> > > i couldn't see any
> > > discussion on the mailing list, so i thought i'd ask whether this is just
> > > because no-one's got round to it yet, or there's some policy[2] i'm not
> > > aware of, or what? happy to send a patch if it's just a case of "we haven't
> > > got round to/had a need for it yet".
> > >
> > > ____
> > > 1. TL;DR: being able to statically link without worrying about licensing is
> > > very enticing, and gets us out of a lot of the compatibility issues we have
> > > that made our last glibc update more trouble than it was worth, and means i
> > > have no intention of getting us embroiled in another glibc update.
> > > 2. i've been maintaining bionic for years now, and don't think i've written
> > > down our policy explicitly. this was definitely a borderline case from the
> > > "number of users" perspective, but for me the "annoying to use with
> > > syscall(2)" tipped me over the edge into adding them. amusingly [or not,
> > > depending on how you feel about "bugs you get away with"], it also made me
> > > realize that our pread/pwrite implementations for LP64 were wrong in that
> > > they weren't zeroing the unused half of the register pair. so that was a
> > > bonus :-)
> >
> > There is high level policy for decision-making process for
> > inclusion/exclusion. For new syscalls that are "safe" to use directly
> > via syscall() it's not terribly urgent to take any action, but some
> > like these would benefit from being cancellation points, which makes
> > them somewhat compelling. If we do add them, I want to make sure we
> > don't conflict with glibc's way of exposing them to applications (if
> > they have one yet) -- things like the function signatures and how the
> > flags are exposed. None of this looks hard to get right though. So I
> > think it should be pretty straightforward to add these.
>
> Bumping this, as bcachefs-tools now uses pwritev2.
>
> glibc wraps the syscall with a cancellation point and also tries to
> fall back to pwritev/writev when flags is zero and the original call
> failed with ENOSYS. Vice versa for preadv2.
>
> I didn't bother with the fallback since the call is there since Linux 4.6:
>
> ssize_t pwritev2(int fd, const struct iovec *iov, int count, off_t ofs, int flags)
> {
> 	return syscall_cp(SYS_pwritev2, fd, iov, count,
> 		(long)(ofs), (long)(ofs>>32), flags);
> }

"Since Linux 4.6" isn't really an indication for not needing fallback.
Normally I'd say check !flags first and just use the old syscall, but
here I'm not really sure. Eventually it's going to be the other way
around -- pwrite() needs to be implemented in terms of SYS_pwritev2
once my patch "vfs: add RWF_NOAPPEND flag for pwritev2" is upstream,
because currently it's dangerously misbehaving. At that point I'm not
sure what the right thing to do with the Linux-specific pwritev()
would be, but my leaning would be that it should behave like
flags==RWF_NOAPPEND rather than flags==0.

When that's done, plain pwrite() (and possibly pwritev?) will have to
probe for O_APPEND flag in the fallback case if SYS_pwritev2 returns
-ENOSYS, and issue some kind of error in that case (which it really
should already be doing), so there's a mess of stuff to be done here.

I'm not saying we have to solve all this now, just as context. I think
it will be future-proof if we just use the raw SYS_pwritev syscall in
the !flags case either before or after attempting SYS_pwritev2.

Rich
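
For illustration, a minimal sketch of the "try SYS_pwritev2 first, then
use the raw SYS_pwritev syscall in the !flags case" ordering. This is
not musl's implementation and not a proposed patch. The names
sketch_pwritev2 and sketch_demo are hypothetical. It uses the public
syscall() wrapper, which (unlike the syscall_cp in the quoted code) is
not a cancellation point, and it assumes a 64-bit off_t. Falling back
to writev when the offset is -1 follows the glibc behaviour described
in the quoted message. Passing the ENOSYS error through unchanged when
flags is nonzero is an arbitrary choice made here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical wrapper: SYS_pwritev2 first, raw SYS_pwritev if !flags. */
static ssize_t sketch_pwritev2(int fd, const struct iovec *iov, int count,
                               off_t ofs, int flags)
{
	/* Offset is passed as low/high halves, as in the quoted code.
	 * Assumption: off_t is 64 bits wide. */
	long lo = (long)ofs;
	long hi = (long)((long long)ofs >> 32);

#ifdef SYS_pwritev2
	long r = syscall(SYS_pwritev2, fd, iov, count, lo, hi, flags);
	/* Only an ENOSYS failure with flags == 0 is eligible for fallback. */
	if (r >= 0 || errno != ENOSYS || flags != 0)
		return r;
#else
	/* Headers predate pwritev2: nonzero flags cannot be honoured. */
	if (flags != 0) {
		errno = ENOSYS;
		return -1;
	}
#endif
	/* flags == 0: the older syscalls give the same result.
	 * Offset -1 means "use the current file position", so use writev. */
	if (ofs == -1)
		return syscall(SYS_writev, fd, iov, count);
	return syscall(SYS_pwritev, fd, iov, count, lo, hi);
}

/* Usage check: write 5 bytes at offset 3 with flags == 0, read them back.
 * Returns 0 on success, -1 otherwise. */
static int sketch_demo(void)
{
	char path[] = "/tmp/pwritev2-sketch-XXXXXX";
	char buf[5] = {0};
	struct iovec iov = { .iov_base = "hello", .iov_len = 5 };

	int fd = mkstemp(path);
	assert(fd >= 0);	/* the demo needs a writable /tmp */
	unlink(path);

	ssize_t w = sketch_pwritev2(fd, &iov, 1, 3, 0);
	ssize_t r = pread(fd, buf, sizeof buf, 3);
	close(fd);

	return (w == 5 && r == 5 && memcmp(buf, "hello", 5) == 0) ? 0 : -1;
}
```

The opposite ordering (checking !flags first and skipping SYS_pwritev2
entirely) would only move the flags test above the first syscall.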