From: Benjamin Slade
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: ENOSYS/EOPNOTSUPP fallback?
Date: Sun, 11 Jun 2017 14:57:59 -0600
Message-ID: <8760g2utrc.fsf@jnanam.net>
References: <87d1ajdsp8.fsf@utah.edu>
Reply-To: musl@lists.openwall.com
To: Joakim Sindholt
Cc: musl@lists.openwall.com
User-Agent: mu4e 0.9.18; emacs 25.2.1

Thank you for the extensive reply. Just to be clear: I'm just an
end-user of flatpak, &c. As far as I can tell, flatpak is making use of
`ostree` which assumes that the libc will take care of handling `dd`
fallback (I got the impression that flatpak isn't directly calling
`fallocate` itself).

Do you think there's an obvious avenue for following up on this?
Admittedly this is an edge-case that won't necessarily affect musl users
on ext4, but it will affect musl users on zfs (and I believe f2fs).

Do you think `ostree` shouldn't rely on the libc for fallback? Or should
ZFS on Linux implement a fallback for fallocate?

--
Benjamin Slade
`(pgp_fp: ,(21BA 2AE1 28F6 DF36 110A 0E9C A320 BBE8 2B52 EE19))
'(sent by mu4e on Emacs running under GNU/Linux . https://gnu.org )
'(Choose Linux, Choose Freedom . https://linux.com )

On 2017-06-05T06:46:33-0600, Joakim Sindholt wrote:

> On Sun, Jun 04, 2017 at 09:22:27PM -0600, Benjamin Slade wrote:
> > I ran into what is perhaps a weird edge case. I'm running a system with
> > musl that uses a ZFS root fs.
> > When I was trying to install some
> > flatpaks, I got an `fallocate` failure, with no `dd` fallback. Querying
> > the flatpak team, the fallback to `dd` seems to be something which glibc
> > does (and so the other components assume will be taken care).
> >
> > Here is the exchange regarding this issue:
> > https://github.com/flatpak/flatpak/issues/802
>
> To quote the glibc source file linked in the bug:
>
>   /* Minimize data transfer for network file systems, by issuing
>      single-byte write requests spaced by the file system block size.
>      (Most local file systems have fallocate support, so this fallback
>      code is not used there.) */
>
>   /* NFS clients do not propagate the block size of the underlying
>      storage and may report a much larger value which would still
>      leave holes after the loop below, so we cap the increment at
>      4096. */
>
>   /* Write a null byte to every block. This is racy; we currently
>      lack a better option. Compare-and-swap against a file mapping
>      might address local races, but requires interposition of a signal
>      handler to catch SIGBUS. */
>
> Which leaves 2 massive bugs:
> 1) the leaving of unallocated gaps both because of the NFS thing but
> also because other file systems may work on entirely different
> principles that are not accounted for here and
> 2) overwriting data currently being written to the file as it's being
> forcibly allocated (which might be doing nothing, think deduplication).
>
> This is not a viable general solution and furthermore fallocate is
> mostly just an optimization hint. If it's a hard requirement of your
> software I would suggest implementing it in your file system. These
> operations can only be safely implemented in the kernel.
>
> An example:
> MyFS uses write time deduplication on unused blocks (and blocks with all
> zeroes fall under the umbrella of unused). Glibc starts its dance where
> it writes a zero byte to the beginning of each block it perceives and
> for now let's just say it has the right block size.
> MyFS just trashes
> these writes immediately without touching the disk and updates the size
> metadata which gets lazily written at some point. There's only 400k left
> on the disk and your fallocate of 16G will succeed and run exceptionally
> fast to boot, but it will have allocated nothing and your next write
> fails with ENOSPC.
>
> Another example:
> myutil has 2 threads running. One thread is constantly writing things to
> a file. The other thread sometimes writes large chunks of data to the
> file and so it hints the kernel to allocate these large chunks by
> calling fallocate, and only then taking the lock(s) held internally to
> synchronize the threads. The first thread finds it needs to update
> something in the section currently being fallocated by glibc's
> algorithm. Suddenly zero bytes appear at 4k intervals for no discernible
> reason, overwriting the data.
>
> Personally I would look into seeing to it that flatpak only uses
> fallocate as an optimization. The most reliable thing I can think of
> otherwise would be to do the locking necessary (if any) in the program
> and filling the entire target section of the file with data from
> /dev/urandom, but even that may fail spectacularly with transparent
> compression (albeit unlikely).
>
> Hope this was at least somewhat helpful.
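
For reference, the "only use fallocate as an optimization" approach
suggested in the quoted reply could look roughly like the sketch below.
This is an illustration, not code from flatpak, ostree, glibc, or musl:
the helper name try_preallocate is hypothetical, and it assumes Linux
with the fallocate(2) wrapper that libcs expose under _GNU_SOURCE.
EOPNOTSUPP and ENOSYS are treated as "no hint available" rather than as
a failure; other errors are still reported.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical helper (assumption, not taken from any project named in
 * this thread): ask the kernel to reserve [offset, offset + len) and
 * treat "not supported" as a soft miss instead of an error.
 *
 * Returns  1 if the kernel reserved the space,
 *          0 if the file system or kernel cannot do it (hint skipped),
 *         -1 with errno set on any other error (e.g. ENOSPC, EBADF). */
int try_preallocate(int fd, off_t offset, off_t len)
{
    int r;

    /* Retry if a signal interrupts the call. */
    do {
        r = fallocate(fd, 0, offset, len);
    } while (r < 0 && errno == EINTR);

    if (r == 0)
        return 1;

    /* No user-space emulation here: writing bytes into the file from
     * the libc or the application is what the quoted reply warns
     * against, so the hint is simply dropped. */
    if (errno == EOPNOTSUPP || errno == ENOSYS)
        return 0;

    return -1;
}
```

A caller would carry on with ordinary write() calls whether the result
is 1 or 0 and give up only on -1. When the hint was skipped, the later
writes still have to handle ENOSPC themselves.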