From: Joakim Sindholt
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: ENOSYS/EOPNOTSUPP fallback?
Date: Mon, 5 Jun 2017 14:46:33 +0200
Message-ID: <20170605124633.GC1214367@wirbelwind>
References: <87d1ajdsp8.fsf@utah.edu>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com, slade@jnanam.net

On Sun, Jun 04, 2017 at 09:22:27PM -0600, Benjamin Slade wrote:
> I ran into what is perhaps a weird edge case. I'm running a system with
> musl that uses a ZFS root fs. When I was trying to install some
> flatpaks, I got an `fallocate` failure, with no `dd` fallback. Querying
> the flatpak team, the fallback to `dd` seems to be something which glibc
> does (and so the other components assume will be taken care).
>
> Here is the exchange regarding this issue:
> https://github.com/flatpak/flatpak/issues/802

To quote the glibc source file linked in the bug:

/* Minimize data transfer for network file systems, by issuing
   single-byte write requests spaced by the file system block size.
   (Most local file systems have fallocate support, so this fallback
   code is not used there.) */

/* NFS clients do not propagate the block size of the underlying
   storage and may report a much larger value which would still
   leave holes after the loop below, so we cap the increment at
   4096. */

/* Write a null byte to every block. This is racy; we currently
   lack a better option. Compare-and-swap against a file mapping
   might address local races, but requires interposition of a signal
   handler to catch SIGBUS. */

Which leaves 2 massive bugs:
1) unallocated gaps can remain, both because of the NFS thing and
   because other file systems may work on entirely different principles
   that are not accounted for here, and
2) data that is concurrently being written to the file can be
   overwritten while the range is being forcibly allocated (and the
   forced allocation might itself be doing nothing, think deduplication).

This is not a viable general solution, and furthermore fallocate is
mostly just an optimization hint. If it's a hard requirement of your
software I would suggest implementing it in your file system. These
operations can only be safely implemented in the kernel.

An example:
MyFS uses write-time deduplication on unused blocks (and blocks with all
zeroes fall under the umbrella of unused). Glibc starts its dance where
it writes a zero byte to the beginning of each block it perceives, and
for now let's just say it has the right block size. MyFS just trashes
these writes immediately without touching the disk and updates the size
metadata, which gets lazily written at some point. There's only 400k
left on the disk, and your fallocate of 16G will succeed and run
exceptionally fast to boot, but it will have allocated nothing and your
next write fails with ENOSPC.
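
To make those failure modes concrete, here is a rough sketch of the kind of user-space fallback the quoted comments describe. It is not glibc's actual code; the names touch_blocks and fallocate_with_fallback are made up for illustration, and it assumes a Linux system where fallocate() is exposed under _GNU_SOURCE. The comments mark where the race and the "allocated nothing" problem come from.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, NOT glibc's code: touch one byte per block in
 * [offset, offset + len) and hope the file system backs it with storage.
 * Returns 0 on success or an errno value on failure. */
static int touch_blocks(int fd, off_t offset, off_t len)
{
    struct stat st;

    assert(offset >= 0 && len > 0);
    if (fstat(fd, &st) != 0)
        return errno;

    /* Step by the reported block size, capped at 4096 as in the quoted
     * comment. Nothing guarantees this matches the real allocation unit. */
    off_t step = st.st_blksize;
    if (step <= 0 || step > 4096)
        step = 4096;

    off_t end = offset + len;
    if (st.st_size < end && ftruncate(fd, end) != 0)
        return errno;

    /* Visit offset itself, then the start of every following block. */
    for (off_t pos = offset; pos < end; pos = (pos / step + 1) * step) {
        if (pos < st.st_size) {
            char c = 0;
            ssize_t r = pread(fd, &c, 1, pos);
            if (r < 0)
                return errno;
            if (r == 1 && c != 0)
                continue; /* byte already holds data, leave it alone */
            /* Race window: another writer may store a byte at pos right
             * here, and the pwrite below then replaces it with zero. */
        }
        /* A deduplicating or compressing file system may accept this
         * zero byte without reserving any space at all. */
        if (pwrite(fd, "", 1, pos) != 1)
            return errno;
    }
    return 0;
}

/* Hypothetical wrapper: ask the kernel first, emulate only when the
 * file system or kernel reports that fallocate is unsupported. */
static int fallocate_with_fallback(int fd, off_t offset, off_t len)
{
    if (fallocate(fd, 0, offset, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP && errno != ENOSYS)
        return errno;
    return touch_blocks(fd, offset, len);
}
```

Note that a return value of 0 from touch_blocks only means the writes were accepted, which is exactly the problem: it says nothing about whether space was reserved.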

Another example:
myutil has 2 threads running. One thread is constantly writing things to
a file. The other thread sometimes writes large chunks of data to the
file, so it hints the kernel to allocate these large chunks by calling
fallocate, and only then takes the lock(s) held internally to
synchronize the threads. The first thread finds it needs to update
something in the section currently being fallocated by glibc's
algorithm. Suddenly zero bytes appear at 4k intervals for no discernible
reason, overwriting the data.

Personally I would look into seeing to it that flatpak only uses
fallocate as an optimization. The most reliable thing I can think of
otherwise would be to do the locking necessary (if any) in the program
and fill the entire target section of the file with data from
/dev/urandom, but even that may fail spectacularly with transparent
compression (though that is unlikely).

Hope this was at least somewhat helpful.
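
For what it's worth, here is a minimal sketch of what "only uses fallocate as an optimization" could look like in an application. This is an illustration under assumptions, not flatpak's code: the helper names preallocate_hint, write_all and store_chunk are hypothetical, and the choice to ignore only EOPNOTSUPP and ENOSYS is mine. The point is that an unsupported preallocation is not treated as fatal, and running out of space is handled where it can actually be detected, at write time.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to preallocate, but treat "not supported" as
 * success because the call is only a hint. Returns 0 or an errno value.
 * posix_fallocate returns the error number directly; it does not set errno. */
static int preallocate_hint(int fd, off_t offset, off_t len)
{
    assert(offset >= 0 && len > 0);

    int rc = posix_fallocate(fd, offset, len);
    switch (rc) {
    case 0:
        return 0;
    case EOPNOTSUPP: /* file system cannot preallocate (e.g. the ZFS case) */
    case ENOSYS:     /* kernel lacks the system call */
        return 0;    /* carry on without the optimization */
    default:
        return rc;   /* real errors such as ENOSPC or EBADF */
    }
}

/* Hypothetical helper: write the whole buffer, retrying short writes.
 * Returns 0 or an errno value. */
static int write_all(int fd, const void *buf, size_t len, off_t offset)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, offset);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return errno; /* ENOSPC shows up here whether or not the hint worked */
        }
        if (n == 0)
            return EIO;   /* no progress; avoid looping forever */
        p += n;
        len -= (size_t)n;
        offset += n;
    }
    return 0;
}

/* Hypothetical caller: hint first, then do the real work. */
static int store_chunk(int fd, const void *buf, size_t len, off_t offset)
{
    int rc = preallocate_hint(fd, offset, (off_t)len);
    if (rc != 0)
        return rc;
    return write_all(fd, buf, len, offset);
}
```

With this shape the program behaves the same on glibc and musl: on a file system without fallocate support it simply skips the hint, and the caller still has to cope with a failed write either way.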