* shutdown hang with zpools on ISCSI devices
From: Carsten Grzemba @ 2024-08-05 20:50 UTC
To: illumos-discuss

I have a problem shutting down a server with zpools on iSCSI devices.  The shutdown gets stuck, and in the console log I see the message:

    WARNING: Pool 'zones' has encountered an uncorrectable I/O failure and
    has been suspended; `zpool clear` will be required before the pool can
    be written to.

The message looks to me as if the IP connection to the storage server was already terminated before the zpool was unmounted.

I forced the shutdown with an NMI reset so that I have a crash dump.  In the crash dump I can see that a zone on the pool in question is still in the shutting_down state.

Is it possible to get information from a crash dump similar to *svcs* on a running system?

How can I see in the crash dump whether the IP interfaces were still working?  Normally the SMF service svc:/network/iscsi/initiator:default should wait for the zones to shut down and reach the "installed" state.  How can I see in the crash dump whether the SMF dependencies were correctly taken into account?

Many thanks!
Carsten
* Re: [discuss] shutdown hang with zpools on ISCSI devices
From: Joshua M. Clulow @ 2024-08-05 21:46 UTC
To: illumos-discuss

On Mon, 5 Aug 2024 at 13:50, Carsten Grzemba via illumos-discuss <discuss@lists.illumos.org> wrote:
> I have a problem shutting down a server with zpools on iSCSI devices.  The shutdown gets stuck, and in the console log I see the message:
> WARNING: Pool 'zones' has encountered an uncorrectable I/O failure and has been suspended; `zpool clear` will be required before the pool can be written to.
> The message looks to me as if the IP connection to the storage server was already terminated before the zpool was unmounted.
> I forced the shutdown with an NMI reset so that I have a crash dump.  In the crash dump I can see that a zone on the pool in question is still in the shutting_down state.
> Is it possible to get information from a crash dump similar to *svcs* on a running system?

Not really easily, no, but you can at least see the tree of processes that is running; e.g.,

    > ::ptree
    fffffffffbc93760  sched
     fffffeb1cedab008  zpool-extra
     fffffeb1c8c2c000  fsflush
     fffffeb1c8c2f020  pageout
     fffffeb1c8c34018  init
      fffffeb38ccb4020  screen
       fffffeb5aaddd020  bash
        fffffeb1f123e010  init
         fffffeb3a3070000  image-builder
          fffffeb263501020  pkg
      fffffeb1cea71018  sac
       fffffeb1ca8a7018  ttymon
    ...

> How can I see in the crash dump whether the IP interfaces were still working?  Normally the SMF service
> svc:/network/iscsi/initiator:default
> should wait for the zones to shut down and reach the "installed" state.

I'm not sure that it does do that.  Looking at the manifest, I don't see any interaction with zones at all:

    https://github.com/illumos/illumos-gate/blob/master/usr/src/cmd/iscsid/iscsi-initiator.xml

There is some special handling in the shutdown code today for unmounting file systems that are listed in vfstab(5) as being mounted from iSCSI disks.  That obviously doesn't help you with ZFS pools, though, and I expect the real bug here is that the "umount_iscsi()" function in that method script doesn't do anything to export ZFS pools that live on iSCSI disks:

    https://github.com/illumos/illumos-gate/blob/master/usr/src/cmd/iscsid/iscsi-initiator#L172-L199

How do you _import_ these pools today?  It seems likely that pools that come from iSCSI devices should actually not appear in the global zpool cache file (i.e., "/etc/zfs/zpool.cache"), but rather should get imported transiently somehow by the iSCSI initiator service in "mount_iscsi()".  Then they should get exported on the way to shutdown.  In short: importing and exporting those pools should probably work like vfstab(5) entries with "iscsi" set in the "mount at boot" field.  Otherwise there does not appear to be any mechanism in place to handle the network going away as part of shutdown.

Cheers.

-- 
Joshua M. Clulow
http://blog.sysmgr.org
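To make the suggestion concrete, a minimal sketch of what an export step in that method script might look like.  This is hypothetical: the current umount_iscsi() does nothing of the sort, and the ISCSI_POOLS variable is invented here to stand in for however the script would learn which pools live on iSCSI devices.

    # Hypothetical addition to umount_iscsi() in the iscsi-initiator
    # method script: export iSCSI-backed pools while the network is
    # still up, so they are not suspended mid-write later in shutdown.
    ISCSI_POOLS="zones"    # assumption: list of iSCSI-backed pools
    for pool in $ISCSI_POOLS; do
            if zpool list "$pool" >/dev/null 2>&1; then
                    zpool export "$pool" ||
                        echo "failed to export pool $pool" >/dev/msglog
            fi
    done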
* Re: [discuss] shutdown hang with zpools on ISCSI devices
From: Carsten Grzemba @ 2024-08-06 5:17 UTC
To: illumos-discuss

On 05.08.24 23:48, "Joshua M. Clulow via illumos-discuss" <discuss@lists.illumos.org> wrote:
> There is some special handling in the shutdown code today for unmounting file systems that are listed in vfstab(5) as being mounted from iSCSI disks.  That obviously doesn't help you with ZFS pools, though, and I expect the real bug here is that the "umount_iscsi()" function in that method script doesn't do anything to export ZFS pools that live on iSCSI disks:
>
>     https://github.com/illumos/illumos-gate/blob/master/usr/src/cmd/iscsid/iscsi-initiator#L172-L199

Interesting; I didn't know that "mount at boot" could have values other than "yes" or "no".

-- 
Carsten Grzemba
Tel.: +49 3677 64740
Mobil: +49 171 9749479
Email: carsten.grzemba@contac-dt.de
contac Datentechnik GmbH
* Re: [discuss] shutdown hang with zpools on ISCSI devices
From: Carsten Grzemba @ 2024-08-06 5:29 UTC
To: illumos-discuss

I have an additional SMF service in place which should export the zpool on shutdown, before iscsi-initiator is gone, and import it on boot, after iscsi-initiator is in place.  But if the shutdown does not work that way, the zpools are still in the ZFS cache at the next boot.

For zones the SMF dependencies should be: the zones service needs the multi-user-server milestone, and iscsi-initiator is a prerequisite for multi-user-server.

But I noticed that zones will also attempt to shut down if I run 'reboot', where I thought no attempt was made to stop any services.
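For reference, a dependency like the one described could be declared roughly as follows.  This is a sketch only, with svc:/site/iscsi-pools:default as a made-up name for such a custom import/export service; the property names follow the standard SMF dependency schema.

    # Make the custom pool service depend on the iSCSI initiator, so
    # that SMF stops it (exporting the pools) before the initiator,
    # and with it the network path to the disks, goes away.
    svccfg -s svc:/site/iscsi-pools:default <<'EOF'
    addpg initiator dependency
    setprop initiator/entities = fmri: svc:/network/iscsi/initiator:default
    setprop initiator/grouping = astring: require_all
    setprop initiator/restart_on = astring: none
    setprop initiator/type = astring: service
    EOF
    svcadm refresh svc:/site/iscsi-pools:default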
* Re: [discuss] shutdown hang with zpools on ISCSI devices
From: Joshua M. Clulow @ 2024-08-06 5:46 UTC
To: illumos-discuss

On Mon, 5 Aug 2024 at 22:29, Carsten Grzemba via illumos-discuss <discuss@lists.illumos.org> wrote:
> I have an additional SMF service in place which should export the zpool on shutdown, before iscsi-initiator is gone, and import it on boot, after iscsi-initiator is in place.  But if the shutdown does not work that way, the zpools are still in the ZFS cache at the next boot.

If you want to do this on your own, then I suspect you would want to set the "cachefile" property to "none".  According to zpool(8) this would prevent the kernel from writing the pool's configuration to the system cache file.  Presumably that would prevent it being imported automatically.  At boot you could explicitly import the pools you want, and at shutdown you could presumably export them before networking goes down.

> But I noticed that zones will also attempt to shut down if I run 'reboot', where I thought no attempt was made to stop any services.

I believe to get that behaviour, you need to use "reboot -q".  As per reboot(8):

    -q    Quick.  Reboot quickly and ungracefully, without shutting
          down running processes first.

Cheers.

-- 
Joshua M. Clulow
http://blog.sysmgr.org
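Spelled out as commands, the approach suggested above would look roughly like this; the pool name "zones" is taken from the original report, and the import path is illustrative:

    # Keep the iSCSI-backed pool out of /etc/zfs/zpool.cache, so the
    # kernel will not try to import it automatically at boot:
    zpool set cachefile=none zones

    # At boot, once the iSCSI initiator is up, import it explicitly
    # (-d restricts the device search to the given directory):
    zpool import -d /dev/dsk zones

    # At shutdown, before the initiator is stopped, export it:
    zpool export zones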
* Re: [discuss] shutdown hang with zpools on ISCSI devices
From: Carsten Grzemba @ 2024-08-06 6:36 UTC
To: illumos-discuss

What features do I lose if I set cachefile=none on a zpool, besides the automatic import at boot?
* Re: [discuss] shutdown hang with zpools on ISCSI devices
From: Joshua M. Clulow @ 2024-08-06 6:46 UTC
To: illumos-discuss

On Mon, 5 Aug 2024 at 23:36, Carsten Grzemba via illumos-discuss <discuss@lists.illumos.org> wrote:
> What features do I lose if I set cachefile=none on a zpool, besides the automatic import at boot?

You ought to test it to make sure it works for you, but according to the zpool(8) manual page, that's the focus of the property.

Cheers.

-- 
Joshua M. Clulow
http://blog.sysmgr.org
* Re: [discuss] shutdown hang with zpools on ISCSI devices
From: Carsten Grzemba @ 2024-08-06 14:27 UTC
To: illumos-discuss

What does the shutdown process do with the zpools?  It is not a zfs unmount or a zpool export; if it were, the pool would be removed from the zpool.cache file.
* Re: [discuss] shutdown hang with zpools on ISCSI devices
From: Joshua M. Clulow @ 2024-08-06 19:04 UTC
To: illumos-discuss

On Tue, 6 Aug 2024 at 07:27, Carsten Grzemba via illumos-discuss <discuss@lists.illumos.org> wrote:
> What does the shutdown process do with the zpools?

Pools that are still imported at the very last point, when the kernel is preparing to reboot, are like any other file system: we try to sync outstanding changes to disk, which usually generates at least a little bit of I/O.  Nothing is unmounted, just made consistent.

The reason this is a challenge for network file systems and block devices, of course, is that we've already brought down the IP stack.  That's why the iSCSI initiator service unmounts file systems from vfstab(5), and why your iSCSI pools likely need to be exported prior to that point.

> It is not a zfs unmount or a zpool export; if it were, the pool would be removed from the zpool.cache file.

Correct.  An explicit export is required to remove a pool from the cache file that was used when it was imported.

-- 
Joshua M. Clulow
http://blog.sysmgr.org
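To verify which pools are still recorded in the cache file, zdb can show you; as I understand it, running it with no arguments prints the cached configurations, though the exact output shape varies between versions:

    # Dump the pool configurations held in /etc/zfs/zpool.cache; a
    # pool that was cleanly exported should no longer appear here.
    zdb

    # Or ask for one pool's cached configuration explicitly:
    zdb -C zones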
* Illumos future and compatibility with Open-ZFS
From: gea @ 2024-09-06 11:53 UTC
To: illumos-discuss

*There are very good reasons to prefer Illumos ZFS over Open-ZFS, like:*

- Stability: especially with OmniOS and its bloody, stable, and long-term-stable branches, with a repository for each
- Ease of use: there is one current Illumos with one current ZFS, not the version chaos of Open-ZFS with a dozen distributions, each with its own problems and update paths to the newest Open-ZFS
- Efficiency: the deep OS integration makes ZFS on Illumos very resource-efficient
- Services like the kernel SMB server or COMSTAR iSCSI: the kernel-based SMB server is for me the only conceivable alternative to a Windows server when it comes to ACL compatibility or simplicity.  Samba is a pain compared to the kernel-based SMB server.

*But now the great BUT*

Open-ZFS is where the music plays, with a lot of new features like ZSTD, dRAID, RAID-Z expansion, Fast Dedup and more to come.  Lacking them means that you can no longer import a current Open-ZFS pool on Illumos and, more importantly, these are killer features in some cases and therefore a criterion for using or not using Illumos.

If you look at the flavours of "Open-ZFS" with independent repositories, there are mainly three:

1. Open-ZFS master (BSD, Linux), currently 2.2.6: no longer a common roof for "open" ZFS development, but the place where development happens

2. Based on an older Open-ZFS and incompatible with the newest Open-ZFS: Illumos and QNAP

3. Open-ZFS on Windows (and OSX), beta/release candidates: this is a fork of Open-ZFS that is updated to Open-ZFS master from release candidate to release candidate.  While I suppose Jorgen Lundman (the maintainer) originally planned a full integration of Windows ZFS directly into Open-ZFS, it seems that he now intends to use Open-ZFS simply as upstream, just like Illumos was upstream for a long time for BSD and Linux.

*I want to ask:*

When I look at the Illumos issue tracker, I see mainly small fixes and hardly any new features, whether Illumos services or Open-ZFS features.  I suppose the number of devs is limited.

*What is the future idea of Illumos?*

- Be like QNAP: do not care about Open-ZFS and add one or the other new feature from time to time?
- Try something like BSD or Windows or OSX: switch to Open-ZFS as full upstream?

I know a switch to Open-ZFS as upstream is not easy and can take a long time, and I am not even sure whether it is really wanted or possible with current resources.  But maybe this is the only option to make Illumos future-proof?

Gea
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: Joshua M. Clulow @ 2024-09-07 8:30 UTC
To: illumos-discuss

On Fri, 6 Sept 2024 at 04:53, gea@napp-it.org <gea@napp-it.org> wrote:
> Open-ZFS is where the music plays, with a lot of new features like ZSTD, dRAID, RAID-Z expansion, Fast Dedup and more to come.  Lacking them means that you can no longer import a current Open-ZFS pool on Illumos and, more importantly, these are killer features in some cases and therefore a criterion for using or not using Illumos.

There is certainly much interesting work occurring in the OpenZFS project!  There is in fact _so much_ work going on that it's frankly difficult to be fully across all of it, and to ensure all of it has received adequate review and testing.  At some point there is also the fact that every change introduces risk, even if it is not obvious what that risk would be.

We try to balance many different competing concerns when we're importing code from the OpenZFS project, or any other source base that we share with; e.g., FreeBSD.  Amongst those concerns is that the OpenZFS development model centres around an unstable master branch, with a stabilisation process to get to periodic release branches.  This is quite different from our own model here; critically, we want every commit that goes into our master branch to be something you could ship to a customer.  If we make a change to the ZFS pool format in a master branch, it's there to stay forever.  In practice, no regressions on master is always something of a stretch goal, but it's one we aspire to!

Importing code from OpenZFS without introducing regressions is a large and time-consuming undertaking.  One must be absolutely sure to understand not just the patch one wants to import, but all of the many, many patches that fly in around that patch and touch other parts of the code.  The impact of importing a patch without importing prior commits that patch depends on, or without importing _subsequent_ fixes that have been made after the patch landed, can be the difference between uneventful success, or rampant and immediate data corruption on users' machines.  It's the file system -- we must be cautious!

> 1. Open-ZFS master (BSD, Linux), currently 2.2.6: no longer a common roof for "open" ZFS development, but the place where development happens

To be clear, it's the place where development _on the OpenZFS project_ happens.  While our rate of change is demonstrably less frenzied, part of that is because at a project level we generally place a higher value on stability (in every sense of the word) over new features; that's what it means to be dependable infrastructure software!

> I want to ask:
> When I look at the Illumos issue tracker, I see mainly small fixes and hardly any new features, whether Illumos services or Open-ZFS features.  I suppose the number of devs is limited.

What sort of features would you like to see?  Flicking back through the last six months of commits, I've attempted to pick out, just based on the synopses, things that are (as far as I recall) new feature work going into the OS:

    16705 Want lifetime extensions for TCP_MD5SIG
    16708 SMBIOS 3.8 Support
    16675 want syncfs(3C)
    16654 nvme: expose internal counters via kstats
    16655 nvme: introduce kstats for admin commands
    16636 want ptsname_r
    16635 want SO_PROTOCOL socket option support
    14237 Want support for pthread_cond_clockwait() and friends
    16624 Want support for FD_CLOFORK and friends
    14736 zfs: Improve sorted scan memory accounting
    16627 Flesh out AMD Turin chip and Zen 5 uarch revisions.
    16076 pwdx should work on core files
    16475 loader.efi: replace comconsole with efiserialio
    13967 diskinfo(8) should show serial numbers in more cases
    14151 truss: truss should recognize VMM ioctl codes
    16547 savecore should report progress when saving compressed dump
    16524 driver for 38xx HBA in illumos
    16491 netcat should support setting the outgoing and minimum TTL
    16501 netcat should support setting IPV6_TCLASS for ipv6 sockets
    16455 want TCP_MD5SIG socket option
    16454 want IP_MINTTL socket option
    15977 smbadm option to show group member SIDs
    15976 List users connected via SMB
    16452 want strerrordesc_np and strerrorname_np
    16459 want emlxs to support Oracle branded LP adapters
    16408 AMD Zen 5 CPC support
    16056 want fmdump ability to AND event property filters
    16423 Import fletcher-4 algorithms from OpenZFS
    14919 tem: implement xenl
    16348 e1000g I219 V17 does not attach
    16347 e1000g LM+V24-27,29 support
    16327 wdc nvme assertion clearing support
    16326 Update NVMe error status codes for 2.x
    16325 nvmeadm: want ability to write log page to raw file
    16324 Want Micron 7300, 7400, 7450 log support

At the end of the day, we're an open source project with a finite set of engineering resources just like any other project.  When feature work happens, it's generally because somebody has a problem they need to solve and they're motivated to do the work.  I work at Oxide, and we have obviously a very strong motivation to keep working on illumos and to ensure the OS continues to meet our goals for stability, security, performance, and features.  The same is generally true of all other contributors, whether it's a company or an unaffiliated community member.

The bug tracker is also not the full picture.  We also have the "illumos Project Discussion" (IPD) repository, in which we try to work through longer term plans that might span many integrations into master:

    https://github.com/illumos/ipd

Anybody is welcome to file and work on bugs, or to write and circulate IPDs, if they have a project they'd like to spearhead or work they'd like to do!

> What is the future idea of Illumos?
> I know a switch to Open-ZFS as upstream is not easy and can take a long time, and I am not even sure whether it is really wanted or possible with current resources.  But maybe this is the only option to make Illumos future-proof?

I don't think there is currently a plan to abandon maintaining our own ZFS and switch to importing OpenZFS each time they cut a release branch.  As long as there is an illumos project -- and, after 14 years, I think it's safe to say that we're in it for the long haul! -- we'll be looking after it, and the rest of the code, as a cohesive whole.

We will continue to import things from OpenZFS where we believe they will not impact the stability of our own software, and we'll continue to do work on our ZFS that may only be of interest to us (e.g., we're the only OS with zones, another subsystem that has integration with ZFS).

As with any truly open source project, you, and indeed anybody else, are welcome to roll up your sleeves and work on the changes that would benefit you and your use cases personally!  We depend on motivated individuals scratching their own itches as much as anything else.  If you can demonstrate genuine motivation and engagement, I know that both the core team and the community at large will do our best to help with the work you want to do.

Cheers!

-- 
Joshua M. Clulow
http://blog.sysmgr.org
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: Gea @ 2024-09-07 11:12 UTC
To: discuss

Thank you for your detailed answer.

I understand the stability aspect and value it very highly, and indeed Illumos ZFS has not seen as many problems, up to and including data loss, in 10 years as one or the other Linux ZFS distribution has in the last year.  Unlike Linux, every Illumos distribution is up to date, with a clear and easy way to update to the newest release or to downgrade.  The OmniOS model of a bloody (current Illumos), a stable (a freeze every 6 months), and a long-term stable branch, each with its own repository, underlines this strong focus on stability.  This is something I don't want to miss, which argues for preserving an independent fork.

Using code from the newest Open-ZFS master is surely not a good idea; the question is how to take over code from a stable branch, e.g. 2.2.6, or better an earlier one with better stability like 2.2.3, or whatever TrueNAS is using as its stable branch.

The real question from a user's view is how to preserve compatibility with a ZFS pool from a current BSD, Debian, Ubuntu or TrueNAS, and when to get important Open-ZFS features like ZSTD, Draid, Raid-Z expansion, Fast Dedup, direct I/O, or higher limits, e.g. on recsize, and much more to come, while valuing stability as item 2.  Stability and compatibility are not either/or.  Find a way to get newer Open-ZFS features, including bugfix commits, with the best achievable stability.  Not having newer ZFS features is not an option.  Have your cake and eat it.

In a perfect world with endless resources you could check, include and maintain every one of these features on your own, including bugfixes as they become known, but is this conceivable?  I suppose not, so the consequence is that you lose compatibility and lack more and more of the newer ZFS features.  In the end, >95% of all ZFS users are now using Open-ZFS, and with that many users the problem rate is not so bad and bugs are fixed in quite a short time.

Maybe a staging model can be one solution: an older, more stable Open-ZFS branch > Illumos testing > Illumos stable (similar to what current Illumos is), just like we see in Open-ZFS or in the OmniOS approach, or in the Open-ZFS world with the Open-ZFS forks on OSX or Windows.  This would allow the newest Open-ZFS features to appear with a delay, but to appear for sure.  Maybe newer long-awaited bits like the recent NFS and SMB improvements would also have a chance to appear in an Illumos testing branch for wider testing.

This is just one idea, but a workable idea for how to follow Open-ZFS development is critical, or the user base will shrink not from year to year but from month to month.  I can see it in my own user base, where most of my former Solaris/OI/OmniOS users from some years ago have switched to Open-ZFS in the meantime, some to QNAP, most to Debian/Proxmox/TrueNAS/Ubuntu.  This is why I also switched my new ZFS client-server "cs" web GUI to Open-ZFS (any OS), without further development in the Solaris/Illumos SE edition besides bugfixes.

Gea
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: Joshua M. Clulow @ 2024-09-07 16:38 UTC
To: illumos-discuss

On Sat, 7 Sept 2024 at 04:12, Gea <gea@napp-it.org> wrote:
> Illumos ZFS has not seen as many problems, up to and including data loss, in 10 years as one or the other Linux ZFS distribution has in the last year.

Right, and this is the most critical metric for us.  Any increase in data loss is unacceptable.

> the question is how to take over code from a stable branch, e.g. 2.2.6, or better an earlier one with better stability like 2.2.3, or whatever TrueNAS is using as its stable branch.

It's worrying to me that you're able to characterise 2.2.3 as having "better stability" than 2.2.6!  Why is that?  Surely each micro release should work better than the previous one in a particular train.

> The real question from a user's view is how to preserve compatibility with a ZFS pool from a current BSD, Debian, Ubuntu or TrueNAS, and when to get important Open-ZFS features like ZSTD, Draid, Raid-Z expansion, Fast Dedup, direct I/O, or higher limits, e.g. on recsize, and much more to come, while valuing stability as item 2.  Stability and compatibility are not either/or.  Find a way to get newer Open-ZFS features, including bugfix commits, with the best achievable stability.  Not having newer ZFS features is not an option.  Have your cake and eat it.

To crib from Wikipedia:

    | The proverb literally means "you cannot simultaneously
    | retain possession of a cake and eat it, too".  Once the
    | cake is eaten, it is gone.  It can be used to say that
    | one cannot have two incompatible things, or that one
    | should not try to have more than is reasonable.

    https://en.wikipedia.org/wiki/You_can%27t_have_your_cake_and_eat_it

I am very much in favour of having new features and bug fixes imported from OpenZFS, and we do import them over time.  It's important for a project to have a strong sense of what it values, which often means prioritising those values when making decisions.  For us, the hierarchy has to be something like this:

    1. not shipping regressions, not creating data loss
    2. importing bug fixes
    3. importing new features and performance improvements

It's something of an aphorism in software engineering at this point, but we're essentially trading between constraints.  If we assume that we can't magically produce extra contributors willing to work on this (i.e., cost/staffing is fixed) then we're trading off between:

    - scope of features (e.g., compatibility with OpenZFS 2.2.6)
    - quality of implementation (e.g., no regressions or data loss!)
    - when the work is likely to be completed

We are unwilling to sacrifice quality, which means we can only pick one other constraint in this triangle.  If the goal is more features, then it will, as you're noticing, take longer to get done.

> Maybe a staging model can be one solution: an older, more stable Open-ZFS branch > Illumos testing > Illumos stable (similar to what current Illumos is), just like we see in Open-ZFS or in the OmniOS approach, or in the Open-ZFS world with the Open-ZFS forks on OSX or Windows.  This would allow the newest Open-ZFS features to appear with a delay, but to appear for sure.  Maybe newer long-awaited bits like the recent NFS and SMB improvements would also have a chance to appear in an Illumos testing branch for wider testing.

There are several challenges with a testing/unstable branch approach in general; e.g.,

    - Who is going to maintain the testing/unstable branch?

    - What are the rules around integration of changes into the
      testing branch?  Obviously there must be fewer rules and lower
      standards, otherwise you'd just go straight to the master branch.

    - Who will be running the testing/unstable bits day-to-day on
      their computers instead of master?

    - What is the process for promoting changes from the
      testing/unstable branch to master?

Ultimately this question is: who is going to do the work (that is already not being done today) to get those changes into the master branch?  If you're saying that you're stepping up and are going to work on porting a current OpenZFS release branch back to illumos, that's certainly exciting news!  I think it would be worth drafting an IPD with your plans to get started, which we can review and help with!

Cheers.

-- 
Joshua M. Clulow
http://blog.sysmgr.org
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: Gea @ 2024-09-07 21:37 UTC
To: discuss

Thank you for bringing the discussion forward.

"Have your cake...": this is a term the Brits used around Brexit in Europe.  While nonsense taken literally, it basically says: look for a way to get something new out of conflicting demands.

As you say, a project must have a focus on certain points, be it stability or compatibility, and yes, stability has priority.  But in the end stability is nothing without compatibility when there are not enough users to justify the whole effort beyond a single company's use case.  This is no different from the end of Sun, which lost focus on users.  Dying virtuously is not a serious option for any OSS project.

From a user's view, I do not see how the current Illumos follows Open-ZFS development, or that important new Open-ZFS features find their way to Illumos in time, so it cannot be more of the same.

As I am not an Illumos dev, this is my personal outside view.  There may be other solutions from an inside view.

Gea

gea@napp-it.org
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: Jorge Schrauwen @ 2024-09-08 8:50 UTC
To: illumos-discuss

Since I am on mobile I'll top post...

Although some of the new features are indeed enticing, I value absence of data loss and stability more.  Once data has been lost, all trust is gone.

Work is a Linux shop and most of my colleagues don't care for ZFS because they tried it and had all sorts of issues/loss of data.  They'd rather use XFS or ext4.  (They don't care much for btrfs either, for the same reason, and prefer LVM to fill the snapshot/zvol gap.)

There is probably a fair share of users that chose illumos because its values align closely with theirs.

I'm sure illumos loses some of the "new" users that try out illumos because of some of these choices (stability, don't break master, ...); those users' values may align more closely with the way OpenZFS does things, and that is fine.  They have the option to use it and seem to do so.

Ultimately picking an OS is always a mix of the features and the values a project has.  I think illumos is uniquely positioned here.

~ sjorge

> On 7 Sep 2024, at 23:39, Gea <gea@napp-it.org> wrote:
>
> From a user's view, I do not see how the current Illumos follows Open-ZFS development, or that important new Open-ZFS features find their way to Illumos in time, so it cannot be more of the same.
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: d @ 2024-09-08 9:55 UTC
To: Jorge Schrauwen via illumos-discuss

This post enticed me to do a little research on ZFS compression, which led me to some interesting info and a question:

    https://indico.fnal.gov/event/16264/contributions/36466/attachments/22610/28037/Zstd__LZ4.pdf

TL;DR, a couple of really interesting bits:

On their test data set, zstd-2 (4.8) soundly beat the other compression levels on performance, with minimal difference in compression ratio.  lz4-1 (145 MB/s) beat lz4-9 (45 MB/s) on speed, with no difference in compression ratio on their data set.

Question: ZFS was written to optimize usage largely for spinning rust.  What would be different in a version of ZFS optimized for NVMe or SSDs?  How much can you tune ZFS to optimize it for SSDs?

Thanks

On 9/8/24 01:50, Jorge Schrauwen via illumos-discuss wrote:
> Although some of the new features are indeed enticing, I value absence of data loss and stability more.  Once data has been lost, all trust is gone.
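For anyone who wants to experiment, the knobs under discussion look roughly like this.  A sketch only, with pool and device names invented; note that zstd-N levels and the ashift option at pool creation come from OpenZFS, and availability on a given illumos distribution may differ:

    # Pick a compression algorithm and level per dataset:
    zfs set compression=zstd-2 tank/data
    zfs set compression=lz4 tank/media

    # See what it actually bought you:
    zfs get compressratio tank/data

    # At pool creation, match ashift to the device sector size
    # (ashift=12 means 4k sectors, 13 means 8k):
    zpool create -o ashift=12 fast mirror c1t0d0 c1t1d0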
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: Gea @ 2024-09-09 10:41 UTC
To: discuss

On 08.09.2024 11:55, d wrote:
> This post enticed me to do a little research on ZFS compression, which led me to some interesting info and a question:
> https://indico.fnal.gov/event/16264/contributions/36466/attachments/22610/28037/Zstd__LZ4.pdf

The simplified result seems to be that lz4 is faster while zstd compresses data more efficiently.

When I look at my main pool, refcompressratio with lz4 is 1.04 for my main data filesystem with mixed data, around 1.02 for filesystems with mainly ISO and media files, and up to 1.12 in rare cases.  With expensive NVMe or hybrid pools with a special vdev, I would opt for zstd and the better compressratio, as performance is better than good enough.  But I have this choice only with Open-ZFS, and have had it there for quite a long time, similar to draid with its many advantages on very large pools.

The upcoming Fast Dedup in Open-ZFS, where you can limit the dedup table size with a quota, place the dedup table not only on a dedicated dedup vdev but also on a special vdev, prune single entries from dedup tables, and cache dedup tables in the ARC, seems a killer feature.  Unlike current dedup, I would expect that you can enable Fast Dedup by default, either with a special vdev, which also massively improves all small-file actions and metadata access, or with a quota of a few GB to limit RAM usage.  Hybrid pools with a special vdev become more and more attractive as a suggested default.  Fast Dedup is beta but close to ready; I can already play with it in the Open-ZFS on Windows release candidate.  Will this appear in Illumos in the foreseeable future?  I doubt it.

The stability aspect, OK.  Software has bugs and you need to maintain it, no matter whether it is Open-ZFS or Illumos ZFS.  There have also been critical bugs in Illumos.  If I remember correctly, it was persistent L2ARC in Illumos where data loss could have happened a few years ago (luckily I was not affected, as the errors could have been silent).

Why does anyone expect that bugs in Illumos ZFS do not happen, or are fixed faster than bugs in Open-ZFS with 20x the user base and number of devs and many large commercial companies behind it?  Even the additional tests when taking over Open-ZFS code, in ever rarer cases, do not guarantee that no bugs remain (while helpful for sure, and better than using Open-ZFS master or the stable branch directly).

Gea
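For a concrete picture, the Open-ZFS commands behind these features look roughly like this.  A sketch with invented pool and device names; the dedup_table_quota property ships with the Fast Dedup work, so it may not exist in the Open-ZFS release you are running:

    # Add a special vdev (metadata and optionally small blocks) to a pool:
    zpool add tank special mirror c2t0d0 c2t1d0

    # Route blocks up to 64K to the special vdev as well:
    zfs set special_small_blocks=64K tank

    # Fast Dedup: cap the size of the dedup table:
    zpool set dedup_table_quota=5G tank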
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: Toomas Soome @ 2024-09-09 11:32 UTC
To: illumos-discuss

> On 9. Sep 2024, at 13:41, Gea <gea@napp-it.org> wrote:
>
> When I look at my main pool, refcompressratio with lz4 is 1.04 for my main data filesystem with mixed data, around 1.02 for filesystems with mainly ISO and media files, and up to 1.12 in rare cases.  With expensive NVMe or hybrid pools with a special vdev, I would opt for zstd and the better compressratio, as performance is better than good enough.  But I have this choice only with Open-ZFS, and have had it there for quite a long time, similar to draid with its many advantages on very large pools.

    tsoome@beastie:~$ zfs get compression rpool/ROOT/openindiana-2024:08:23
    NAME                               PROPERTY     VALUE  SOURCE
    rpool/ROOT/openindiana-2024:08:23  compression  zstd   inherited from rpool
    tsoome@beastie:~$ zfs get compressratio rpool/ROOT/openindiana-2024:08:23
    NAME                               PROPERTY       VALUE  SOURCE
    rpool/ROOT/openindiana-2024:08:23  compressratio  2.75x  -
    tsoome@beastie:~$ zfs get compression rpool/code/illumos-gate
    NAME                     PROPERTY     VALUE  SOURCE
    rpool/code/illumos-gate  compression  zstd   local
    tsoome@beastie:~$ zfs get compressratio rpool/code/illumos-gate
    NAME                     PROPERTY       VALUE  SOURCE
    rpool/code/illumos-gate  compressratio  3.32x  -
    tsoome@beastie:~$

(I probably still do have some lz4 blocks lurking around.)

rgds,
toomas
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: Volker A. Brandt @ 2024-09-09 9:29 UTC
To: illumos-discuss

Jorge Schrauwen via illumos-discuss writes:
> Work is a Linux shop and most of my colleagues don't care for ZFS because they tried it and had all sorts of issues/loss of data.  They'd rather use XFS or ext4.  (They don't care much for btrfs either, for the same reason, and prefer LVM to fill the snapshot/zvol gap.)

That is not my experience.  Most Linux power users I meet know and like ZFS.  Usually, what prevents them from using it are corporate concerns about "commercial support", similar to IBM/RHEL.

OpenZFS is actually quite popular, as it is an integral part of Proxmox.

Cheers -- Volker

-- 
------------------------------------------------------------------------
Volker A. Brandt        Consulting and Support for Solaris-based Systems
Brandt & Brandt Computer GmbH              WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY Email: vab@bb-c.de
HR: Amtsgericht Bonn, HRB 10513            Mastodon: @vab@bonn.social
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: bronkoo @ 2024-09-09 9:49 UTC
To: illumos-discuss

On Monday, 9 September 2024, at 11:29 AM, Volker A. Brandt wrote:
> That is not my experience.  Most Linux power users I meet know and like ZFS.  Usually, what prevents them from using it are corporate concerns about "commercial support", similar to IBM/RHEL.

+1
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: Jorgen Lundman @ 2024-09-09 11:13 UTC
To: illumos-discuss

> 3. Open-ZFS on Windows (and OSX), beta/release candidates: this is a fork of Open-ZFS that is updated to Open-ZFS master from release candidate to release candidate.  While I suppose Jorgen Lundman (the maintainer) originally planned a full integration of Windows ZFS directly into Open-ZFS, it seems that he now intends to use Open-ZFS simply as upstream, just like Illumos was upstream for a long time for BSD and Linux.

I'm not really part of the main thrust of this discussion, but allow me to make a comment or two.

It was more that the PRs for macOS and Windows were 4 and 3 years old, and it is a "non-zero" amount of effort to keep them synchronised all the time.  Not a complaint, just how it is.  It can't just be me (on this side) who wants to merge; the other side of the aisle needs to want it as well, or it isn't for the greater good of the project.  So I removed (or did I just stop updating?) the PRs for them.  I could revisit this again in the future when things are more quiet, but I have noticed that it is never "more quiet".

It is worth mentioning that being "just downstream" of the main project is not without its quirks.  In particular, whenever a PR is merged (more so if it changes things in the os/ directory) I have to learn and understand each PR fully, to be able to add similar code to my os/ entries.  Whereas the first-class platforms, Linux and FreeBSD, get that consideration during development of the PR (or it won't be merged), and assistance with either platform, if needed.  And fully tested before merge.  So I have to pretty much know all PRs added, which tends to mean merging with upstream is not as fast as it could be, and sometimes they are delayed considerably, which then makes it a bit more hassle to chase down the developers if I need to, or to point out how much it breaks for me.

Secondly, I have to merge and re-merge some files constantly, as in manually add in my lines/changes, for example the assembly files or the zfs-tests environment map file.  It cannot be automated, unless I hook it up to ChatGPT or something.  But I have redone some of these changes so many times that it is now tempting to make the code less portable, as in, just have my own versions of the files, and hope I don't miss any crucial changes in the future.

Still, it would be fun, if easy, to do an illumos port of OpenZFS; but sadly, I do not have time due to the 2 ports I currently maintain.

Cheers,

Jorgen
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: Gea @ 2024-09-09 12:27 UTC
To: discuss

Hello Jorgen

I admire and follow your work on the Open-ZFS forks for OSX and Windows, and at the least it shows that it is possible to keep a fork up to date even with limited resources, probably more so than a more and more diverging ZFS on Illumos.  A quite stable and usable release seems not too far away, which means a good future for Open-ZFS on OSX and Windows.  Open-ZFS on Windows was the reason for me to look at Open-ZFS more seriously and to open my Solaris/Illumos web GUI to ZFS servers on any OS.

I suppose a few years ago it would have been no problem to include Illumos as a first-class citizen in Open-ZFS, together with BSD and Linux.  I am not sure whether this is still an option (the same goes for OSX and Windows integration in Open-ZFS master), but a fork is not only bad: not every new bug is included automatically, although the work must be done regardless, either before or after a commit.

Gea

> I'm not really part of the main thrust of this discussion, but allow me to make a comment or two.
>
> ...
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: gea @ 2024-09-09 14:47 UTC
To: discuss

Some impressions from an outside view:

    https://www.reddit.com/r/zfs/comments/1fcohrr/prefer_zfs_stability_of_illumos_zfs_or_best/
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: Joshua M. Clulow @ 2024-09-09 19:29 UTC
To: illumos-discuss

On Mon, 9 Sept 2024 at 07:47, gea@napp-it.org <gea@napp-it.org> wrote:
> Some impressions from an outside view:
> https://www.reddit.com/r/zfs/comments/1fcohrr/prefer_zfs_stability_of_illumos_zfs_or_best/

This thread is dragging on a bit, so I figured I would try to be a bit more direct: neither discuss list posts, nor straw polls of Internet fora, will move the needle on any of this.

It's great to ask questions about the project, and we try to answer them as best we can, as in this thread -- but continuing to press and press just ties up more effort and energy in continuing to engage, lest we then be accused of being a dead project for not answering when asked difficult questions!

The crux of the matter is: it's an open source project.  People use it when it helps them (as it does ostensibly many people, including several companies who ship illumos in part to use the included ZFS) and people work on it when it helps them enough already to justify adding one or two things that are missing.  Choosing an operating system is like asking a thousand different questions, and then being forced to answer all those questions the same way because you can only pick the one; ZFS may be your most important feature, but for others it might be zones, or DTrace, or networking features, or the ability to continue running older binary software, etc.

If you're looking for a particular outcome, you need to get involved!  Roll up your sleeves and work on it, or pay someone to work on it.  To the extent that things get done, that's how.

Cheers.

PS: a few clarifications:

    - it's "illumos" (we never capitalise the leading "i")
    - we consider ourselves totally separate from Solaris at this point
    - it's "OpenZFS" (no hyphen)
    - illumos just has "ZFS", not an outdated version of OpenZFS

-- 
Joshua M. Clulow
http://blog.sysmgr.org
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: zt958bjm via illumos-discuss @ 2024-12-11 8:47 UTC
To: illumos-discuss

Well, an illumos port of OpenZFS would be a good thing to have.  FreeBSD used to do the same before they officially switched to OpenZFS: there was an OpenZFS port for people who wanted to use it instead of the base system's ZFS.
* Re: [discuss] Illumos future and compatibility with Open-ZFS
From: zt958bjm via illumos-discuss @ 2024-12-11 8:56 UTC
To: illumos-discuss

By the way, does the OpenIndiana installer support forcing a 4k or 8k sector size when the pool is created?  This is easy with FreeBSD and Linux, but as I recall it is too hard to do on the system where ZFS originated.  I recall that I needed to hack /kernel/drv/sd.conf and create the pool manually.  I don't know the current situation with OpenIndiana and the other illumos distributions.

I no longer care much about that now.  I will use Tribblix the next time I play with illumos and just accept the default setting (which I recall is ashift=9).  The virtual disk image is not a real hard disk, so I don't think it will care.
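For the record, the sd.conf hack referred to is, as far as I know, a per-device physical-block-size override.  A sketch, where the inquiry string ("ATA     VBOX HARDDISK": an 8-character vendor field padded with spaces, then the product ID) must be adjusted to match your actual disk:

    # Hypothetical /kernel/drv/sd.conf entry forcing the sd driver to
    # report 4k physical sectors for matching disks, so that a newly
    # created pool gets ashift=12:
    sd-config-list = "ATA     VBOX HARDDISK", "physical-block-size:4096";

    # Then reload the driver configuration (or reboot) before
    # recreating the pool:
    #   update_drv -vf sd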
* Re: [discuss] Illumos future and compatibility with Open-ZFS 2024-12-11 8:56 ` zt958bjm via illumos-discuss @ 2024-12-11 9:01 ` zt958bjm via illumos-discuss 2024-12-11 10:09 ` Jorgen Lundman 0 siblings, 1 reply; 34+ messages in thread From: zt958bjm via illumos-discuss @ 2024-12-11 9:01 UTC (permalink / raw) To: illumos-discuss [-- Attachment #1: Type: text/plain, Size: 438 bytes --] About ZFS on Windows: such an interesting project. But WinBtrfs <https://github.com/maharmstone/btrfs> supports the MinGW-w64 compiler and doesn't force me to use MSVC the way your project does. I hate MSVC. ------------------------------------------ illumos: illumos-discuss Permalink: https://illumos.topicbox.com/groups/discuss/T627f77e1b29a7b53-M84d52cde24dffe7bd4f4e2ce Delivery options: https://illumos.topicbox.com/groups/discuss/subscription [-- Attachment #2: Type: text/html, Size: 958 bytes --] ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [discuss] Illumos future and compatibility with Open-ZFS 2024-12-11 9:01 ` zt958bjm via illumos-discuss @ 2024-12-11 10:09 ` Jorgen Lundman 2024-12-11 12:17 ` zt958bjm via illumos-discuss 0 siblings, 1 reply; 34+ messages in thread From: Jorgen Lundman @ 2024-12-11 10:09 UTC (permalink / raw) To: illumos-discuss [-- Attachment #1: Type: text/plain, Size: 1142 bytes --] Ah well, I use MSVC for the deployment and remote debugging. The project is actually compiled with CMake and clang. You don’t need MSVC at all. Also, I don’t recall forcing anyone to work on the project. Lund > On Dec 11, 2024, at 18:01, zt958bjm via illumos-discuss <discuss@lists.illumos.org> wrote: > > About ZFS on Windows: such an interesting project. But WinBtrfs <https://github.com/maharmstone/btrfs> supports the MinGW-w64 compiler and doesn't force me to use MSVC the way your project does. I hate MSVC. ------------------------------------------ illumos: illumos-discuss Permalink: https://illumos.topicbox.com/groups/discuss/T627f77e1b29a7b53-M402330454bee4f7a10337a1c Delivery options: https://illumos.topicbox.com/groups/discuss/subscription [-- Attachment #2: Type: text/html, Size: 1665 bytes --] ^ permalink raw reply [flat|nested] 34+ messages in thread
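(For the curious, a generic sketch of what a CMake-plus-clang configure/build of that sort looks like — the flags are illustrative, not the project's actual invocation:

    # configure with clang/clang++ instead of MSVC's cl.exe
    cmake -B build -G Ninja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++
    cmake --build build

MSVC then only enters the picture for deployment and remote kernel debugging, as described above.)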
* Re: [discuss] Illumos future and compatibility with Open-ZFS 2024-12-11 10:09 ` Jorgen Lundman @ 2024-12-11 12:17 ` zt958bjm via illumos-discuss 0 siblings, 0 replies; 34+ messages in thread From: zt958bjm via illumos-discuss @ 2024-12-11 12:17 UTC (permalink / raw) To: illumos-discuss [-- Attachment #1: Type: text/plain, Size: 693 bytes --] LLVM on Windows has two ports. The official one is MSVC-based; you need MSVC installed to use it. The other port is based on MinGW-w64, called llvm-mingw. You can find it on GitHub. The Clang you are talking about is definitely the MSVC-based one. I'm not going to work on the project because I know nothing about coding. I enjoy building software from source. I use MSYS2 and MinGW-w64. I don't want to install MSVC. That's it. ------------------------------------------ illumos: illumos-discuss Permalink: https://illumos.topicbox.com/groups/discuss/T627f77e1b29a7b53-M9fcfd1ee8e16d60c10da6824 Delivery options: https://illumos.topicbox.com/groups/discuss/subscription [-- Attachment #2: Type: text/html, Size: 1239 bytes --] ^ permalink raw reply [flat|nested] 34+ messages in thread
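(A quick way to tell the two ports apart is the default target triple that "clang --version" prints — output abbreviated, versions vary:

    $ clang --version        # official MSVC-based build
    Target: x86_64-pc-windows-msvc
    $ clang --version        # llvm-mingw build
    Target: x86_64-w64-windows-gnu

The MSVC-based port links against the MSVC runtime and headers; llvm-mingw uses the MinGW-w64 CRT and needs no Visual Studio installation.)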
* Re: [discuss] Illumos future and compatibility with Open-ZFS 2024-09-07 8:30 ` [discuss] " Joshua M. Clulow 2024-09-07 11:12 ` Gea @ 2024-09-07 14:32 ` Miles Nordin 1 sibling, 0 replies; 34+ messages in thread From: Miles Nordin @ 2024-09-07 14:32 UTC (permalink / raw) To: Joshua M. Clulow via illumos-discuss > the OpenZFS development model centres around an > unstable master branch, with a stabilisation process to get to > periodic release branches. why not just sync to their release branches and ignore their development branch? Then this is not a problem. I suspect other OpenZFS users (BSD?) do exactly that, or do they not? This makes it less gratifying for illumos folk to work on ZFS, since the most efficient way will be to contribute to OpenZFS, not illumos, and see the results in illumos much later. But if you take the long-term view, that is unfortunately the most efficient path. > Importing code from OpenZFS without introducing regressions is a large > and time-consuming undertaking. One must be absolutely sure to > understand not just the patch one wants to import, but all of the > many, many patches that fly in around that patch and touch other parts > of the code. The impact of importing a patch without importing prior > commits that patch depends on, or without importing _subsequent_ fixes why not sync, i.e. take all the patches, and basically become an OpenZFS user? What is the justification for our fork? Is there actually a record of our being more stable? Maybe there is, but I have not noticed it. > It's the file system -- we must be cautious! The divergence is also a problem. I have both illumos and Linux machines, to mitigate certain kinds of risk. Differences in behaviour and in stream compatibility are unwelcome. This is a general thing in free software. Branches cause wasted effort, compatibility problems, and generally add cognitive load, so there's a relatively high bar to justify them. Are you sure it's met? > While our rate of change is demonstrably less frenzied, part > of that is because at a project level we generally place a higher > value on stability (in every sense of the word) over new features; Many of the features have been stability-related, such as resilvering performance or resilvering-never-finishes bugs. You would probably not count that as "stability," but from an ops perspective it is, 110%, because it impacts durability, which is usually more important than availability. > What sort of features would you like to see? Flicking back through > the last six months of commits, I've attempted to pick just based on > the synopses, things that are (as far as I recall) new feature work > going into the OS: the loss of the infiniband stack was a blow to me. It's still there, but it doesn't work any more. It was a blow not to have it, and a blow to spend so much time diagnosing the hangs through which the regression manifested itself. But that's a different topic, and likely a losing argument, since it didn't look well-used by mailing-list search; it was forked from the Mellanox framework, whatever that's called, and it was big. I probably need to just use Linux if I need IB, now that Sun's backing is gone. You're right generally that avoiding regressions is really important, and that trying to have fewer of them is a reason I use OmniOS rather than Linux. It's not possible to get bandwidth back by automating when you are drowning in bespoke regressions that have to be chased down and worked around.
All of my planning now is aimed at making it cheaper to stay current with patches. > I don't think there is currently a plan to abandon maintaining our own > ZFS and switch to importing OpenZFS each time they cut a release > branch. As long as there is an illumos project -- and, after 14 > years, I think it's safe to say that we're in it for the long haul! -- > we'll be looking after it, and the rest of the code, as a cohesive > whole. This is probably a mistake, IMHO. I think what is really driving it is that it's fun to work on ZFS, and it would be less fun to work on OpenZFS than illumos ZFS because of the rude and messy CADT culture. But as subsystems die (e.g. infiniband) and large new subsystems become important (e.g. TCP BBR, which on Linux needs their queueing system, userspace networking, and network cards with fancy clocks; new kinds of virtualization), illumos and its forks will accumulate deal-breaker limitations for larger numbers of sites. I think survival depends on being extremely efficient, and hobby forks are not efficient. With anything open source, mechanical efficiency and the joy of working on the project need to be balanced, and the balance ought to tilt further toward joy than seems sane to someone accustomed to a corporate environment. But I worry that here it's too costly, both the direct cost of the forked ZFS implementations and the indirect cost in that there is not enough developer bandwidth available to keep illumos relevant if any is wasted. That said, thank you for your balanced and well-reasoned defense of the status quo. ^ permalink raw reply [flat|nested] 34+ messages in thread
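(Mechanically, the "sync to their release branches" model proposed above is cheap to try — a sketch, assuming the upstream repository's convention of stable branches named like zfs-2.2-release:

    git remote add openzfs https://github.com/openzfs/zfs.git
    git fetch openzfs
    # inspect only what upstream's stabilisation process has blessed
    git log --oneline openzfs/zfs-2.2-release

The hard part, as the thread discusses, is not fetching the code but reconciling it with illumos's platform differences.)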
* Re: [discuss] shutdown hang with zpools on ISCSI devices 2024-08-05 20:50 shutdown hang with zpools on ISCSI devices Carsten Grzemba 2024-08-05 21:46 ` [discuss] " Joshua M. Clulow @ 2024-08-05 22:45 ` John D Groenveld 2024-08-05 22:55 ` Joshua M. Clulow 2024-08-06 15:56 ` Gordon Ross 2 siblings, 1 reply; 34+ messages in thread From: John D Groenveld @ 2024-08-05 22:45 UTC (permalink / raw) To: illumos-discuss In message <17228910420.eEB8eC.492388@composer.illumos.topicbox.com>, "Carsten Grzemba via illumos-discuss" writes: >I have a problem shutting down a server with Zpool on iscsi devices. The >shutdown gets stuck. In the console log I see the message: ISTR someone shared a service using a dedicated zpool cachefile. It ensured that the network and the iSCSI LUNs were available before import, and that the pools were exported before the network was shut down. John groenveld@acm.org ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [discuss] shutdown hang with zpools on ISCSI devices 2024-08-05 22:45 ` [discuss] shutdown hang with zpools on ISCSI devices John D Groenveld @ 2024-08-05 22:55 ` Joshua M. Clulow 0 siblings, 0 replies; 34+ messages in thread From: Joshua M. Clulow @ 2024-08-05 22:55 UTC (permalink / raw) To: illumos-discuss On Mon, 5 Aug 2024 at 15:46, John D Groenveld via illumos-discuss <discuss@lists.illumos.org> wrote: > In message <17228910420.eEB8eC.492388@composer.illumos.topicbox.com>, "Carsten > Grzemba via illumos-discuss" writes: > >I have a problem shutting down a server with Zpool on iscsi devices. The sh= > >utdown gets stuck. In the console log I see the message: > > ISTR someone shared a service using a dedicated zpool cachefile. > It ensured that the network and the iSCSI LUNs were available before > import and the pools were exported before the network was shutdown. That definitely seems like the kind of facility we would need. Really the base system should provide such a facility, like it does with vfstab(5) for file systems other than ZFS. Cheers. -- Joshua M. Clulow http://blog.sysmgr.org ^ permalink raw reply [flat|nested] 34+ messages in thread
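(A minimal sketch of such a facility — the names and paths here are hypothetical, not an existing illumos service. It assumes the iSCSI-backed pools were tagged with "zpool set cachefile=/etc/zfs/iscsi-zpool.cache <pool>" so they stay out of the global cache, and that the SMF manifest declares a dependency on svc:/network/iscsi/initiator:default so the stop method runs before the initiator is torn down:

    #!/sbin/sh
    # SMF method script: import/export zpools that live on iSCSI LUNs
    . /lib/svc/share/smf_include.sh
    CACHE=/etc/zfs/iscsi-zpool.cache

    case "$1" in
    start)
            # import every pool recorded in the dedicated cachefile
            /usr/sbin/zpool import -c $CACHE -a || exit $SMF_EXIT_ERR_FATAL
            ;;
    stop)
            # export the iSCSI-backed pools while the network is still up
            for p in $(/usr/sbin/zpool list -H -o name,cachefile |
                /usr/bin/awk -v c="$CACHE" '$2 == c { print $1 }'); do
                    /usr/sbin/zpool export "$p"
            done
            ;;
    esac
    exit $SMF_EXIT_OK

The fiddlier half is ensuring the LUNs are actually visible before the import runs, which is what the dependency on the initiator service is meant to provide.)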
* Re: [discuss] shutdown hang with zpools on ISCSI devices 2024-08-05 20:50 shutdown hang with zpools on ISCSI devices Carsten Grzemba 2024-08-05 21:46 ` [discuss] " Joshua M. Clulow 2024-08-05 22:45 ` [discuss] shutdown hang with zpools on ISCSI devices John D Groenveld @ 2024-08-06 15:56 ` Gordon Ross 2 siblings, 0 replies; 34+ messages in thread From: Gordon Ross @ 2024-08-06 15:56 UTC (permalink / raw) To: illumos-discuss We have some improvements in this area, done at RackTop by Albert Lee. Could you open an illumos issue for the shutdown hang? When I get time, I'll try to get the relevant changes out where people can see them, and point to them from that issue. On Mon, Aug 5, 2024 at 4:51 PM Carsten Grzemba via illumos-discuss <discuss@lists.illumos.org> wrote: > > I have a problem shutting down a server with Zpool on iscsi devices. The shutdown gets stuck. In the console log I see the message: > WARNING: Pool 'zones' has encountered an uncorrectable I/O failure and has been suspended; `zpool clear` will be required before the pool can be written to. > The message looks to me like the IP connection to the storage server was already terminated before the Zpool was unmounted. > I forced the shutdown with an NMI reset so that I have a crash dump. In the crash dump I see that a zone on the pool in question is still in shutting_down. > Is it possible to get information from crash dump like *svcs* of a running system? > How can I see in the crash dump whether the IP interfaces are still working? Normally the SMF > svc:/network/iscsi/initiator:default > should wait for the zones to shutdown and reach the state installed. > How can I see in the crash dump that the SMF dependencies are correctly taken into account? > > Many thanks! > Carsten ^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2024-12-11 12:18 UTC | newest] Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2024-08-05 20:50 shutdown hang with zpools on ISCSI devices Carsten Grzemba 2024-08-05 21:46 ` [discuss] " Joshua M. Clulow 2024-08-06 5:17 ` Carsten Grzemba 2024-08-06 5:29 ` Carsten Grzemba 2024-08-06 5:46 ` Joshua M. Clulow 2024-08-06 6:36 ` Carsten Grzemba 2024-08-06 6:46 ` Joshua M. Clulow 2024-08-06 14:27 ` Carsten Grzemba 2024-08-06 15:16 ` Carsten Grzemba 2024-08-06 19:04 ` Joshua M. Clulow 2024-09-06 11:53 ` Illumos future and compatibility with Open-ZFS gea 2024-09-07 8:30 ` [discuss] " Joshua M. Clulow 2024-09-07 11:12 ` Gea 2024-09-07 16:38 ` Joshua M. Clulow 2024-09-07 21:37 ` Gea 2024-09-08 8:50 ` Jorge Schrauwen 2024-09-08 9:55 ` d 2024-09-09 10:41 ` Gea 2024-09-09 11:32 ` Toomas Soome 2024-09-09 9:29 ` Volker A. Brandt 2024-09-09 9:49 ` bronkoo 2024-09-09 11:13 ` Jorgen Lundman 2024-09-09 12:27 ` Gea 2024-09-09 14:47 ` gea 2024-09-09 19:29 ` Joshua M. Clulow 2024-12-11 8:47 ` zt958bjm via illumos-discuss 2024-12-11 8:56 ` zt958bjm via illumos-discuss 2024-12-11 9:01 ` zt958bjm via illumos-discuss 2024-12-11 10:09 ` Jorgen Lundman 2024-12-11 12:17 ` zt958bjm via illumos-discuss 2024-09-07 14:32 ` Miles Nordin 2024-08-05 22:45 ` [discuss] shutdown hang with zpools on ISCSI devices John D Groenveld 2024-08-05 22:55 ` Joshua M. Clulow 2024-08-06 15:56 ` Gordon Ross
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).