From: "Joshua M. Clulow" <josh@sysmgr.org>
To: illumos-discuss <discuss@lists.illumos.org>
Subject: Re: [discuss] shutdown hang with zpools on ISCSI devices
Date: Mon, 5 Aug 2024 14:46:22 -0700
Message-ID: <CAEwA5n+xmXrF-UVMwUGkzqJUMZbxVVrt7eXVJ++6NZxwy0p4eA@mail.gmail.com>
In-Reply-To: <17228910420.eEB8eC.492388@composer.illumos.topicbox.com>
On Mon, 5 Aug 2024 at 13:50, Carsten Grzemba via illumos-discuss
<discuss@lists.illumos.org> wrote:
> I have a problem shutting down a server with a zpool on iSCSI devices. The shutdown gets stuck. In the console log I see the message:
> WARNING: Pool 'zones' has encountered an uncorrectable I/O failure and has been suspended; `zpool clear` will be required before the pool can be written to.
> The message looks to me like the IP connection to the storage server was already terminated before the zpool was unmounted.
> I forced the shutdown with an NMI reset so that I have a crash dump. In the crash dump I see that a zone on the pool in question is still in shutting_down.
> Is it possible to get information from the crash dump, like *svcs* gives on a running system?
Not really easily, no, but you can at least see the tree of processes
that is running; e.g.,
> ::ptree
fffffffffbc93760 sched
fffffeb1cedab008 zpool-extra
fffffeb1c8c2c000 fsflush
fffffeb1c8c2f020 pageout
fffffeb1c8c34018 init
fffffeb38ccb4020 screen
fffffeb5aaddd020 bash
fffffeb1f123e010 init
fffffeb3a3070000 image-builder
fffffeb263501020 pkg
fffffeb1cea71018 sac
fffffeb1ca8a7018 ttymon
...
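For a quick first look at a dump like this, a few standard dcmds can be piped into mdb non-interactively. This is only a sketch: the unix.0 and vmcore.0 file names are assumed (use whatever savecore produced on your system), and the dump_overview function name is made up for illustration.

```shell
# Sketch: print a basic overview of a saved crash dump.
# Usage: dump_overview unix.0 vmcore.0   (file names are assumptions)
dump_overview() {
    # ::status  - summary of the dump, including the panic message
    # ::ptree   - process tree, as in the output above
    # ::zone    - zones known to the kernel, with their state
    printf '%s\n' '::status' '::ptree' '::zone' | mdb "$1" "$2"
}
```

The ::zone output should show which zone is still in shutting_down.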
> How can I see in the crash dump whether the IP interfaces are still working? Normally the SMF service
> svc:/network/iscsi/initiator:default
> should wait for the zones to shut down and reach the state installed.
I'm not sure that it does do that. Looking at the manifest, I don't
see any interaction with zones at all:
https://github.com/illumos/illumos-gate/blob/master/usr/src/cmd/iscsid/iscsi-initiator.xml
There is some special handling in the shutdown code today for
unmounting file systems that are listed in vfstab(5) as being mounted
from iSCSI disks. That obviously doesn't help you with ZFS pools,
though, and I expect the real bug here is that the "umount_iscsi()"
function in the iSCSI initiator method script doesn't do anything to
export ZFS pools that live on iSCSI disks:
https://github.com/illumos/illumos-gate/blob/master/usr/src/cmd/iscsid/iscsi-initiator#L172-L199
How do you _import_ these pools today? It seems likely that pools
that come from iSCSI devices should actually not appear in the global
zpool cache file (i.e., "/etc/zfs/zpool.cache"), but rather should get
imported transiently somehow by the iSCSI initiator service in
"mount_iscsi()". Then they should get exported on the way to
shutdown. In short: importing and exporting those pools should
probably work like vfstab(5) entries with "iscsi" set in the automount
field. Otherwise there does not appear to be any mechanism in place
to handle the network going away as part of shutdown.
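To make that idea concrete, here is a rough sketch of what such import and export helpers might look like. None of this exists in illumos-gate today: the /etc/iscsi-pools list file, the function names, and the idea of calling them from "mount_iscsi()" and "umount_iscsi()" are all assumptions for illustration. The only real interfaces used are "zpool import -o cachefile=none", which imports a pool without recording it in the cache file, and "zpool export".

```shell
# Hypothetical list of pool names that live on iSCSI disks, one per line.
# This file is NOT an existing illumos convention; it is assumed here.
ISCSI_POOLS_FILE=${ISCSI_POOLS_FILE:-/etc/iscsi-pools}
ZPOOL=${ZPOOL:-/usr/sbin/zpool}

# Print the pool names from the list, skipping blank lines and comments.
iscsi_pools() {
    [ -r "$ISCSI_POOLS_FILE" ] || return 0
    while read -r pool _rest; do
        case "$pool" in
        ''|'#'*) continue ;;
        esac
        echo "$pool"
    done < "$ISCSI_POOLS_FILE"
}

# Import each listed pool without adding it to /etc/zfs/zpool.cache,
# so it is not imported again at boot before the initiator is up.
import_iscsi_pools() {
    for pool in $(iscsi_pools); do
        "$ZPOOL" import -o cachefile=none "$pool" ||
            echo "import of $pool failed" >&2
    done
}

# Export each listed pool while the network is still available.
# Returns non-zero if any export failed.
export_iscsi_pools() {
    rc=0
    for pool in $(iscsi_pools); do
        "$ZPOOL" export "$pool" || rc=1
    done
    return $rc
}
```

In this sketch, import_iscsi_pools would run after the initiator has discovered its targets, and export_iscsi_pools would run in the stop method before the sessions are torn down. Zones that live on those pools would still need to be halted first, which this does not attempt to handle.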
Cheers.
--
Joshua M. Clulow
http://blog.sysmgr.org