On 30. Oct 2024, at 20:57, Gary Mills <gary_mills@fastmail.fm> wrote:

I'm not sure if this is a bug or just ZFS being careful, but I got a
panic and reboot while I was doing a "pkg update".  The system
has an AMD 6-core CPU with B550 support hardware.  The next
"pkg update" completed normally, without a panic.  Here's what
I found in /var/adm/messages.  Does it look familiar?

Oct 30 09:14:31 b550 unix: [ID 836849 kern.notice]
Oct 30 09:14:31 b550 ^Mpanic[cpu4]/thread=fffffe2cc9e88780:
Oct 30 09:14:31 b550 genunix: [ID 129249 kern.notice] checksum of cached data doesn't match BP err=50 hdr=fffffe3d478f51c0 bp=fffffe0040433988 abd=fffffe3d478f7cc0 buf=fffffe3b5a6f9000
Oct 30 09:14:31 b550 unix: [ID 100000 kern.notice]
Oct 30 09:14:31 b550 genunix: [ID 655072 kern.notice] fffffe0040433760 zfs:zfs_nfsshare_inited+378b87f0 ()
Oct 30 09:14:31 b550 genunix: [ID 655072 kern.notice] fffffe0040433890 zfs:arc_read+de1 ()
Oct 30 09:14:31 b550 genunix: [ID 655072 kern.notice] fffffe00404338e0 zfs:dbuf_issue_final_prefetch+77 ()
Oct 30 09:14:31 b550 genunix: [ID 655072 kern.notice] fffffe0040433a70 zfs:dbuf_prefetch_impl+502 ()
Oct 30 09:14:31 b550 genunix: [ID 655072 kern.notice] fffffe0040433b20 zfs:dmu_zfetch+2ed ()
Oct 30 09:14:31 b550 genunix: [ID 655072 kern.notice] fffffe0040433bd0 zfs:dmu_buf_hold_array_by_dnode+321 ()
Oct 30 09:14:31 b550 genunix: [ID 655072 kern.notice] fffffe0040433c70 zfs:dmu_read_uio_dnode+54 ()
Oct 30 09:14:31 b550 genunix: [ID 655072 kern.notice] fffffe0040433cc0 zfs:dmu_read_uio_dbuf+51 ()
Oct 30 09:14:31 b550 genunix: [ID 655072 kern.notice] fffffe0040433d60 zfs:zfs_read+19c ()
Oct 30 09:14:31 b550 genunix: [ID 655072 kern.notice] fffffe0040433de0 genunix:fop_read+60 ()
Oct 30 09:14:31 b550 genunix: [ID 655072 kern.notice] fffffe0040433f00 genunix:read+2b5 ()
Oct 30 09:14:31 b550 genunix: [ID 655072 kern.notice] fffffe0040433f10 unix:brand_sys_syscall+1fe ()
Oct 30 09:14:31 b550 unix: [ID 100000 kern.notice]
Oct 30 09:14:31 b550 genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
Oct 30 09:14:31 b550 ahci: [ID 405573 kern.info] NOTICE: ahci0: ahci_tran_reset_dport port 0 reset port
Oct 30 09:14:32 b550 ahci: [ID 405573 kern.info] NOTICE: ahci0: ahci_tran_reset_dport port 1 reset port
Oct 30 09:14:50 b550 genunix: [ID 100000 kern.notice]
Oct 30 09:14:50 b550 genunix: [ID 665016 kern.notice] ^M100% done: 859875 pages dumped,
Oct 30 09:14:50 b550 genunix: [ID 851671 kern.notice] dump succeeded
Oct 30 09:15:34 b550 genunix: [ID 107833 kern.notice] ^MOpenIndiana Hipster 2022.10 Version illumos-806838751b 64-bit



Dan got blown up while running zfs-tests (rsend), and that resulted in me picking one series of updates from OpenZFS concerning dbuf and dmu. There are still a few XXX notes for myself, but so far both debug and non-debug builds have been behaving nicely (the debug build was used to run zfs-tests). I have also seen a panic from arc myself (an ASSERT fired while running zfs-tests on a debug build; that was before the work mentioned above). Most likely I need to pick some arc bits as well.

The current WIP branch is https://github.com/tsoome/illumos-gate/tree/rsend if you would like to test it. The problem with these panics is that they seem to be random, or at least not easily repeatable.

rgds,
toomas