On Fri, Aug 9, 2024 at 3:43 PM Gabriele Bulfon via illumos-developer <developer@lists.illumos.org> wrote:
The problem happened again, but this time the rpool was not yet full.
The pstack output again shows the same problem:

 feed68a5 _lwp_kill (5, 6, 22c4, fef45000, fef45000, c) + 15
 fee68a7b raise    (6) + 2b
 fee41cde abort    () + 10e
 08079939 fmd_panic (8081400)
 0807994b fmd_panic (8081400) + 12
 08065394 fmd_alloc (50, 1) + 81
 0806f6a5 fmd_event_create (1, d1da323a, 1bd4e8f, 0) + 18
 08073ae3 fmd_module_timeout (fb8ef100, 2a1, d1da323a) + 20
 0807bd21 fmd_timerq_exec (915db80) + 127
 0807b299 fmd_thread_start (8131030) + 5b
 feed1a3b _thrp_setup (fed82a40) + 88
 feed1bd0 _lwp_start (fed82a40, 0, 0, 0, 0, 0)
 
I can't believe this global zone is out of virtual memory: it's running various zones with a lot of processes, and they all run fine.

One thing that occurs to me - how big is the fmd process? As it's 32-bit, it can
only grow to 4G before it can't grow any further.
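
To put numbers on that ceiling, here is a minimal sketch of the arithmetic in C. The function names are hypothetical and not part of fmd, and in practice a 32-bit process gets somewhat less than the full 4G, since the stack, libraries and other mappings share the same address space.

```c
#include <stdint.h>

/* Total bytes addressable with a pointer of the given width in bits. */
uint64_t
addr_space_bytes(unsigned bits)
{
	if (bits >= 64)
		return (UINT64_MAX);	/* avoid an undefined 64-bit shift */
	return ((uint64_t)1 << bits);
}

/*
 * Bytes left before a process of virtual size vsz reaches the
 * theoretical ceiling; 0 if it is already at or past it.
 */
uint64_t
headroom_bytes(uint64_t vsz, unsigned bits)
{
	uint64_t limit = addr_space_bytes(bits);

	return (vsz >= limit ? 0 : limit - vsz);
}
```

For example, a 32-bit process whose virtual size has reached 3G would have at most 1G of headroom left by this estimate.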
 
Only fmd is panicking here.
What I found is an old issue I had even forgotten about: an infolog_hival file is being written continuously.
Running tail -f on it, I get continuous output like:

port_address        w500304801d0a8808LH
PhyIdentifier88 %/pci@0,0/pci8086,2f02@1/pci15d9,808@0((
event_type      port_broadcast_sesTPclass       3resource.sysevent.EC_hba.ESC_sas_hba_port_broadcast  version  __ttl0(__todf▒'|▒,▒▒,^C
 
As I remember, this may go on for some time and then stop.

Any idea?
G
 
 



From: Toomas Soome via illumos-developer <developer@lists.illumos.org>
To: illumos-developer <developer@lists.illumos.org>
Date: 22 July 2024 16:10:42 CEST
Subject: Re: [developer] fmd core dump




On 22. Jul 2024, at 17:01, Gabriele Bulfon via illumos-developer <developer@lists.illumos.org> wrote:
Hi, I have a couple of systems, installed in 2012 and updated up to illumos 2019 (I will have to update to 2024 later).
They periodically (every 3-4 months, sometimes sooner) create a core dump under /var/fm/fmd.
It looks like fmd dumped core, so no email notice is sent, and we end up filling the rpool.
I attach the pstack of one of the dumps.
 
Any idea?

 
fmd_alloc() does panic when we are out of memory:
 

        if (data == NULL)
                fmd_panic("insufficient memory (%u bytes needed)\n", size);

You can try adding some more swap space perhaps?
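
For readers unfamiliar with the pattern, the following is a minimal standalone sketch of an allocate-or-panic wrapper. It is not the fmd source: alloc_or_panic is a hypothetical name, and plain malloc() and abort() stand in for whatever allocator and panic routine fmd actually uses. It only illustrates why a failed allocation ends in abort() and raise(6), as seen in the pstack.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Allocate size bytes or terminate the process.
 * Hypothetical helper for illustration; not an fmd function.
 */
void *
alloc_or_panic(size_t size)
{
	void *data;

	/* malloc(0) may legitimately return NULL, so ask for one byte. */
	if (size == 0)
		size = 1;

	data = malloc(size);
	if (data == NULL) {
		fprintf(stderr, "insufficient memory (%zu bytes needed)\n",
		    size);
		abort();	/* raises SIGABRT, normally leaving a core */
	}
	return (data);
}
```

A caller written this way never checks for NULL, which keeps the calling code simple but means that any allocation failure, however small the request, takes down the whole daemon.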

 
rgds,
toomas

<core.fmd.dump.pstack.txt>



--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/