From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from tb-mx0.topicbox.com (localhost.local [127.0.0.1]) by tb-mx0.topicbox.com (Postfix) with ESMTP id D67AD2147261 for ; Fri, 9 Aug 2024 10:43:29 -0400 (EDT) (envelope-from gbulfon@sonicle.com) Received: from tb-mx0.topicbox.com (localhost [127.0.0.1]) by tb-mx0.topicbox.com (Authentication Milter) with ESMTP id B0B739159F0; Fri, 9 Aug 2024 10:43:29 -0400 ARC-Seal: i=1; a=rsa-sha256; cv=none; d=topicbox.com; s=arcseal; t= 1723214609; b=GjZdWQZjItY04IGXo8Z1ebWLtrHBR6FnkIDPdpoXrn1PYpSx67 eL5jgUbeALRhvZeDjHjlaeEApfGRKrSEggyIBr5ok3P/01Uv/ZThRJBz4CDiziWT 1w9okygwPRuqNfqsJqAfQ3L4D4O/WrMIzz4UuxEVMCQoFFDSuNTstLgaDlwZi+l9 CIkmtfsKTcwODWnW++wi6uxLROjVbAvur3GSbPUpIvCblO7oF8lKoXUhnkpXXpKY CJeerHfjdXV1P/qMc8d5g1FsshRZhv07jiy2BJkmfApATtzvvzlnWbF02/b5ciAQ dRDTOh/kPUW8fjt4Ila5/LO4/5kGsAoLbx/g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d= topicbox.com; h=date:from:to:message-id:in-reply-to:references :subject:mime-version:content-type; s=arcseal; t=1723214609; bh= 3vrsCOH76tF5Ku351+VGVJOIYt2Gp970Uj1GbKQ6zd8=; b=W0PfHhpr4KKaVkR8 OXgrnf78tByXqADaihH2xaEF6FUmoIez1NBZmr1SB+j4svOL4tg0UY+v/75gnK1H v+ZrAASJfqicCaR50EJ6DECFN3cGUSfR3eNM4b6Z5Dkn/PD+BVlz3/Oqh/0UaTcX VVzeBplNi11d/YPlBjBMTl7f5G/MTCnU72n/VRXbTQ9B5e/aEZ4Es9jpbRoh1T8v 96ewVtMFMaRdNH9JimMSa46gtFtmLuIe+LUDo1Cdwh8GaLbN4X1yYPwl88yypfof QQktBv1bpZ99xVeOn2OaoA/cZPyLfumZPnkC4GofCqL4IxTFHMn9oglxiPzKiIMw 7ALYtw== ARC-Authentication-Results: i=1; tb-mx0.topicbox.com; arc=none (no signatures found); bimi=none (No BIMI records found); dkim=pass (1024-bit rsa key sha256) header.d=sonicle.com header.i=@sonicle.com header.b=fe5tEa2q header.a=rsa-sha256 header.s=dkim x-bits=1024; dmarc=pass policy.published-domain-policy=quarantine policy.applied-disposition=none policy.evaluated-disposition=none (p=quarantine,d=none,d.eval=none) policy.policy-from=p header.from=sonicle.com; iprev=pass smtp.remote-ip=109.168.117.71 (mail.sonicle.com); spf=pass smtp.mailfrom=gbulfon@sonicle.com 
smtp.helo=mail.sonicle.com; x-aligned-from=pass (Address match); x-me-sender=none; x-ptr=pass smtp.helo=mail.sonicle.com policy.ptr=mail.sonicle.com; x-return-mx=pass header.domain=sonicle.com policy.is_org=yes (MX Records found: mail.sonicle.com); x-return-mx=pass smtp.domain=sonicle.com policy.is_org=yes (MX Records found: mail.sonicle.com); x-tls=pass smtp.version=TLSv1.2 smtp.cipher=ECDHE-RSA-AES256-GCM-SHA384 smtp.bits=256/256; x-vs=clean score=-100 state=0 Authentication-Results: tb-mx0.topicbox.com; arc=none (no signatures found); bimi=none (No BIMI records found); dkim=pass (1024-bit rsa key sha256) header.d=sonicle.com header.i=@sonicle.com header.b=fe5tEa2q header.a=rsa-sha256 header.s=dkim x-bits=1024; dmarc=pass policy.published-domain-policy=quarantine policy.applied-disposition=none policy.evaluated-disposition=none (p=quarantine,d=none,d.eval=none) policy.policy-from=p header.from=sonicle.com; iprev=pass smtp.remote-ip=109.168.117.71 (mail.sonicle.com); spf=pass smtp.mailfrom=gbulfon@sonicle.com smtp.helo=mail.sonicle.com; x-aligned-from=pass (Address match); x-me-sender=none; x-ptr=pass smtp.helo=mail.sonicle.com policy.ptr=mail.sonicle.com; x-return-mx=pass header.domain=sonicle.com policy.is_org=yes (MX Records found: mail.sonicle.com); x-return-mx=pass smtp.domain=sonicle.com policy.is_org=yes (MX Records found: mail.sonicle.com); x-tls=pass smtp.version=TLSv1.2 smtp.cipher=ECDHE-RSA-AES256-GCM-SHA384 smtp.bits=256/256; x-vs=clean score=-100 state=0 X-ME-VSCause: gggruggvucftvghtrhhoucdtuddrgeeftddrleeggdejkecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpggftfghnshhusghstghrihgsvgdpuffr tefokffrpgfnqfghnecuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnth hsucdlqddutddtmdenucfjughrpeffhffvkfgjfhfugggtsehmtdersgdttdejnecuhfhr ohhmpefirggsrhhivghlvgcuuehulhhfohhnuceoghgsuhhlfhhonhesshhonhhitghlvg drtghomheqnecuggftrfgrthhtvghrnhepjeffiefgiedugeffgfffhedtueevjeduudfg 
uefhledtleeffeejtedtgfeuieevnecuffhomhgrihhnpehshihsvghvvghnthdrvggtpd hsohhnihgtlhgvrdgtohhmpdhgrggsrhhivghlvggsuhhlfhhonhdrtghomhdpsggrnhgu tggrmhhprdgtohhmpdhorhgrtghlvgdrtghomhdpthhophhitggsohigrdgtohhmnecukf hppedutdelrdduieekrdduudejrdejudenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepihhnvghtpedutdelrdduieekrdduudejrdejuddphhgvlhhopehmrghilhdrsh honhhitghlvgdrtghomhdpmhgrihhlfhhrohhmpeeoghgsuhhlfhhonhesshhonhhitghl vgdrtghomheqpdhnsggprhgtphhtthhopedupdhrtghpthhtohepoeguvghvvghlohhpvg hrsehlihhsthhsrdhilhhluhhmohhsrdhorhhgqe X-ME-VSScore: -100 X-ME-VSCategory: clean Received-SPF: pass (sonicle.com: 109.168.117.71 is authorized to use 'gbulfon@sonicle.com' in 'mfrom' identity (mechanism 'a' matched)) receiver=tb-mx0.topicbox.com; identity=mailfrom; envelope-from="gbulfon@sonicle.com"; helo=mail.sonicle.com; client-ip=109.168.117.71 Received: from mail.sonicle.com (mail.sonicle.com [109.168.117.71]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by tb-mx0.topicbox.com (Postfix) with ESMTPS for ; Fri, 9 Aug 2024 10:43:29 -0400 (EDT) (envelope-from gbulfon@sonicle.com) Received: from www (localhost [127.0.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (No client certificate requested) by mail.sonicle.com (Postfix) with ESMTPS id 26CC8920C9D for ; Fri, 9 Aug 2024 16:43:28 +0200 (CEST) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.sonicle.com 26CC8920C9D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sonicle.com; s=dkim; t=1723214608; bh=3vrsCOH76tF5Ku351+VGVJOIYt2Gp970Uj1GbKQ6zd8=; h=Date:From:To:In-Reply-To:References:Subject; b=fe5tEa2qB9m+WBNDLAPVIJNBNfZpBq1+pt0YcnzTquP8sUCeAhbXVwxLz5kzfMslK OC6FlJVlyktc76HYKL6IQsFf6uXUd8dkHNKV0P2vIlEQ9aNrGCB3e0mSbn9cDnhpqp skgRbg3XNeQTAL9lsdzd7Bl8S2mMX5tJtKisnw2U= Received: from www (www [192.168.222.200]) by www with SMTP (SubEthaSMTP 3.1.7) id LZMTHHC1 for developer@lists.illumos.org; Fri, 09 Aug 2024 16:43:28 +0200 (CEST) Date: Fri, 9 
Aug 2024 16:43:28 +0200 (CEST) From: Gabriele Bulfon To: illumos-developer Message-ID: <148564749.1666.1723214608129@www> In-Reply-To: References: <1321589141.1162.1721656889320@www> Subject: Re: [developer] fmd core dump MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_1664_1099839289.1723214608128" Forwarded-From: Topicbox-Policy-Reasoning: allow: sender is a member Topicbox-Message-UUID: c11aa4b2-565d-11ef-af82-8b69088c7b06 ------=_Part_1664_1099839289.1723214608128 Content-Type: multipart/alternative; boundary="----=_Part_1665_1804164084.1723214608128" ------=_Part_1665_1804164084.1723214608128 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable The problem happened again, but this time the rpool was not yet full. The pstack output shows again the same problem: =C2=A0feed68a5 _lwp_kill (5, 6, 22c4, fef45000, fef45000, c) + 15 =C2=A0fee68a7b raise =C2=A0 =C2=A0(6) + 2b =C2=A0fee41cde abort =C2=A0 =C2=A0() + 10e =C2=A008079939 fmd_panic (8081400) =C2=A00807994b fmd_panic (8081400) + 12 =C2=A008065394 fmd_alloc (50, 1) + 81 =C2=A00806f6a5 fmd_event_create (1, d1da323a, 1bd4e8f, 0) + 18 =C2=A008073ae3 fmd_module_timeout (fb8ef100, 2a1, d1da323a) + 20 =C2=A00807bd21 fmd_timerq_exec (915db80) + 127 =C2=A00807b299 fmd_thread_start (8131030) + 5b =C2=A0feed1a3b _thrp_setup (fed82a40) + 88 =C2=A0feed1bd0 _lwp_start (fed82a40, 0, 0, 0, 0, 0) =C2=A0 I can't believe this global zone is out of virtual memory, it's running var= ious zones with a lot of processes and they all goes fine. Only fmd here is going panic. What I found is an old issue I even forgot about: an infolog_hival file is = being produced continuously. 
Running a tail -f on it I get a continuous output like: port_address =C2=A0 =C2=A0 =C2=A0 =C2=A0w500304801d0a8808LH PhyIdentifier88 %/pci@0,0/pci8086,2f02@1/pci15d9,808@0(( event_type =C2=A0 =C2=A0 =C2=A0port_broadcast_sesTPclass =C2=A0 =C2=A0 =C2= =A0 3resource.sysevent.EC_hba.ESC_sas_hba_port_broadcast =C2=A0version =C2= =A0__ttl0(__todf=E2=96=92'|=E2=96=92,=E2=96=92=E2=96=92,^C =C2=A0 As I remember, this may go on for some time then it will stop. Any idea? G =C2=A0 =C2=A0 Sonicle S.r.l.=C2=A0:=C2=A0http://www.sonicle.com Music:=C2=A0http://www.gabrielebulfon.com eXoplanets=C2=A0:=C2=A0https://gabrielebulfon.bandcamp.com/album/exoplanets =C2=A0 =C2=A0 Da: Toomas Soome via illumos-developer A: illumos-developer Data: 22 luglio 2024 16.10.42 CEST Oggetto: Re: [developer] fmd core dump On 22. Jul 2024, at 17:01, Gabriele Bulfon via illumos-developer wrote: Hi, I have a couple of systems, installed in 2012 and updated up to illumos= 2019 (will have to update to 2024 later). They periodically (every 3-4 months, sometimes earlier) create a core dump = under /var/fm/fmd. Looks like fmd core dumped, so no email notice is sent, and we end up filli= ng the rpool. I found=C2=A0 this link: https://support.oracle.com/knowledge/Sun%20Microsy= stems/1020519_1.html So here I attach the pstack of one of the dumps. =C2=A0 Any idea? =C2=A0 fmd_alloc() does panic when we are out of memory: =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0if (data =3D=3D NULL) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 fmd_panic("insuffic= ient memory (%u bytes needed)\n", size); You can try adding some more swap space perhaps? 
=C2=A0 rgds, toomas Gabriele =C2=A0 =C2=A0 Sonicle S.r.l.=C2=A0:=C2=A0http://www.sonicle.com Music:=C2=A0http://www.gabrielebulfon.com eXoplanets=C2=A0:=C2=A0https://gabrielebulfon.bandcamp.com/album/exoplanets =C2=A0 illumos / illumos-developer / see discussions + participants + delivery=C2= =A0options Permalink ------=_Part_1665_1804164084.1723214608128 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
The problem happened again, but this time the rpool was not yet full.
The pstack output shows the same problem again:

 feed68a5 _lwp_kill (5, 6, 22c4, fef45000, fef45000, c) + 15
 fee68a7b raise    (6) + 2b
 fee41cde abort    () + 10e
 08079939 fmd_panic (8081400)
 0807994b fmd_panic (8081400) + 12
 08065394 fmd_alloc (50, 1) + 81
 0806f6a5 fmd_event_create (1, d1da323a, 1bd4e8f, 0) + 18
 08073ae3 fmd_module_timeout (fb8ef100, 2a1, d1da323a) + 20
 0807bd21 fmd_timerq_exec (915db80) + 127
 0807b299 fmd_thread_start (8131030) + 5b
 feed1a3b _thrp_setup (fed82a40) + 88
 feed1bd0 _lwp_start (fed82a40, 0, 0, 0, 0, 0)
 
I can't believe this global zone is out of virtual memory: it's running various zones with a lot of processes, and they all run fine.
Only fmd is panicking here.
What I found is an old issue I had even forgotten about: an infolog_hival file is being written continuously.
Running tail -f on it, I get continuous output like:

port_address        w500304801d0a8808LH
PhyIdentifier88 %/pci@0,0/pci8086,2f02@1/pci15d9,808@0((
event_type      port_broadcast_sesTPclass       3resource.sysevent.EC_hba.ESC_sas_hba_port_broadcast  version  __ttl0(__todf▒'|▒,▒▒,^C
 
As I remember, this may go on for some time, then it will stop.

Any idea?
G
 
 



From: Toomas Soome via illumos-developer <developer@lists.illumos.org>
To: illumos-developer <developer@lists.illumos.org>
Date: 22 July 2024 16:10:42 CEST
Subject: Re: [developer] fmd core dump




On 22. Jul 2024, at 17:01, Gabriele Bulfon via illumos-developer <developer@lists.illumos.org> wrote:
Hi, I have a couple of systems, installed in 2012 and updated up to illumos 2019 (will have to update to 2024 later).
They periodically (every 3-4 months, sometimes earlier) create a core dump under /var/fm/fmd.
Looks like fmd core dumped, so no email notice is sent, and we end up filling the rpool.
So here I attach the pstack of one of the dumps.
 
Any idea?

 
fmd_alloc() does panic when we are out of memory:
 

        if (data == NULL)
                fmd_panic("insufficient memory (%u bytes needed)\n", size);

You can try adding some more swap space perhaps?
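
To make the failure path above concrete, here is a minimal sketch of the allocate-or-panic pattern. It is not the fmd source: sketch_alloc() and sketch_panic() are hypothetical stand-ins, and plain malloc() is an assumption used only to keep the example self-contained. If the first argument of the fmd_alloc (50, 1) frame is the size and pstack prints it in hex, the failing request would be only 0x50 = 80 bytes, so the allocator could not satisfy even a small request at that moment.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a panic routine: print the message and
 * abort(). Ending in abort() is what would put abort, raise and
 * _lwp_kill on top of the stack, as in the pstack output.
 */
void
sketch_panic(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	(void) vfprintf(stderr, fmt, ap);
	va_end(ap);
	abort();
}

/*
 * Hypothetical allocation wrapper: callers never see NULL, because a
 * failed allocation terminates the process instead of returning.
 * malloc() is an assumption for this sketch only.
 */
void *
sketch_alloc(size_t size)
{
	void *data = malloc(size);

	if (data == NULL)
		sketch_panic("insufficient memory (%zu bytes needed)\n", size);

	return (data);
}
```

With this pattern a single failed allocation, even a small one, takes the whole daemon down and leaves a core file, which would match the periodic cores under /var/fm/fmd. It does not by itself say whether the cause is system-wide memory pressure or growth inside the process.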

 
rgds,
toomas

Gabriele
 
<core.fmd.dump.pstack.txt>

------=_Part_1665_1804164084.1723214608128-- ------=_Part_1664_1099839289.1723214608128--