* a bit of puzzle (any ideas?)
@ 2024-03-24 8:57 Toomas Soome
2024-04-02 19:15 ` [developer] " Pramod Batni
0 siblings, 1 reply; 2+ messages in thread
From: Toomas Soome @ 2024-03-24 8:57 UTC (permalink / raw)
To: illumos-developer
hi!
I’m investigating one crash case:
panic message: BAD TRAP: type=e (#pf Page fault) rp=fffffe00bc8fe480 addr=88 occurred in module "genunix" due to a NULL pointer dereference
> ::stack
canput+0x1e(0)
clnt_dispatch_send+0x5a(0, fffffe83d9cbff20, fffffe00bc8fe858, 0, 0)
connmgr_getopt_int+0xd2(ffff, 1002, fffffe00bc8fe6c4, fffffe00bc8fe858, fffffe843bbc6450, fffffe00bc8fe700)
connmgr_setbufsz+0x8c()
connmgr_connect+0x207(fffffe8422f34bf8, 0, fffffe843d88b668, 2, fffffe00bc8fe858, fffffe0000000000)
connmgr_get+0x4ab(0, fffffe00bc8fe9e0, fffffe843d88b5c0, 0)
connmgr_wrapget+0x24(0, fffffe00bc8fe9e0, fffffe843d88b5c0, 0)
At first glance it is simple: we do indeed pass a NULL pointer (queue_t *) to canput(), connmgr_getopt_int() and connmgr_connect(), and it originates in connmgr_get().
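As a side note on why the trap reports addr=88 rather than addr=0: reading a member through a NULL structure pointer faults at the member's offset. The sketch below illustrates that arithmetic with a toy structure. The names toy_queue, tq_pad, tq_member and toy_fault_addr are hypothetical, and the layout is not the real queue_t; it is padded only so that one member lands at 0x88.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy stand-in for a queue structure. This is NOT the illumos queue_t
 * layout; the padding is chosen only so that one member sits at 0x88.
 */
struct toy_queue {
	unsigned char	tq_pad[0x88];	/* hypothetical members before 0x88 */
	struct toy_queue *tq_member;	/* hypothetical member at offset 0x88 */
};

/*
 * Address the CPU would touch when reading tq_member through `base`.
 * Computed arithmetically, so nothing is actually dereferenced.
 */
static uintptr_t
toy_fault_addr(uintptr_t base)
{
	return (base + offsetof(struct toy_queue, tq_member));
}
```

With base == 0 the result is 0x88, matching the addr= value in the panic message. Which queue_t member actually lives at that offset is not stated here; on the crash dump it could be checked with mdb (for example the ::offsetof dcmd).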
In connmgr_get(), the work to obtain the queue_t * value starts at:
https://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/rpc/clnt_cots.c?r=f67d64d9&mo=52441&fi=1805#2141
There we create a new connection and set up the stream modules, which ends with the assignment at:
https://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/rpc/clnt_cots.c?r=f67d64d9&mo=52441&fi=1805#2141
So the obvious question is: what scenario ends up with wq == NULL? Since we do reach the connmgr_connect() call that uses this variable, the strioctl() calls before the assignment to wq must have succeeded.
Any ideas?
thanks,
toomas
* Re: [developer] a bit of puzzle (any ideas?)
2024-03-24 8:57 a bit of puzzle (any ideas?) Toomas Soome
@ 2024-04-02 19:15 ` Pramod Batni
0 siblings, 0 replies; 2+ messages in thread
From: Pramod Batni @ 2024-04-02 19:15 UTC (permalink / raw)
To: illumos-developer
This panic does seem strange in the context of the illumos kernel RPC code
you have referenced.
Before pushing the timod stream module, the following code is executed:
    wq = tiptr->fp->f_vnode->v_stream->sd_wrq->q_next;
    cm_entry->x_wq = wq;

(clnt_cots.c, lines 2171-2172:
https://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/rpc/clnt_cots.c?r=f67d64d9&mo=52441&fi=1805#2171)
You might want to verify that the wq value stored in
cm_entry->x_wq is not NULL.
cm_entry is allocated in the routine connmgr_get() and seems to be a
global data structure.
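
To make the suggested check concrete, here is a minimal user-space sketch that walks the same pointer chain one link at a time and reports the first NULL link. The toy_* types and toy_get_wq() are hypothetical stand-ins that borrow only the member names from the assignment above; they are not the kernel's definitions, and this is not code from clnt_cots.c.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins that borrow only the member names from the thread. */
struct toy_queue  { struct toy_queue *q_next; };
struct toy_stdata { struct toy_queue *sd_wrq; };
struct toy_vnode  { struct toy_stdata *v_stream; };
struct toy_file   { struct toy_vnode *f_vnode; };
struct toy_tiuser { struct toy_file *fp; };

/*
 * Walk tiptr->fp->f_vnode->v_stream->sd_wrq->q_next one link at a time.
 * Returns the final queue pointer, or NULL if any link is NULL. When
 * `broken` is non-NULL it receives the name of the first NULL link
 * (or NULL if the whole chain was intact).
 */
static struct toy_queue *
toy_get_wq(const struct toy_tiuser *tiptr, const char **broken)
{
	const char *where = NULL;
	struct toy_queue *wq = NULL;

	if (tiptr == NULL)
		where = "tiptr";
	else if (tiptr->fp == NULL)
		where = "fp";
	else if (tiptr->fp->f_vnode == NULL)
		where = "f_vnode";
	else if (tiptr->fp->f_vnode->v_stream == NULL)
		where = "v_stream";
	else if (tiptr->fp->f_vnode->v_stream->sd_wrq == NULL)
		where = "sd_wrq";
	else if ((wq = tiptr->fp->f_vnode->v_stream->sd_wrq->q_next) == NULL)
		where = "q_next";

	if (broken != NULL)
		*broken = where;
	return (wq);
}
```

One observation that follows from the chain itself: a NULL in any intermediate link would be dereferenced by the assignment and should fault right there, not later in canput(). So if that assignment is where the NULL came from, the last link (q_next) being NULL at that moment looks like the case to examine first. That is an inference from the code shape, not something confirmed from the dump.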
Just to clarify, the panic is seen on a vanilla illumos kernel, right?
(i.e., no modifications to the stock kernel code, and the kernel does not
load or use any third-party kernel driver or kernel module)