* [9fans] Nix/regen: assert triggered; best way to track
@ 2025-01-28 10:21 tlaronde
2025-01-28 15:35 ` Ron Minnich
2025-01-28 15:49 ` Paul Lalonde
0 siblings, 2 replies; 15+ messages in thread
From: tlaronde @ 2025-01-28 10:21 UTC (permalink / raw)
To: 9fans
After fixing the problems behind the compiler warnings (legitimate
warnings, although even the too-short bitwise-negated unsigned 32-bit
values, promoted to 64 bits with their leading bits therefore 0 when
used as masks, were harmless), I now want to look at the stumbling block.
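To make the mask issue concrete, here is a minimal sketch in standard C. It is not code from the Nix tree: PGSZ32, PGSZ64 and the two helpers are made-up names. A mask built by negating a 32-bit unsigned constant is zero-extended when it meets a 64-bit operand, so the upper 32 bits get cleared. That is harmless only while the masked values fit in 32 bits.

```c
#include <stdint.h>

#define PGSZ32 0x1000u			/* hypothetical 32-bit unsigned constant */
#define PGSZ64 UINT64_C(0x1000)		/* same value, 64 bits wide */

/*
 * Mask built at 32 bits: ~(PGSZ32-1) is 0xfffff000, which is
 * zero-extended to 0x00000000fffff000 before the 64-bit AND.
 */
static uint64_t
rounddown32(uint64_t a)
{
	return a & ~(PGSZ32 - 1);
}

/* Mask built at 64 bits: 0xfffffffffffff000, upper bits preserved. */
static uint64_t
rounddown64(uint64_t a)
{
	return a & ~(PGSZ64 - 1);
}
```

For a low address such as 0x123456 both helpers give 0x123000. For a kernel address such as 0xfffffffff0123456 the first gives 0xf0123000 and the second 0xfffffffff0123000.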
For me, under vmx, this is the assert in map.c:17:
assert(pa < KSEG2);
that triggers, and it should come from a call from multiboot.
My first reflex is to start adding printf() instructions to track the
problem, but is there a better way when dealing with the kernel?
Second question: since, if I'm not mistaken, 9front doesn't use
multiboot, is vmx usable with (i.e. agnostic about) the multiboot stuff?
Should the embedded boot code handle things by itself, without load
addresses having to be adjusted because of vmx?
--
Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-M1aa8501df3408a8c92dd8170
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-28 10:21 [9fans] Nix/regen: assert triggered; best way to track tlaronde
@ 2025-01-28 15:35 ` Ron Minnich
2025-01-28 15:49 ` Paul Lalonde
1 sibling, 0 replies; 15+ messages in thread
From: Ron Minnich @ 2025-01-28 15:35 UTC (permalink / raw)
To: 9fans
vmx understands multiboot.
I really dislike asserts. That assert is not helpful: what is the
value of pa? What is the value of KSEG2? So if you want to start
somewhere, turn that into
if (pa>=KSEG2) panic("blah %p blah %p bla bl", pa, KSEG2); // still one line, but actually useful output!
note that panic takes fmt and args.
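A minimal sketch of that suggestion in standard C, so it can be tried outside the kernel. Everything here is an assumption made for illustration: checkpa is a made-up name, the KSEG2 value is the limit printed in the panic output elsewhere in this thread, and snprintf into a buffer stands in for the kernel's panic(fmt, ...).

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed limit, copied from the panic message quoted in this thread. */
#define KSEG2 UINT64_C(0xfffffe0000000000)

/*
 * Same test as the assert, but both operands are reported.
 * Returns 0 if pa is acceptable; otherwise writes a diagnostic
 * into msg and returns -1. In the kernel the failing branch
 * would call panic() with the same format and arguments.
 */
static int
checkpa(uint64_t pa, char *msg, size_t n)
{
	if(pa >= KSEG2){
		snprintf(msg, n, "KADDR: pa %#" PRIx64 " >= KSEG2 %#" PRIx64,
			pa, (uint64_t)KSEG2);
		return -1;
	}
	if(n > 0)
		msg[0] = '\0';
	return 0;
}
```

The point is the one made above: when the check fails, the message carries the offending value and the limit, so no second run is needed to learn them.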
------------------------------------------
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-M4c1ed58c6116ae8cd4516860
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-28 10:21 [9fans] Nix/regen: assert triggered; best way to track tlaronde
2025-01-28 15:35 ` Ron Minnich
@ 2025-01-28 15:49 ` Paul Lalonde
2025-01-28 17:07 ` tlaronde
1 sibling, 1 reply; 15+ messages in thread
From: Paul Lalonde @ 2025-01-28 15:49 UTC (permalink / raw)
To: 9fans
Do you have a stack for the assert, from the ktrace?
------------------------------------------
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-Mdee12446d89ecb5b853663a1
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-28 15:49 ` Paul Lalonde
@ 2025-01-28 17:07 ` tlaronde
2025-01-28 17:18 ` Paul Lalonde
2025-01-28 17:27 ` ori
0 siblings, 2 replies; 15+ messages in thread
From: tlaronde @ 2025-01-28 17:07 UTC (permalink / raw)
To: 9fans
On Tue, Jan 28, 2025 at 07:49:02AM -0800, Paul Lalonde wrote:
> Do you have a stack for the assert, from the ktrace?
>
Yes, and I was wrong: it fails relatively "late" in main.c: at
mpsinit.
Here is the info (I added a bunch of print() before each function call
to know where it stumbled upon an incorrect address):
term% nix/test_vmx
NIX
mmunit...mmuinit: vmstart 0xfffffffff0000000 vmunused 0xfffffffff023d000 vmunmapped 0xfffffffff0400000 vmend 0xfffffffff4000000
sys->pd 0x108003 0x108023
cpu0: mmu l3 pte 0xfffffffff0106ff8 = 107023
cpu0: mmu l2 pte 0xfffffffff0107ff8 = 108023
cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
ioinit... multibootmemassert... kbdinit... meminit...asm: addr 0x0000000004000000 end 0x0000000004000000 type 1 size 0
cm 0: addr 0x4000000 npage 0
0 0 0
npage 0 upage 0 kpage 16384
confinit... archinit... mallocinit...base 0xfffffffff023d000 ptr 0xfffffffff023d000 nunits 4047617
acpiinit... umeminit... trapinit... printinit... i8259init... procinit... mpsinit...panic: cpu0: map.c:KADDR() passed addr fffffffffffffc00 >= fffffe0000000000
panic: cpu0: map.c:KADDR() passed addr fffffffffffffc00 >= fffffe0000000000
dumpstack
ktrace 9k8cpu 0xfffffffff011cdee 0xfffffffff0105d58
estackx 0xfffffffff0106000
0xfffffffff0105c70=0xfffffffff0105da8 0xfffffffff0105c78=0xfffffffff011cb91
0xfffffffff0105c80=0xfffffffff0105c98 0xfffffffff0105c98=0xfffffffff013cff7
0xfffffffff0105cb0=0xfffffffff0105cd0 0xfffffffff0105cc0=0xfffffffff0105ea7
0xfffffffff0105cc8=0xfffffffff0105df3 0xfffffffff0105ce0=0xfffffffff013d14d
0xfffffffff0105d08=0xfffffffff0105d90 0xfffffffff0105d28=0xfffffffff011cdee
0xfffffffff0105d30=0xfffffffff0105da8 0xfffffffff0105d40=0xfffffffff0105d58
0xfffffffff0105d48=0xfffffffff0105da8 0xfffffffff0105d50=0xfffffffff011cdee
0xfffffffff0105d58=0xfffffffff011cb99 0xfffffffff0105d68=0xfffffffff013d50f
0xfffffffff0105d88=0xfffffffff0105ed0 0xfffffffff0105d90=0xfffffffff013cff7
0xfffffffff0105d98=0xfffffffff0105db5 0xfffffffff0105e08=0xfffffffff013d1b8
0xfffffffff0105e10=0xfffffffff0105e00 0xfffffffff0105e20=0xfffffffff0105ea3
0xfffffffff0105e28=0xfffffffff0105e98 0xfffffffff0105e38=0xfffffffff013d1b8
0xfffffffff0105e40=0xfffffffff0105e98 0xfffffffff0105e60=0xfffffffff013d217
0xfffffffff0105e68=0xfffffffff015d9c9 0xfffffffff0105e80=0xfffffffff0105fb8
0xfffffffff0105e90=0xfffffffff015d5d9 0xfffffffff0105ea8=0xfffffffff0105ed0
0xfffffffff0105ec0=0xfffffffff0116a3b 0xfffffffff0105ef8=0xfffffffff012fe55
0xfffffffff0105f08=0xfffffffff01a1afa 0xfffffffff0105f10=0x0000000000000004
0xfffffffff0105f18=0x0000000000000046 0xfffffffff0105f20=0xfffffffff00fffd9
0xfffffffff0105f28=0x0000000000000006 0xfffffffff0105f30=0xfffffffff015d5d9
0xfffffffff0105f38=0xfffffffff0000400 0xfffffffff0105f40=0x0000000000000000
0xfffffffff0105f48=0xfffffffff012fec9 0xfffffffff0105f50=0xfffffffff01a1aff
0xfffffffff0105f58=0x0000000000000208 0xfffffffff0105f60=0x0000000000000124
0xfffffffff0105f68=0xfffffffff01149d0 0xfffffffff0105f70=0x0000000000000006
0xfffffffff0105f78=0xfffffffff0114ba7 0xfffffffff0105f80=0xfffffffff0227510
0xfffffffff0105f88=0xffffffff00000000 0xfffffffff0105f90=0x0000000000000000
0xfffffffff0105f98=0xfffffffff0105fb8 0xfffffffff0105fa0=0x0000000bf0116b0d
0xfffffffff0105fa8=0xfffffffff011622a 0xfffffffff0105fb0=0xffffffff00000400
0xfffffffff0105fb8=0xffffffff00000000 0xfffffffff0105fc0=0x0000000000000000
0xfffffffff0105fc8=0x0000000000000000 0xfffffffff0105fd0=0x0000000000000000
0xfffffffff0105fd8=0x0000000000000000 0xfffffffff0105fe0=0x0000000000000000
0xfffffffff0105fe8=0xfffffffff0110204 0xfffffffff0105ff0=0x000000002badb002
0xfffffffff0105ff8=0x000000000023b000
cpu0: exiting
--
Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
------------------------------------------
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-Mfd99b750d696a1d8ec93a9d9
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-28 17:07 ` tlaronde
@ 2025-01-28 17:18 ` Paul Lalonde
2025-01-28 18:16 ` tlaronde
2025-01-28 17:27 ` ori
1 sibling, 1 reply; 15+ messages in thread
From: Paul Lalonde @ 2025-01-28 17:18 UTC (permalink / raw)
To: 9fans
ktrace can generate a stack for you from that dump. The line starting with
"ktrace" is the command line (you might change 9k8cpu to the path to the
kernel file if you're not in the directory where you built it).
Then the following lines up to but not including the "cpu0: exiting" can be
dropped into ktrace's stdin to have it generate a stack trace. You'll need
to add the ^d at the end if you're cut-and-pasting.
Though it looks like it's just triggering the page fault trap on that
0xfffffffffffffc00 address, which itself looks like a victim of
sign-extension. So back up to the fault and find the source of that
address?
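As an illustration of why that address looks sign-extended, a minimal sketch in standard C (the helper names are made up; nothing here is taken from the Nix source). The 32-bit pattern 0xfffffc00 is -1024 when read as signed. Widening it as a signed value gives exactly 0xfffffffffffffc00, while widening it as unsigned leaves the upper half zero.

```c
#include <stdint.h>

/*
 * Widen a 32-bit pattern to 64 bits as a signed value: the top bit
 * is copied into the upper 32 bits. The uint32_t to int32_t
 * conversion is implementation-defined for large values before C23,
 * but common compilers simply reinterpret the bits.
 */
static uint64_t
widensigned(uint32_t v)
{
	return (uint64_t)(int64_t)(int32_t)v;
}

/* Widen the same pattern as unsigned: the upper 32 bits are zero. */
static uint64_t
widenunsigned(uint32_t v)
{
	return (uint64_t)v;
}
```

With these, widensigned(0xfffffc00) is 0xfffffffffffffc00 and widenunsigned(0xfffffc00) is 0x00000000fffffc00. Whether the kernel value really arises this way is something only the source can confirm.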
Paul
------------------------------------------
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-M550d6760d5cca5e351e89e55
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-28 17:07 ` tlaronde
2025-01-28 17:18 ` Paul Lalonde
@ 2025-01-28 17:27 ` ori
1 sibling, 0 replies; 15+ messages in thread
From: ori @ 2025-01-28 17:27 UTC (permalink / raw)
To: 9fans
now, if you have the ktrace, you can get a stack trace, using the
first command printed in your ktrace:
ktrace 9k8cpu 0xfffffffff011cdee 0xfffffffff0105d58
and then pasting in the numbers.
see ktrace(1) for details.
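To show what "the numbers" look like as data, here is a minimal sketch in standard C that splits one line of the dump into address=value pairs. It is not ktrace's code and says nothing about how ktrace parses its input; Pair and parsepairs are made-up names.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One "address=value" entry from a dumpstack listing. */
typedef struct Pair Pair;
struct Pair {
	uint64_t addr;	/* stack location */
	uint64_t val;	/* word stored there */
};

/*
 * Parse up to max "0xADDR=0xVALUE" entries from one line.
 * Text that is not part of such an entry (quote markers, spaces)
 * is skipped. Returns the number of pairs stored in out.
 */
static int
parsepairs(const char *line, Pair *out, int max)
{
	int n;
	char *end;
	const char *p;

	n = 0;
	p = line;
	while(n < max && (p = strstr(p, "0x")) != NULL){
		out[n].addr = strtoull(p, &end, 16);
		if(*end != '='){
			/* a hex number without "=value": skip it */
			p = end;
			continue;
		}
		out[n].val = strtoull(end + 1, &end, 16);
		n++;
		p = end;
	}
	return n;
}
```

Values that fall inside the kernel text range, such as 0xfffffffff011cb91, are the candidates for return addresses; the rest are saved frame pointers, locals and arguments.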
------------------------------------------
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-M7128ec9427d562f171c62bd0
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-28 17:18 ` Paul Lalonde
@ 2025-01-28 18:16 ` tlaronde
2025-01-28 19:23 ` Paul Lalonde
0 siblings, 1 reply; 15+ messages in thread
From: tlaronde @ 2025-01-28 18:16 UTC (permalink / raw)
To: 9fans
On Tue, Jan 28, 2025 at 09:18:29AM -0800, Paul Lalonde wrote:
> ktrace can generate a stack for you from that dump. The line starting with
> "ktrace" is the command line (you might change 9k8cpu to the path to the
> kernel file if you're not in the directory where you built it).
> Then the following lines up to but not including the "cpu0: exiting" can be
> dropped into ktrace's stdin to have it generate a stack trace. You'll need
> to add the ^d at the end if you're cut-and-pasting.
>
> Though it looks like it's just triggering the page fault trap on that
> 0xfffffffffffffc00 address, which itself looks like a victim of
> sign-extension. So back up to the fault and find the source of that
> address?
Yes:
src(0xfffffffff011cdee); // dumpstack+0x10
src(0xfffffffff013d50f); // panic+0x133
src(0xfffffffff0116a3b); // KADDR+0x55
src(0xfffffffff012fe55); // sigsearch+0xc8
src(0xfffffffff012fec9); // mpsinit+0x14
src(0xfffffffff011622a); // main+0x30b
src(0xfffffffff0110204); // ndnr
This doesn't tell me much more than I already knew: it panics in
mpsinit, calling KADDR in map.c.
During my next wandering under Nix, I will try to trace back where the
offending address comes from or how it is constructed.
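One possible origin for fffffffffffffc00, offered as a hypothesis to check against the source and not as a diagnosis: the Intel MP specification lets the floating pointer live in the last KiB of base memory, and a BIOS stores the base memory size in KiB as a 16-bit word at 0x413. If that word is zero, "top of base memory minus 1024" wraps around in unsigned arithmetic. The sketch below is standard C with made-up names; it does not reproduce the Nix sigsearch.

```c
#include <stdint.h>

/*
 * Start of the last KiB of base memory, given the base memory size
 * in KiB. With basekib == 0 the subtraction wraps around.
 */
static uint64_t
lastkibunchecked(uint16_t basekib)
{
	uint64_t top;

	top = (uint64_t)basekib * 1024;
	return top - 1024;
}

/*
 * Same computation, refusing a size too small to hold one KiB.
 * Returns 0 and stores the address in *pa, or -1 if there is
 * nothing to search.
 */
static int
lastkib(uint16_t basekib, uint64_t *pa)
{
	if(basekib < 1)
		return -1;
	*pa = (uint64_t)basekib * 1024 - 1024;
	return 0;
}
```

lastkibunchecked(0) is 0xfffffffffffffc00, the address in the panic; lastkibunchecked(639) is 0x9f800. Printing the value read from the BIOS data area just before the KADDR call would confirm or rule this out.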
--
Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
------------------------------------------
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-M8f6e866acaf3323d26ac35fb
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-28 18:16 ` tlaronde
@ 2025-01-28 19:23 ` Paul Lalonde
2025-01-28 21:23 ` ron minnich
0 siblings, 1 reply; 15+ messages in thread
From: Paul Lalonde @ 2025-01-28 19:23 UTC (permalink / raw)
To: 9fans
Ah, that's the code path that sent me to QEMU.
Vmx doesn't have any MP tables, which leads to this fault in mpsinit.
Ron provided this minimal one for me, which I think we could learn from to
adapt into vmx. The hacky version of pointing the code directly at
something like this baked in didn't excite me.
50 43 4D 50                       ; "PCMP"
00 00                             ; Table Length (placeholder)
04                                ; Spec Revision
00                                ; Checksum (placeholder)
42 4F 43 48 53 43 50 55           ; "BOCHSCPU"
30 2E 31 20 20 20 20 20 20 20 20  ; "0.1", space padded
00 00 00 00                       ; OEM Table Pointer
00 00                             ; OEM Table Size
14 00                             ; Entry Count (2 CPUs + 18 = 20, little-endian)
00 00 E0 FE                       ; Local APIC Address (0xfee00000)
00 00                             ; Ext Table Length
00                                ; Ext Table Checksum
00                                ; Reserved
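The two placeholders can be filled in mechanically. Below is a minimal sketch in standard C that builds this header and then stores the length and checksum. It is not vmx code, and the function names are made up. It assumes the Intel MP specification layout of a 44-byte header with a 12-byte product ID, so the product ID is padded to that width.

```c
#include <stdint.h>
#include <string.h>

enum { MPHDRLEN = 44 };	/* size of the header laid out above */

/* Sum of all bytes modulo 256; a valid table sums to 0. */
static uint8_t
mpsum(const uint8_t *p, size_t n)
{
	uint8_t s;

	s = 0;
	while(n-- > 0)
		s += *p++;
	return s;
}

/* Fill the header; length and checksum stay 0 until mpseal. */
static void
mphdr(uint8_t *h, uint16_t nentry, uint32_t lapic)
{
	memset(h, 0, MPHDRLEN);
	memcpy(h + 0, "PCMP", 4);	/* signature */
	h[6] = 4;			/* spec revision */
	memcpy(h + 8, "BOCHSCPU", 8);	/* OEM ID */
	memset(h + 16, ' ', 12);	/* product ID, space padded */
	memcpy(h + 16, "0.1", 3);
	h[34] = nentry & 0xff;		/* entry count, little-endian */
	h[35] = nentry >> 8;
	h[36] = lapic & 0xff;		/* local APIC address, little-endian */
	h[37] = (lapic >> 8) & 0xff;
	h[38] = (lapic >> 16) & 0xff;
	h[39] = (lapic >> 24) & 0xff;
}

/* Store the total table length, then the checksum that makes the sum 0. */
static void
mpseal(uint8_t *t, uint16_t len)
{
	t[4] = len & 0xff;
	t[5] = len >> 8;
	t[7] = 0;
	t[7] = (uint8_t)(0x100 - mpsum(t, len));
}
```

With an entry count of 20, the 20 entries have to be appended after the header before calling mpseal, because the length and checksum cover the header plus the entries. The entries themselves are not shown in this thread and are not sketched here.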
On Tue, Jan 28, 2025 at 11:15 AM <tlaronde@kergis.com> wrote:
> On Tue, Jan 28, 2025 at 09:18:29AM -0800, Paul Lalonde wrote:
> > ktrace can generate a stack for you from that dump. The line starting
> > with "ktrace" is the command line (you might change 9k8cpu to the path
> > to the kernel file if you're not in the directory where you built it).
> > Then the following lines up to but not including the "cpu0: exiting"
> > can be dropped into ktrace's stdin to have it generate a stack trace.
> > You'll need to add the ^d at the end if you're cut-and-pasting.
> >
> > Though it looks like it's just triggering the page fault trap on that
> > 0xfffffffffffffc00 address, which itself looks like a victim of
> > sign-extension. So back up to the fault and find the source of that
> > address?
>
> Yes:
>
> src(0xfffffffff011cdee); // dumpstack+0x10
> src(0xfffffffff013d50f); // panic+0x133
> src(0xfffffffff0116a3b); // KADDR+0x55
> src(0xfffffffff012fe55); // sigsearch+0xc8
> src(0xfffffffff012fec9); // mpsinit+0x14
> src(0xfffffffff011622a); // main+0x30b
> src(0xfffffffff0110204); // ndnr
>
> this doesn't tell me much more than what I knew already: it panics in
> mpsinit, calling KADDR in map.c.
>
> During my next wandering under Nix, I will try to track back from
> where the offending address is taken or with what it is constructed.
>
> >
> > On Tue, Jan 28, 2025 at 9:09 AM <tlaronde@kergis.com> wrote:
> >
> > > On Tue, Jan 28, 2025 at 07:49:02AM -0800, Paul Lalonde wrote:
> > > > Do you have a stack for the assert, from the ktrace?
> > > >
> > >
> > > Yes, and I was wrong: it fails relatively "late" in main.c: at
> > > mpsinit.
> > >
> > > Here is the info (I added a bunch of print() before each function call
> > > to know where it stumbled upon an incorrect address):
> > >
> > > term% nix/test_vmx
> > >
> > > NIX
> > > mmunit...mmuinit: vmstart 0xfffffffff0000000 vmunused
> 0xfffffffff023d000
> > > vmunmapped 0xfffffffff0400000 vmend 0xfffffffff4000000
> > > sys->pd 0x108003 0x108023
> > > cpu0: mmu l3 pte 0xfffffffff0106ff8 = 107023
> > > cpu0: mmu l2 pte 0xfffffffff0107ff8 = 108023
> > > cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
> > > cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
> > > ioinit... multibootmemassert... kbdinit... meminit...asm: addr
> > > 0x0000000004000000 end 0x0000000004000000 type 1 size 0
> > > cm 0: addr 0x4000000 npage 0
> > > 0 0 0
> > > npage 0 upage 0 kpage 16384
> > > confinit... archinit... mallocinit...base 0xfffffffff023d000 ptr
> > > 0xfffffffff023d000 nunits 4047617
> > > acpiinit... umeminit... trapinit... printinit... i8259init...
> procinit...
> > > mpsinit...panic: cpu0: map.c:KADDR() passed addr fffffffffffffc00 >=
> > > fffffe0000000000
> > > panic: cpu0: map.c:KADDR() passed addr fffffffffffffc00 >=
> fffffe0000000000
> > >
> > > dumpstack
> > > ktrace 9k8cpu 0xfffffffff011cdee 0xfffffffff0105d58
> > > estackx 0xfffffffff0106000
> > > 0xfffffffff0105c70=0xfffffffff0105da8
> > > 0xfffffffff0105c78=0xfffffffff011cb91
> > > 0xfffffffff0105c80=0xfffffffff0105c98
> > > 0xfffffffff0105c98=0xfffffffff013cff7
> > > 0xfffffffff0105cb0=0xfffffffff0105cd0
> > > 0xfffffffff0105cc0=0xfffffffff0105ea7
> > > 0xfffffffff0105cc8=0xfffffffff0105df3
> > > 0xfffffffff0105ce0=0xfffffffff013d14d
> > > 0xfffffffff0105d08=0xfffffffff0105d90
> > > 0xfffffffff0105d28=0xfffffffff011cdee
> > > 0xfffffffff0105d30=0xfffffffff0105da8
> > > 0xfffffffff0105d40=0xfffffffff0105d58
> > > 0xfffffffff0105d48=0xfffffffff0105da8
> > > 0xfffffffff0105d50=0xfffffffff011cdee
> > > 0xfffffffff0105d58=0xfffffffff011cb99
> > > 0xfffffffff0105d68=0xfffffffff013d50f
> > > 0xfffffffff0105d88=0xfffffffff0105ed0
> > > 0xfffffffff0105d90=0xfffffffff013cff7
> > > 0xfffffffff0105d98=0xfffffffff0105db5
> > > 0xfffffffff0105e08=0xfffffffff013d1b8
> > > 0xfffffffff0105e10=0xfffffffff0105e00
> > > 0xfffffffff0105e20=0xfffffffff0105ea3
> > > 0xfffffffff0105e28=0xfffffffff0105e98
> > > 0xfffffffff0105e38=0xfffffffff013d1b8
> > > 0xfffffffff0105e40=0xfffffffff0105e98
> > > 0xfffffffff0105e60=0xfffffffff013d217
> > > 0xfffffffff0105e68=0xfffffffff015d9c9
> > > 0xfffffffff0105e80=0xfffffffff0105fb8
> > > 0xfffffffff0105e90=0xfffffffff015d5d9
> > > 0xfffffffff0105ea8=0xfffffffff0105ed0
> > > 0xfffffffff0105ec0=0xfffffffff0116a3b
> > > 0xfffffffff0105ef8=0xfffffffff012fe55
> > > 0xfffffffff0105f08=0xfffffffff01a1afa
> > > 0xfffffffff0105f10=0x0000000000000004
> > > 0xfffffffff0105f18=0x0000000000000046
> > > 0xfffffffff0105f20=0xfffffffff00fffd9
> > > 0xfffffffff0105f28=0x0000000000000006
> > > 0xfffffffff0105f30=0xfffffffff015d5d9
> > > 0xfffffffff0105f38=0xfffffffff0000400
> > > 0xfffffffff0105f40=0x0000000000000000
> > > 0xfffffffff0105f48=0xfffffffff012fec9
> > > 0xfffffffff0105f50=0xfffffffff01a1aff
> > > 0xfffffffff0105f58=0x0000000000000208
> > > 0xfffffffff0105f60=0x0000000000000124
> > > 0xfffffffff0105f68=0xfffffffff01149d0
> > > 0xfffffffff0105f70=0x0000000000000006
> > > 0xfffffffff0105f78=0xfffffffff0114ba7
> > > 0xfffffffff0105f80=0xfffffffff0227510
> > > 0xfffffffff0105f88=0xffffffff00000000
> > > 0xfffffffff0105f90=0x0000000000000000
> > > 0xfffffffff0105f98=0xfffffffff0105fb8
> > > 0xfffffffff0105fa0=0x0000000bf0116b0d
> > > 0xfffffffff0105fa8=0xfffffffff011622a
> > > 0xfffffffff0105fb0=0xffffffff00000400
> > > 0xfffffffff0105fb8=0xffffffff00000000
> > > 0xfffffffff0105fc0=0x0000000000000000
> > > 0xfffffffff0105fc8=0x0000000000000000
> > > 0xfffffffff0105fd0=0x0000000000000000
> > > 0xfffffffff0105fd8=0x0000000000000000
> > > 0xfffffffff0105fe0=0x0000000000000000
> > > 0xfffffffff0105fe8=0xfffffffff0110204
> > > 0xfffffffff0105ff0=0x000000002badb002
> > > 0xfffffffff0105ff8=0x000000000023b000
> > > cpu0: exiting
> > >
> > > >
> > > >
> > > > On Tue, Jan 28, 2025 at 6:09 AM <tlaronde@kergis.com> wrote:
> > > >
> > > > > After fixing problems leading to compiler warnings---legitimate
> > > > > warnings, but even the too short binary negated unsigned 32bits
> > > > > values promoted to 64 bits with leading bits hence 0 as mask were
> > > > > harmless---now I want to look at the stumbling block.
> > > > >
> > > > > For me, under vmx, this is the assert in map.c:17:
> > > > >
> > > > > assert(pa < KSEG2);
> > > > >
> > > > > that triggers, and it should come from a call from multiboot.
> > > > >
> > > > > My first reflex is to start adding printf() instructions to track
> the
> > > > > problem, but is there a better way when dealing with the kernel?
> > > > >
> > > > > Second question: since, if I'm not mistaken, 9front doesn't use
> > > > > multiboot, is vmx usable (i.e. agnostic about) with the multiboot
> > > stuff?
> > > > > The embedded boot stuff should handle the thing by itself without
> load
> > > > > addresses having to be adjusted because of vmx?
> > > > > --
> > > > > Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> > > > > http://www.kergis.com/
> > > > > http://kertex.kergis.com/
> > > > > Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006
> F40C
> > >
> > > --
> > > Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> > > http://www.kergis.com/
> > > http://kertex.kergis.com/
> > > Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
>
> --
> Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> http://www.kergis.com/
> http://kertex.kergis.com/
> Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-M4e4ecc487efb27f806e7cd5c
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-28 19:23 ` Paul Lalonde
@ 2025-01-28 21:23 ` ron minnich
2025-01-29 0:02 ` Ron Minnich
0 siblings, 1 reply; 15+ messages in thread
From: ron minnich @ 2025-01-28 21:23 UTC (permalink / raw)
To: 9fans
[-- Attachment #1: Type: text/plain, Size: 10013 bytes --]
I'd be happier to remove the MPS dependency, actually. The MPS is long dead.
But that's a bigger story.
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-M57afe25b2d4f553f0a50f866
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-28 21:23 ` ron minnich
@ 2025-01-29 0:02 ` Ron Minnich
2025-01-29 1:37 ` ron minnich
2025-01-29 12:34 ` tlaronde
0 siblings, 2 replies; 15+ messages in thread
From: Ron Minnich @ 2025-01-29 0:02 UTC (permalink / raw)
To: 9fans
btw, if you run
acid 9k8cpu
you can paste this right into acid
src(0xfffffffff011cdee); // dumpstack+0x10
src(0xfffffffff013d50f); // panic+0x133
src(0xfffffffff0116a3b); // KADDR+0x55
src(0xfffffffff012fe55); // sigsearch+0xc8
src(0xfffffffff012fec9); // mpsinit+0x14
src(0xfffffffff011622a); // main+0x30b
src(0xfffffffff0110204); // ndnr
and see the source.
Also, the ndnr is a jmk-ism: it means "no deposit, no return"
So, let's see; I can't tell if we went over this before.
What are KADDR and KSEG2? They relate to TMFM, another jmk-ism I
believe: Too Many F-ing Megabytes, where too many is "more than 2G".
Why 2G? Well ...
Basically, amd64, like lots of things (risc-v), uses this one simple
trick: if you sign-extend a 32-bit pointer, you get something anchored
either at the top 2G (kernel va) or the low 2G (user code),
i.e. 0x80000000 -> 0xffffffff_80000000. This is convenient: you can
use 32-bit pointers for lots of things, and, since the amd64 is a
pretty half-way 64-bit CPU (lots of 64-bit instructions only
completely work with RAX), this is helpful.
And it works great until you get CPUs with TMFM. Then you need to
split memory up:
physical 0 -> 2 GiB becomes virtual 0xffffffff_80000000
physical 2 GiB and up becomes ... fffffe0000000000
Why fffffe0000000000? The first amd64 only implemented 48 bits of
virtual address, and addresses have to be canonical (bits 63..47 all
the same), so there's this giant hole in the middle and kernel virtual
has to live above it. fffffe0000000000 has its top 23 bits set, which
leaves a 41-bit (2 TiB) window up to the top of the address space.
[I can't find the actual documents on this, I am out of time to look,
so you'll need to fill in my likely errors here.]
The hole is a hardware mandate from opteron land.
OK, so KSEG2 is fffffe0000000000, and that error is saying code
called KADDR with a physical address that is not below KSEG2. The
value it was handed, fffffffffffffc00, is -0x400 as a 64-bit number,
nowhere near any physical memory, so it was garbage going in.
This is a side effect of the real problem: vmx does not provide the
MP table the kernel wants. So you need to fix that, OR start using
qemu for your testing.
I hope I did not mess the details up too much here....
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-M43ec7faffe7371da1b521c7e
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-29 0:02 ` Ron Minnich
@ 2025-01-29 1:37 ` ron minnich
2025-01-29 12:34 ` tlaronde
1 sibling, 0 replies; 15+ messages in thread
From: ron minnich @ 2025-01-29 1:37 UTC (permalink / raw)
To: 9fans
[-- Attachment #1: Type: text/plain, Size: 12885 bytes --]
yeah, there's a reason it is fffffe, and not ffff8, but memory fails me.
See this for details of the address hole:
https://en.wikipedia.org/wiki/X86-64
Charles may remember, but it goes back to the port they did in 2005.
On Tue, Jan 28, 2025 at 5:08 PM Ron Minnich <rminnich@p9f.org> wrote:
> btw, if you
> acid 9pc64
> you can paste this right into acid
> src(0xfffffffff011cdee); // dumpstack+0x10
> src(0xfffffffff013d50f); // panic+0x133
> src(0xfffffffff0116a3b); // KADDR+0x55
> src(0xfffffffff012fe55); // sigsearch+0xc8
> src(0xfffffffff012fec9); // mpsinit+0x14
> src(0xfffffffff011622a); // main+0x30b
> src(0xfffffffff0110204); // ndnr
> and see the source.
>
> Also, the ndnr is a jmk-ism: it means "no deposit, no return"
>
> so, let's see, I can't tell if we went over this before.
> What is KADDR and KADDR2? They relate to TMFM, another jmk-ism I
> believe: Too Many F-ing Megabytes, where too many is "more than 2G" --
> why 2G? well ...
>
> basically, amd64, like lots of things (risc-v) uses this one simple
> trick: if you sign-extend a 32-bit pointer, you get something anchored
> either at the top 2G (kernel va) or the low 2G (user code).
>
> i.e. 0x80000000 -> 0xffffffff_80000000 -> this is convenient. You can
> use 32-bit pointers for lots of things, and, since the amd64 is a
> pretty half-way 64-bit CPU (lots of 64-bit instructions only
> completely work with RAX), this is helpful.
>
> And it works great until you get CPUs with TMFM. Then you need to
> split memory up:
> physical 0 -> 2 GiB becomes virtual 0xffffffff_80000000
> physical 2 GiB and up becomes ... fffffe0000000000
> Why fffffe0000000000? the first amd64 only had something like 41(?)
> bits of virtual address:there's this giant hole in the middle,and
> kernel virtual HAD to start at that address -- 64 bits - whatever gets
> you to 23 bits. [I can't find the actual documents on this, I am out
> of time to look, so you'll need to fill in my likely errors here]
> It's a hardware mandate from opteron land.
>
> OK, so KADDR2 is fffffe0000000000, and that error is saying code
> called KADDR2 with something that's not in KADDR2. That va
> fffffffffffffc00 is in the low 2GiB physical, which is KADDR.
>
> This is a side effect of the real problem: you don't have the table it
> wants. So you need to fix that, OR, start using qemu for your testing.
>
> I hope I did not mess the details up too much here....
>
> On Tue, Jan 28, 2025 at 1:24 PM ron minnich <rminnich@gmail.com> wrote:
> >
> > I'd be happier to remove the mps dependency actually. the mps is long
> dead. But that's a bigger story.
> >
> >
> > On Tue, Jan 28, 2025 at 11:24 AM Paul Lalonde <paul.a.lalonde@gmail.com>
> wrote:
> >>
> >> Ah, that's the code path that sent me to QEMU.
> >> Vmx doesn't have any MP tables, which leads to this fault in mpsinit.
> >> Ron provided this minimal one for me, which I think we could learn from
> to adapt into vmx. The hacky version of pointing the code directly at
> something like this baked in didn't excite me.
> >>
> >> 50 43 4D 50 ; "PCMP"
> >> 00 00 ; Table Length (placeholder)
> >> 04 ; Spec Revision
> >> 00 ; Checksum (placeholder)
> >> 42 4F 43 48 53 43 50 55 ; "BOCHSCPU"
> >> 30 2E 31 20 20 20 20 20 20 20 20 ; "0.1 "
> >> 00 00 00 00 ; OEM Table Pointer
> >> 00 00 ; OEM Table Size
> >> 14 00 ; Entry Count (2 CPUs + 18 = 20, little-endian)
> >> 00 00 E0 FE ; Local APIC Address (0xfee00000)
> >> 00 00 ; Ext Table Length
> >> 00 ; Ext Table Checksum
> >> 00 ; Reserved
> >>
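The two placeholder fields in the header above can be filled in
mechanically. What follows is a minimal sketch in portable user-space C,
not Nix or vmx code; mp_sum and mp_fill_header are hypothetical names.
It relies on the rule from the Intel MultiProcessor Specification that
all bytes of the base table must sum to zero modulo 256, and that the
base table length counts the header plus any entries. It builds a
header-only table (entry count 0) to keep the example short; with
entries appended, the length, the count and the checksum all have to
cover them. One caveat: the specification's header has a 12-byte
product ID, whereas the listing as quoted shows 11 bytes, so padding
with one more space here is my assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MP_HDR_LEN = 44 };	/* size of the MP configuration table header */

/* Sum of all bytes modulo 256; a valid table sums to 0. */
uint8_t
mp_sum(const uint8_t *p, size_t n)
{
	uint8_t s = 0;

	while(n-- > 0)
		s += *p++;
	return s;
}

/* Fill a header-only table (hypothetical helper); offsets follow the spec layout. */
void
mp_fill_header(uint8_t *h)
{
	memset(h, 0, MP_HDR_LEN);
	memcpy(h + 0, "PCMP", 4);	/* signature */
	h[4] = MP_HDR_LEN & 0xff;	/* base table length, little-endian */
	h[5] = MP_HDR_LEN >> 8;
	h[6] = 0x04;			/* spec revision */
	memcpy(h + 8, "BOCHSCPU", 8);	/* OEM ID, 8 bytes */
	memset(h + 16, ' ', 12);	/* product ID, 12 bytes, space padded */
	memcpy(h + 16, "0.1", 3);
	/* offsets 28..35: OEM table pointer, OEM table size, entry count = 0 */
	h[36] = 0x00;			/* local APIC address 0xfee00000, */
	h[37] = 0x00;			/* stored little-endian */
	h[38] = 0xe0;
	h[39] = 0xfe;
	/* offsets 40..43: extended table length, its checksum, reserved = 0 */
	h[7] = (uint8_t)(0x100 - mp_sum(h, MP_HDR_LEN));	/* checksum, computed last */
	assert(mp_sum(h, MP_HDR_LEN) == 0);
}
```

Computing the checksum last, over a buffer whose checksum byte is still
zero, is what makes the final sum come out to zero.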
> >> On Tue, Jan 28, 2025 at 11:15 AM <tlaronde@kergis.com> wrote:
> >>>
> >>> On Tue, Jan 28, 2025 at 09:18:29AM -0800, Paul Lalonde wrote:
> >>> > ktrace can generate a stack for you from that dump. The line starting with
> >>> > "ktrace" is the command line (you might change 9k8cpu to the path to the
> >>> > kernel file if you're not in the directory where you built it).
> >>> > Then the following lines up to but not including the "cpu0: exiting" can be
> >>> > dropped into ktrace's stdin to have it generate a stack trace. You'll need
> >>> > to add the ^d at the end if you're cut-and-pasting.
> >>> >
> >>> > Though it looks like it's just triggering the page fault trap on that
> >>> > 0xfffffffffffffc00 address, which itself looks like a victim of
> >>> > sign-extension. So back up to the fault and find the source of that
> >>> > address?
> >>>
> >>> Yes:
> >>>
> >>> src(0xfffffffff011cdee); // dumpstack+0x10
> >>> src(0xfffffffff013d50f); // panic+0x133
> >>> src(0xfffffffff0116a3b); // KADDR+0x55
> >>> src(0xfffffffff012fe55); // sigsearch+0xc8
> >>> src(0xfffffffff012fec9); // mpsinit+0x14
> >>> src(0xfffffffff011622a); // main+0x30b
> >>> src(0xfffffffff0110204); // ndnr
> >>>
> >>> this doesn't tell me much more than what I knew already: it panics in
> >>> mpsinit, calling KADDR in map.c.
> >>>
> >>> During my next wandering under Nix, I will try to track back from
> >>> where the offending address is taken or with what it is constructed.
> >>>
> >>> >
> >>> > On Tue, Jan 28, 2025 at 9:09 AM <tlaronde@kergis.com> wrote:
> >>> >
> >>> > > On Tue, Jan 28, 2025 at 07:49:02AM -0800, Paul Lalonde wrote:
> >>> > > > Do you have a stack for the assert, from the ktrace?
> >>> > > >
> >>> > >
> >>> > > Yes, and I was wrong: it fails relatively "late" in main.c: at
> >>> > > mpsinit.
> >>> > >
> >>> > > Here is the info (I added a bunch of print() before each function call
> >>> > > to know where it stumbled upon an incorrect address):
> >>> > >
> >>> > > term% nix/test_vmx
> >>> > >
> >>> > > NIX
> >>> > > mmunit...mmuinit: vmstart 0xfffffffff0000000 vmunused 0xfffffffff023d000
> >>> > > vmunmapped 0xfffffffff0400000 vmend 0xfffffffff4000000
> >>> > > sys->pd 0x108003 0x108023
> >>> > > cpu0: mmu l3 pte 0xfffffffff0106ff8 = 107023
> >>> > > cpu0: mmu l2 pte 0xfffffffff0107ff8 = 108023
> >>> > > cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
> >>> > > cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
> >>> > > ioinit... multibootmemassert... kbdinit... meminit...asm: addr
> >>> > > 0x0000000004000000 end 0x0000000004000000 type 1 size 0
> >>> > > cm 0: addr 0x4000000 npage 0
> >>> > > 0 0 0
> >>> > > npage 0 upage 0 kpage 16384
> >>> > > confinit... archinit... mallocinit...base 0xfffffffff023d000 ptr
> >>> > > 0xfffffffff023d000 nunits 4047617
> >>> > > acpiinit... umeminit... trapinit... printinit... i8259init... procinit...
> >>> > > mpsinit...panic: cpu0: map.c:KADDR() passed addr fffffffffffffc00 >=
> >>> > > fffffe0000000000
> >>> > > panic: cpu0: map.c:KADDR() passed addr fffffffffffffc00 >= fffffe0000000000
> >>> > >
> >>> > > dumpstack
> >>> > > ktrace 9k8cpu 0xfffffffff011cdee 0xfffffffff0105d58
> >>> > > estackx 0xfffffffff0106000
> >>> > > 0xfffffffff0105c70=0xfffffffff0105da8
> >>> > > 0xfffffffff0105c78=0xfffffffff011cb91
> >>> > > 0xfffffffff0105c80=0xfffffffff0105c98
> >>> > > 0xfffffffff0105c98=0xfffffffff013cff7
> >>> > > 0xfffffffff0105cb0=0xfffffffff0105cd0
> >>> > > 0xfffffffff0105cc0=0xfffffffff0105ea7
> >>> > > 0xfffffffff0105cc8=0xfffffffff0105df3
> >>> > > 0xfffffffff0105ce0=0xfffffffff013d14d
> >>> > > 0xfffffffff0105d08=0xfffffffff0105d90
> >>> > > 0xfffffffff0105d28=0xfffffffff011cdee
> >>> > > 0xfffffffff0105d30=0xfffffffff0105da8
> >>> > > 0xfffffffff0105d40=0xfffffffff0105d58
> >>> > > 0xfffffffff0105d48=0xfffffffff0105da8
> >>> > > 0xfffffffff0105d50=0xfffffffff011cdee
> >>> > > 0xfffffffff0105d58=0xfffffffff011cb99
> >>> > > 0xfffffffff0105d68=0xfffffffff013d50f
> >>> > > 0xfffffffff0105d88=0xfffffffff0105ed0
> >>> > > 0xfffffffff0105d90=0xfffffffff013cff7
> >>> > > 0xfffffffff0105d98=0xfffffffff0105db5
> >>> > > 0xfffffffff0105e08=0xfffffffff013d1b8
> >>> > > 0xfffffffff0105e10=0xfffffffff0105e00
> >>> > > 0xfffffffff0105e20=0xfffffffff0105ea3
> >>> > > 0xfffffffff0105e28=0xfffffffff0105e98
> >>> > > 0xfffffffff0105e38=0xfffffffff013d1b8
> >>> > > 0xfffffffff0105e40=0xfffffffff0105e98
> >>> > > 0xfffffffff0105e60=0xfffffffff013d217
> >>> > > 0xfffffffff0105e68=0xfffffffff015d9c9
> >>> > > 0xfffffffff0105e80=0xfffffffff0105fb8
> >>> > > 0xfffffffff0105e90=0xfffffffff015d5d9
> >>> > > 0xfffffffff0105ea8=0xfffffffff0105ed0
> >>> > > 0xfffffffff0105ec0=0xfffffffff0116a3b
> >>> > > 0xfffffffff0105ef8=0xfffffffff012fe55
> >>> > > 0xfffffffff0105f08=0xfffffffff01a1afa
> >>> > > 0xfffffffff0105f10=0x0000000000000004
> >>> > > 0xfffffffff0105f18=0x0000000000000046
> >>> > > 0xfffffffff0105f20=0xfffffffff00fffd9
> >>> > > 0xfffffffff0105f28=0x0000000000000006
> >>> > > 0xfffffffff0105f30=0xfffffffff015d5d9
> >>> > > 0xfffffffff0105f38=0xfffffffff0000400
> >>> > > 0xfffffffff0105f40=0x0000000000000000
> >>> > > 0xfffffffff0105f48=0xfffffffff012fec9
> >>> > > 0xfffffffff0105f50=0xfffffffff01a1aff
> >>> > > 0xfffffffff0105f58=0x0000000000000208
> >>> > > 0xfffffffff0105f60=0x0000000000000124
> >>> > > 0xfffffffff0105f68=0xfffffffff01149d0
> >>> > > 0xfffffffff0105f70=0x0000000000000006
> >>> > > 0xfffffffff0105f78=0xfffffffff0114ba7
> >>> > > 0xfffffffff0105f80=0xfffffffff0227510
> >>> > > 0xfffffffff0105f88=0xffffffff00000000
> >>> > > 0xfffffffff0105f90=0x0000000000000000
> >>> > > 0xfffffffff0105f98=0xfffffffff0105fb8
> >>> > > 0xfffffffff0105fa0=0x0000000bf0116b0d
> >>> > > 0xfffffffff0105fa8=0xfffffffff011622a
> >>> > > 0xfffffffff0105fb0=0xffffffff00000400
> >>> > > 0xfffffffff0105fb8=0xffffffff00000000
> >>> > > 0xfffffffff0105fc0=0x0000000000000000
> >>> > > 0xfffffffff0105fc8=0x0000000000000000
> >>> > > 0xfffffffff0105fd0=0x0000000000000000
> >>> > > 0xfffffffff0105fd8=0x0000000000000000
> >>> > > 0xfffffffff0105fe0=0x0000000000000000
> >>> > > 0xfffffffff0105fe8=0xfffffffff0110204
> >>> > > 0xfffffffff0105ff0=0x000000002badb002
> >>> > > 0xfffffffff0105ff8=0x000000000023b000
> >>> > > cpu0: exiting
> >>> > >
> >>> > > >
> >>> > > >
> >>> > > > On Tue, Jan 28, 2025 at 6:09 AM <tlaronde@kergis.com> wrote:
> >>> > > >
> >>> > > > > After fixing problems leading to compiler warnings---legitimate
> >>> > > > > warnings, but even the too short binary negated unsigned 32bits values
> >>> > > > > promoted to 64 bits with leading bits hence 0 as mask were harmless---
> >>> > > > > now I want to look at the stumbling block.
> >>> > > > >
> >>> > > > > For me, under vmx, this is the assert in map.c:17:
> >>> > > > >
> >>> > > > > assert(pa < KSEG2);
> >>> > > > >
> >>> > > > > that triggers, and it should come from a call from multiboot.
> >>> > > > >
> >>> > > > > My first reflex is to start adding printf() instructions to track the
> >>> > > > > problem, but is there a better way when dealing with the kernel?
> >>> > > > >
> >>> > > > > Second question: since, if I'm not mistaken, 9front doesn't use
> >>> > > > > multiboot, is vmx usable (i.e. agnostic about) with the multiboot
> >>> > > > > stuff?
> >>> > > > > The embedded boot stuff should handle the thing by itself without load
> >>> > > > > addresses having to be adjusted because of vmx?
> >>> > > > > --
> >>> > > > > Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> >>> > > > > http://www.kergis.com/
> >>> > > > > http://kertex.kergis.com/
> >>> > > > > Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
> >>> > >
> >>> > > --
> >>> > > Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> >>> > > http://www.kergis.com/
> >>> > > http://kertex.kergis.com/
> >>> > > Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
> >>>
> >>> --
> >>> Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> >>> http://www.kergis.com/
> >>> http://kertex.kergis.com/
> >>> Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
> >
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-Meb14903b2db62f531083c7ae
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
[-- Attachment #2: Type: text/html, Size: 20846 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-29 0:02 ` Ron Minnich
2025-01-29 1:37 ` ron minnich
@ 2025-01-29 12:34 ` tlaronde
2025-01-29 15:14 ` ron minnich
1 sibling, 1 reply; 15+ messages in thread
From: tlaronde @ 2025-01-29 12:34 UTC (permalink / raw)
To: 9fans
It seems to me that the best course, for now, is the following:
1) [Using qemu or booting the kernel on bare metal] Try to fix Nix,
in its present state, so that it runs, with 9front, for
objtype==amd64;
2) Once 1) is achieved, start cleaning up (at that point the mps
stuff could be revised) and reorganizing the code to clearly separate
the Machine Independent (M.I.) and Machine Dependent (M.D.) parts, so
that porting Nix to other archs becomes possible.
And concurrently, during both steps, document...
On Tue, Jan 28, 2025 at 04:02:11PM -0800, Ron Minnich wrote:
> btw, if you
> acid 9pc64
> you can paste this right into acid
> src(0xfffffffff011cdee); // dumpstack+0x10
> src(0xfffffffff013d50f); // panic+0x133
> src(0xfffffffff0116a3b); // KADDR+0x55
> src(0xfffffffff012fe55); // sigsearch+0xc8
> src(0xfffffffff012fec9); // mpsinit+0x14
> src(0xfffffffff011622a); // main+0x30b
> src(0xfffffffff0110204); // ndnr
> and see the source.
>
> Also, the ndnr is a jmk-ism: it means "no deposit, no return"
>
> so, let's see, I can't tell if we went over this before.
> What is KADDR and KADDR2? They relate to TMFM, another jmk-ism I
> believe: Too Many F-ing Megabytes, where too many is "more than 2G" --
> why 2G? well ...
>
> basically, amd64, like lots of things (risc-v) uses this one simple
> trick: if you sign-extend a 32-bit pointer, you get something anchored
> either at the top 2G (kernel va) or the low 2G (user code).
>
> i.e. 0x80000000 -> 0xffffffff_80000000 -> this is convenient. You can
> use 32-bit pointers for lots of things, and, since the amd64 is a
> pretty half-way 64-bit CPU (lots of 64-bit instructions only
> completely work with RAX), this is helpful.
>
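To make the trick concrete, here is a minimal sketch in portable
user-space C (not kernel code; sext32 is a hypothetical helper). It
widens a 32-bit pattern the way a sign-extending load does. The first
value checked is the example from the text. The second, 0xfffffc00, is
-1024 as a signed 32-bit integer, and it widens to exactly the address
printed in the panic, which fits the remark that the address looks like
a victim of sign-extension. Where that 32-bit value comes from is not
established in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a 32-bit pattern to 64 bits, replicating bit 31 into bits 32..63. */
uint64_t
sext32(uint32_t v)
{
	if(v & 0x80000000u)
		return 0xffffffff00000000ull | v;
	return v;
}

/* The values discussed in the thread. */
void
sext32_examples(void)
{
	assert(sext32(0x80000000u) == 0xffffffff80000000ull);	/* top 2G */
	assert(sext32(0x7fffffffu) == 0x000000007fffffffull);	/* low 2G */
	assert(sext32(0xfffffc00u) == 0xfffffffffffffc00ull);	/* -1024 */
}
```

The explicit mask avoids relying on how a given compiler converts an
out-of-range unsigned value to a signed type.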
> And it works great until you get CPUs with TMFM. Then you need to
> split memory up:
> physical 0 -> 2 GiB becomes virtual 0xffffffff_80000000
> physical 2 GiB and up becomes ... fffffe0000000000
> Why fffffe0000000000? The first amd64 only had something like 41(?)
> bits of virtual address: there's this giant hole in the middle, and
> kernel virtual HAD to start at that address -- 64 bits - whatever gets
> you to 23 bits. [I can't find the actual documents on this, I am out
> of time to look, so you'll need to fill in my likely errors here]
> It's a hardware mandate from opteron land.
>
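The "giant hole" can be stated precisely. On amd64 a virtual address is
canonical when every bit above the implemented width is a copy of the
top implemented bit, so usable addresses sit at the very bottom and the
very top of the 64-bit space. A minimal sketch in portable C follows,
with hypothetical helper names; the width is a parameter, not a claim
about any particular CPU. It also checks the arithmetic quoted above:
fffffe0000000000 is 2^64 - 2^41, that is, 23 one bits followed by 41
zero bits.

```c
#include <assert.h>
#include <stdint.h>

/* True if va is canonical for `bits` implemented address bits (1..64). */
int
is_canonical(uint64_t va, unsigned bits)
{
	uint64_t upper = va >> (bits - 1);	/* top implemented bit and above */
	uint64_t allones = UINT64_MAX >> (bits - 1);

	return upper == 0 || upper == allones;
}

/* Lowest address of the upper canonical region: 2^64 - 2^(bits-1). */
uint64_t
upper_half_base(unsigned bits)
{
	return UINT64_MAX << (bits - 1);
}

/* Number of consecutive one bits at the top of va. */
unsigned
leading_ones(uint64_t va)
{
	unsigned n = 0;

	while(va & (1ull << 63)){
		n++;
		va <<= 1;
	}
	return n;
}

void
canonical_examples(void)
{
	assert(upper_half_base(48) == 0xffff800000000000ull);
	assert(is_canonical(0xfffffe0000000000ull, 48));
	assert(!is_canonical(0x0000800000000000ull, 48));	/* inside the hole */
	assert(leading_ones(0xfffffe0000000000ull) == 23);
	assert((UINT64_MAX << 41) == 0xfffffe0000000000ull);
}
```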
> OK, so KADDR2 is fffffe0000000000, and that error is saying code
> called KADDR2 with something that's not in KADDR2. That va
> fffffffffffffc00 is in the low 2GiB physical, which is KADDR.
>
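Earlier in this thread Ron suggested replacing the bare assert with a
check that reports both values. A minimal user-space sketch of that
idea, in portable C: kaddr_check is hypothetical and is not the code in
map.c; the KSEG2 value is the bound printed in the panic, and the
message format merely imitates that panic line.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define KSEG2 0xfffffe0000000000ull	/* bound printed in the panic message */

/*
 * Return 0 if pa is below KSEG2. Otherwise write a diagnostic naming
 * both values into msg and return -1; a kernel would panic() instead.
 */
int
kaddr_check(uint64_t pa, char *msg, size_t len)
{
	if(pa < KSEG2)
		return 0;
	snprintf(msg, len, "KADDR() passed addr %016" PRIx64 " >= %016" PRIx64,
		pa, (uint64_t)KSEG2);
	return -1;
}

void
kaddr_check_example(void)
{
	char msg[96];

	assert(kaddr_check(0x4000000ull, msg, sizeof msg) == 0);
	assert(kaddr_check(0xfffffffffffffc00ull, msg, sizeof msg) == -1);
	assert(strstr(msg, "fffffffffffffc00") != NULL);
}
```

Compared with assert(pa < KSEG2), the failing value and the bound both
end up in the output, so the first run already tells you what was
passed.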
> This is a side effect of the real problem: you don't have the table it
> wants. So you need to fix that, OR, start using qemu for your testing.
>
> I hope I did not mess the details up too much here....
--
Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-Mddf2ea4ad8ff4832d89fde1e
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-29 12:34 ` tlaronde
@ 2025-01-29 15:14 ` ron minnich
2025-01-29 17:37 ` tlaronde
0 siblings, 1 reply; 15+ messages in thread
From: ron minnich @ 2025-01-29 15:14 UTC (permalink / raw)
To: 9fans
[-- Attachment #1: Type: text/plain, Size: 14484 bytes --]
I agree with (1): vmx is not ready for NIX yet.
As for (2), I'd propose a slight change: the goal is to remove NIX as an
independent entity and subsume it into other Plan 9 kernels (I prefer 9front
at this point).
Big picture, NIX could be a build option for 9front, or a branch in 9front;
something like that.
It makes no sense to keep NIX as its own thing; so much has improved in
Plan 9 in 14 years. I looked into it, and it would turn into a lot of
duplicate work, to no good effect.
On Wed, Jan 29, 2025 at 5:51 AM <tlaronde@kergis.com> wrote:
> It seems to me that the best course, for now, is the following:
>
> 1) [Using qemu or booting the kernel on bare metal] Try to fix Nix,
> in its present state, so that it runs, with 9front, for
> objtype==amd64;
>
> 2) Once 1) is achieved, start cleaning up (at that point the mps
> stuff could be revised) and reorganizing the code to clearly separate
> the Machine Independent (M.I.) and Machine Dependent (M.D.) parts, so
> that porting Nix to other archs becomes possible.
>
> And concurrently, during both steps, document...
>
> On Tue, Jan 28, 2025 at 04:02:11PM -0800, Ron Minnich wrote:
> > btw, if you
> > acid 9pc64
> > you can paste this right into acid
> > src(0xfffffffff011cdee); // dumpstack+0x10
> > src(0xfffffffff013d50f); // panic+0x133
> > src(0xfffffffff0116a3b); // KADDR+0x55
> > src(0xfffffffff012fe55); // sigsearch+0xc8
> > src(0xfffffffff012fec9); // mpsinit+0x14
> > src(0xfffffffff011622a); // main+0x30b
> > src(0xfffffffff0110204); // ndnr
> > and see the source.
> >
> > Also, the ndnr is a jmk-ism: it means "no deposit, no return"
> >
> > so, let's see, I can't tell if we went over this before.
> > What is KADDR and KADDR2? They relate to TMFM, another jmk-ism I
> > believe: Too Many F-ing Megabytes, where too many is "more than 2G" --
> > why 2G? well ...
> >
> > basically, amd64, like lots of things (risc-v) uses this one simple
> > trick: if you sign-extend a 32-bit pointer, you get something anchored
> > either at the top 2G (kernel va) or the low 2G (user code).
> >
> > i.e. 0x80000000 -> 0xffffffff_80000000 -> this is convenient. You can
> > use 32-bit pointers for lots of things, and, since the amd64 is a
> > pretty half-way 64-bit CPU (lots of 64-bit instructions only
> > completely work with RAX), this is helpful.
> >
> > And it works great until you get CPUs with TMFM. Then you need to
> > split memory up:
> > physical 0 -> 2 Gb becomes virtual 0xfffffff_80000000
> > physical 2Gb and up becomes ... fffffe0000000000
> > Why fffffe0000000000? the first amd64 only had something like 41(?)
> > bits of virtual address:there's this giant hole in the middle,and
> > kernel virtual HAD to start at that address -- 64 bits - whatever gets
> > you to 23 bits. [I can't find the actual documents on this, I am out
> > of time to look, so you'll need to fill in my likely errors here]
> > It's a hardware mandate from opteron land.
> >
> > OK, so KADDR2 is fffffe0000000000, and that error is saying code
> > called KADDR2 with something that's not in KADDR2. That va
> > fffffffffffffc00 is in the low 2GiB physical, which is KADDR.
> >
> > This is a side effect of the real problem: you don't have the table it
> > wants. So you need to fix that, OR, start using qemu for your testing.
> >
> > I hope I did not mess the details up too much here....
> >
> > On Tue, Jan 28, 2025 at 1:24?PM ron minnich <rminnich@gmail.com> wrote:
> > >
> > > I'd be happier to remove the mps dependency actually. the mps is long
> dead. But that's a bigger story.
> > >
> > >
> > > On Tue, Jan 28, 2025 at 11:24?AM Paul Lalonde <
> paul.a.lalonde@gmail.com> wrote:
> > >>
> > >> Ah, that's the code path that sent me to QEMU.
> > >> Vmx doesn't have any MP tables, which leads to this fault in mpsinit.
> > >> Ron provided this minimal one for me, which I think we could learn
> from to adapt into vmx. The hacky version of pointing the code directly at
> something like this baked in didn't excite me.
> > >>
> > >> 50 43 4D 50 ; "PCMP"
> > >> 00 00 ; Table Length (placeholder)
> > >> 04 ; Spec Revision
> > >> 00 ; Checksum (placeholder)
> > >> 42 4F 43 48 53 43 50 55 ; "BOCHSCPU"
> > >> 30 2E 31 20 20 20 20 20 20 20 20 ; "0.1 "
> > >> 00 00 00 00 ; OEM Table Pointer
> > >> 00 00 ; OEM Table Size
> > >> 14 00 ; Entry Count (2 CPUs + 18 = 20,
> little-endian)
> > >> 00 00 E0 FE ; Local APIC Address (0xfee00000)
> > >> 00 00 ; Ext Table Length
> > >> 00 ; Ext Table Checksum
> > >> 00 ; Reserved
> > >>
> > >> On Tue, Jan 28, 2025 at 11:15?AM <tlaronde@kergis.com> wrote:
> > >>>
> > >>> On Tue, Jan 28, 2025 at 09:18:29AM -0800, Paul Lalonde wrote:
> > >>> > ktrace can generate a stack for you from that dump. The line
> starting with
> > >>> > "ktrace" is the command line (you might change 9k8cpu to the path
> to the
> > >>> > kernel file in you're not in the directory where you built it).
> > >>> > Then the following lines up to but not including the "cpu0:
> exiting" can be
> > >>> > dropped into ktrace's stdin to have it generate a stack trace.
> You'll need
> > >>> > to add the ^d at the end if you're cut-and-pasting.
> > >>> >
> > >>> > Though it looks like it's just triggering the page fault trap on
> that
> > >>> > 0xfffffffffffffc00 address, which itself looks like a victim of
> > >>> > sign-extension. So back up to the fault and find the source of
> that
> > >>> > address?
> > >>>
> > >>> Yes:
> > >>>
> > >>> src(0xfffffffff011cdee); // dumpstack+0x10
> > >>> src(0xfffffffff013d50f); // panic+0x133
> > >>> src(0xfffffffff0116a3b); // KADDR+0x55
> > >>> src(0xfffffffff012fe55); // sigsearch+0xc8
> > >>> src(0xfffffffff012fec9); // mpsinit+0x14
> > >>> src(0xfffffffff011622a); // main+0x30b
> > >>> src(0xfffffffff0110204); // ndnr
> > >>>
> > >>> this doesn't tell me much more than what I knew already: it panics in
> > >>> mpsinit, calling KADDR in map.c.
> > >>>
> > >>> During my next wandering under Nix, I will try to track back from
> > >>> where the offending address is taken or with what it is constructed.
> > >>>
> > >>> >
> > >>> > On Tue, Jan 28, 2025 at 9:09?AM <tlaronde@kergis.com> wrote:
> > >>> >
> > >>> > > On Tue, Jan 28, 2025 at 07:49:02AM -0800, Paul Lalonde wrote:
> > >>> > > > Do you have a stack for the assert, from the ktrace?
> > >>> > > >
> > >>> > >
> > >>> > > Yes, and I was wrong: it fails relatively "late" in main.c: at
> > >>> > > mpsinit.
> > >>> > >
> > >>> > > Here is the info (I added a bunch of print() before each
> function call
> > >>> > > to know where it stumbled upon an incorrect address):
> > >>> > >
> > >>> > > term% nix/test_vmx
> > >>> > >
> > >>> > > NIX
> > >>> > > mmunit...mmuinit: vmstart 0xfffffffff0000000 vmunused
> 0xfffffffff023d000
> > >>> > > vmunmapped 0xfffffffff0400000 vmend 0xfffffffff4000000
> > >>> > > sys->pd 0x108003 0x108023
> > >>> > > cpu0: mmu l3 pte 0xfffffffff0106ff8 = 107023
> > >>> > > cpu0: mmu l2 pte 0xfffffffff0107ff8 = 108023
> > >>> > > cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
> > >>> > > cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
> > >>> > > ioinit... multibootmemassert... kbdinit... meminit...asm: addr
> > >>> > > 0x0000000004000000 end 0x0000000004000000 type 1 size 0
> > >>> > > cm 0: addr 0x4000000 npage 0
> > >>> > > 0 0 0
> > >>> > > npage 0 upage 0 kpage 16384
> > >>> > > confinit... archinit... mallocinit...base 0xfffffffff023d000 ptr
> > >>> > > 0xfffffffff023d000 nunits 4047617
> > >>> > > acpiinit... umeminit... trapinit... printinit... i8259init...
> procinit...
> > >>> > > mpsinit...panic: cpu0: map.c:KADDR() passed addr
> fffffffffffffc00 >=
> > >>> > > fffffe0000000000
> > >>> > > panic: cpu0: map.c:KADDR() passed addr fffffffffffffc00 >=
> fffffe0000000000
> > >>> > >
> > >>> > > dumpstack
> > >>> > > ktrace 9k8cpu 0xfffffffff011cdee 0xfffffffff0105d58
> > >>> > > estackx 0xfffffffff0106000
> > >>> > > 0xfffffffff0105c70=0xfffffffff0105da8
> > >>> > > 0xfffffffff0105c78=0xfffffffff011cb91
> > >>> > > 0xfffffffff0105c80=0xfffffffff0105c98
> > >>> > > 0xfffffffff0105c98=0xfffffffff013cff7
> > >>> > > 0xfffffffff0105cb0=0xfffffffff0105cd0
> > >>> > > 0xfffffffff0105cc0=0xfffffffff0105ea7
> > >>> > > 0xfffffffff0105cc8=0xfffffffff0105df3
> > >>> > > 0xfffffffff0105ce0=0xfffffffff013d14d
> > >>> > > 0xfffffffff0105d08=0xfffffffff0105d90
> > >>> > > 0xfffffffff0105d28=0xfffffffff011cdee
> > >>> > > 0xfffffffff0105d30=0xfffffffff0105da8
> > >>> > > 0xfffffffff0105d40=0xfffffffff0105d58
> > >>> > > 0xfffffffff0105d48=0xfffffffff0105da8
> > >>> > > 0xfffffffff0105d50=0xfffffffff011cdee
> > >>> > > 0xfffffffff0105d58=0xfffffffff011cb99
> > >>> > > 0xfffffffff0105d68=0xfffffffff013d50f
> > >>> > > 0xfffffffff0105d88=0xfffffffff0105ed0
> > >>> > > 0xfffffffff0105d90=0xfffffffff013cff7
> > >>> > > 0xfffffffff0105d98=0xfffffffff0105db5
> > >>> > > 0xfffffffff0105e08=0xfffffffff013d1b8
> > >>> > > 0xfffffffff0105e10=0xfffffffff0105e00
> > >>> > > 0xfffffffff0105e20=0xfffffffff0105ea3
> > >>> > > 0xfffffffff0105e28=0xfffffffff0105e98
> > >>> > > 0xfffffffff0105e38=0xfffffffff013d1b8
> > >>> > > 0xfffffffff0105e40=0xfffffffff0105e98
> > >>> > > 0xfffffffff0105e60=0xfffffffff013d217
> > >>> > > 0xfffffffff0105e68=0xfffffffff015d9c9
> > >>> > > 0xfffffffff0105e80=0xfffffffff0105fb8
> > >>> > > 0xfffffffff0105e90=0xfffffffff015d5d9
> > >>> > > 0xfffffffff0105ea8=0xfffffffff0105ed0
> > >>> > > 0xfffffffff0105ec0=0xfffffffff0116a3b
> > >>> > > 0xfffffffff0105ef8=0xfffffffff012fe55
> > >>> > > 0xfffffffff0105f08=0xfffffffff01a1afa
> > >>> > > 0xfffffffff0105f10=0x0000000000000004
> > >>> > > 0xfffffffff0105f18=0x0000000000000046
> > >>> > > 0xfffffffff0105f20=0xfffffffff00fffd9
> > >>> > > 0xfffffffff0105f28=0x0000000000000006
> > >>> > > 0xfffffffff0105f30=0xfffffffff015d5d9
> > >>> > > 0xfffffffff0105f38=0xfffffffff0000400
> > >>> > > 0xfffffffff0105f40=0x0000000000000000
> > >>> > > 0xfffffffff0105f48=0xfffffffff012fec9
> > >>> > > 0xfffffffff0105f50=0xfffffffff01a1aff
> > >>> > > 0xfffffffff0105f58=0x0000000000000208
> > >>> > > 0xfffffffff0105f60=0x0000000000000124
> > >>> > > 0xfffffffff0105f68=0xfffffffff01149d0
> > >>> > > 0xfffffffff0105f70=0x0000000000000006
> > >>> > > 0xfffffffff0105f78=0xfffffffff0114ba7
> > >>> > > 0xfffffffff0105f80=0xfffffffff0227510
> > >>> > > 0xfffffffff0105f88=0xffffffff00000000
> > >>> > > 0xfffffffff0105f90=0x0000000000000000
> > >>> > > 0xfffffffff0105f98=0xfffffffff0105fb8
> > >>> > > 0xfffffffff0105fa0=0x0000000bf0116b0d
> > >>> > > 0xfffffffff0105fa8=0xfffffffff011622a
> > >>> > > 0xfffffffff0105fb0=0xffffffff00000400
> > >>> > > 0xfffffffff0105fb8=0xffffffff00000000
> > >>> > > 0xfffffffff0105fc0=0x0000000000000000
> > >>> > > 0xfffffffff0105fc8=0x0000000000000000
> > >>> > > 0xfffffffff0105fd0=0x0000000000000000
> > >>> > > 0xfffffffff0105fd8=0x0000000000000000
> > >>> > > 0xfffffffff0105fe0=0x0000000000000000
> > >>> > > 0xfffffffff0105fe8=0xfffffffff0110204
> > >>> > > 0xfffffffff0105ff0=0x000000002badb002
> > >>> > > 0xfffffffff0105ff8=0x000000000023b000
> > >>> > > cpu0: exiting
> > >>> > >
> > >>> > > >
> > >>> > > >
> > >>> > > > On Tue, Jan 28, 2025 at 6:09 AM <tlaronde@kergis.com> wrote:
> > >>> > > >
> > >>> > > > > After fixing problems leading to compiler
> warnings---legitimate
> > >>> > > > > warnings, but even the too short binary negated unsigned
> 32bits values
> > >>> > > > > promoted to 64 bits with leading bits hence 0 as mask were
> harmless---
> > >>> > > > > now I want to look at the stumbling block.
> > >>> > > > >
> > >>> > > > > For me, under vmx, this is the assert in map.c:17:
> > >>> > > > >
> > >>> > > > > assert(pa < KSEG2);
> > >>> > > > >
> > >>> > > > > that triggers, and it should come from a call from multiboot.
> > >>> > > > >
> > >>> > > > > My first reflex is to start adding printf() instructions to
> track the
> > >>> > > > > problem, but is there a better way when dealing with the
> kernel?
> > >>> > > > >
> > >>> > > > > Second question: since, if I'm not mistaken, 9front doesn't
> use
> > >>> > > > > multiboot, is vmx usable (i.e. agnostic about) with the
> multiboot
> > >>> > > stuff?
> > >>> > > > > The embedded boot stuff should handle the thing by itself
> without load
> > >>> > > > > addresses having to be adjusted because of vmx?
> > >>> > > > > --
> > >>> > > > > Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> > >>> > > > > http://www.kergis.com/
> > >>> > > > > http://kertex.kergis.com/
> > >>> > > > > Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95
> 6006 F40C
> > >>> > >
> > >>> > > --
> > >>> > > Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> > >>> > > http://www.kergis.com/
> > >>> > > http://kertex.kergis.com/
> > >>> > > Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006
> F40C
> > >>>
> > >>> --
> > >>> Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> > >>> http://www.kergis.com/
> > >>> http://kertex.kergis.com/
> > >>> Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
> > >
>
> --
> Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> http://www.kergis.com/
> http://kertex.kergis.com/
> Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-Mecbe892c7c26bc685f4e5f37
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-29 15:14 ` ron minnich
@ 2025-01-29 17:37 ` tlaronde
2025-01-30 3:03 ` ron minnich
0 siblings, 1 reply; 15+ messages in thread
From: tlaronde @ 2025-01-29 17:37 UTC (permalink / raw)
To: 9fans
On Wed, Jan 29, 2025 at 07:14:10AM -0800, ron minnich wrote:
> I agree with (1), vmx is not ready for nix yet.
> as for (2), I'll take a slight change: the goal is to remove nix as an
> independent entity, and subsume it in other plan 9 kernels (I prefer 9front
> at this point)
>
> Big picture, NIX could be a build option for 9front, or a branch in 9front;
> something like that.
From a cursory look, I agree that there is a large overlap. But Nix
also introduces some additional things, so integrating it would
mean, it seems, allowing conditional compilation. And I fear some will
argue that this will lead to spaghetti code.
This is why I asked whether someone had already written a "patcher file
server": during an interim period, files that can and should be largely
shared could then be used without copying them, simply by applying a
diff against them. This would also show what the differences are, and
that they are minimal (for devices in pc or port).
But since 1) is still to be achieved, people have some time to think
about what they want, or will accept, or not.
T. Laronde
>
> It makes no sense to keep NIX as its own thing, so much has improved in
> plan 9 in 14 years. I looked into it and it would turn into a lot of
> duplicate work, to no good effect.
>
>
> On Wed, Jan 29, 2025 at 5:51 AM <tlaronde@kergis.com> wrote:
>
> > It seems to me that the best course, for now, is the following:
> >
> > 1) [Using qemu or booting the kernel on baremetal] Try and correct
> > Nix, in its present state, to achieve a running Nix, with 9front, for
> > objtype==amd64;
> >
> > 2) Once 1) is achieved, start cleaning (then, at this moment, the mps
> > stuff could be revised) and reorganizing code to clearly segregate
> > Machine Independent (M.I.) and Machine Dependent (M.D.), so that
> > porting Nix to other archs be possible.
> >
> > And concurrently, during either step, document...
> >
> > On Tue, Jan 28, 2025 at 04:02:11PM -0800, Ron Minnich wrote:
> > > btw, if you
> > > acid 9pc64
> > > you can paste this right into acid
> > > src(0xfffffffff011cdee); // dumpstack+0x10
> > > src(0xfffffffff013d50f); // panic+0x133
> > > src(0xfffffffff0116a3b); // KADDR+0x55
> > > src(0xfffffffff012fe55); // sigsearch+0xc8
> > > src(0xfffffffff012fec9); // mpsinit+0x14
> > > src(0xfffffffff011622a); // main+0x30b
> > > src(0xfffffffff0110204); // ndnr
> > > and see the source.
> > >
> > > Also, the ndnr is a jmk-ism: it means "no deposit, no return"
> > >
> > > so, let's see, I can't tell if we went over this before.
> > > What is KADDR and KADDR2? They relate to TMFM, another jmk-ism I
> > > believe: Too Many F-ing Megabytes, where too many is "more than 2G" --
> > > why 2G? well ...
> > >
> > > basically, amd64, like lots of things (risc-v) uses this one simple
> > > trick: if you sign-extend a 32-bit pointer, you get something anchored
> > > either at the top 2G (kernel va) or the low 2G (user code).
> > >
> > > i.e. 0x80000000 -> 0xffffffff_80000000 -> this is convenient. You can
> > > use 32-bit pointers for lots of things, and, since the amd64 is a
> > > pretty half-way 64-bit CPU (lots of 64-bit instructions only
> > > completely work with RAX), this is helpful.
> > >
> > > And it works great until you get CPUs with TMFM. Then you need to
> > > split memory up:
> > > physical 0 -> 2 GiB becomes virtual 0xffffffff_80000000
> > > physical 2 GiB and up becomes ... fffffe0000000000
> > > Why fffffe0000000000? The first amd64 only had something like 41(?)
> > > bits of virtual address: there's this giant hole in the middle, and
> > > kernel virtual HAD to start at that address -- 64 bits - whatever gets
> > > you to 23 bits. [I can't find the actual documents on this, I am out
> > > of time to look, so you'll need to fill in my likely errors here]
> > > It's a hardware mandate from opteron land.
> > >
> > > OK, so KADDR2 is fffffe0000000000, and that error is saying code
> > > called KADDR2 with something that's not in KADDR2. That va
> > > fffffffffffffc00 is in the low 2GiB physical, which is KADDR.
> > >
> > > This is a side effect of the real problem: you don't have the table it
> > > wants. So you need to fix that, OR, start using qemu for your testing.
> > >
> > > I hope I did not mess the details up too much here....
> > >
> > > On Tue, Jan 28, 2025 at 1:24 PM ron minnich <rminnich@gmail.com> wrote:
> > > >
> > > > I'd be happier to remove the mps dependency actually. the mps is long
> > dead. But that's a bigger story.
> > > >
> > > >
> > > > On Tue, Jan 28, 2025 at 11:24 AM Paul Lalonde <paul.a.lalonde@gmail.com> wrote:
> > > >>
> > > >> Ah, that's the code path that sent me to QEMU.
> > > >> Vmx doesn't have any MP tables, which leads to this fault in mpsinit.
> > > >> Ron provided this minimal one for me, which I think we could learn
> > from to adapt into vmx. The hacky version of pointing the code directly at
> > something like this baked in didn't excite me.
> > > >>
> > > >> 50 43 4D 50                       ; "PCMP"
> > > >> 00 00                             ; Table Length (placeholder)
> > > >> 04                                ; Spec Revision
> > > >> 00                                ; Checksum (placeholder)
> > > >> 42 4F 43 48 53 43 50 55           ; "BOCHSCPU"
> > > >> 30 2E 31 20 20 20 20 20 20 20 20  ; "0.1 "
> > > >> 00 00 00 00                       ; OEM Table Pointer
> > > >> 00 00                             ; OEM Table Size
> > > >> 14 00                             ; Entry Count (2 CPUs + 18 = 20, little-endian)
> > > >> 00 00 E0 FE                       ; Local APIC Address (0xfee00000)
> > > >> 00 00                             ; Ext Table Length
> > > >> 00                                ; Ext Table Checksum
> > > >> 00                                ; Reserved
> > > >>
> > > >> On Tue, Jan 28, 2025 at 11:15 AM <tlaronde@kergis.com> wrote:
> > > >>>
> > > >>> On Tue, Jan 28, 2025 at 09:18:29AM -0800, Paul Lalonde wrote:
> > > >>> > ktrace can generate a stack for you from that dump. The line
> > starting with
> > > >>> > "ktrace" is the command line (you might change 9k8cpu to the path
> > to the
> > > >>> > kernel file if you're not in the directory where you built it).
> > > >>> > Then the following lines up to but not including the "cpu0:
> > exiting" can be
> > > >>> > dropped into ktrace's stdin to have it generate a stack trace.
> > You'll need
> > > >>> > to add the ^d at the end if you're cut-and-pasting.
> > > >>> >
> > > >>> > Though it looks like it's just triggering the page fault trap on
> > that
> > > >>> > 0xfffffffffffffc00 address, which itself looks like a victim of
> > > >>> > sign-extension. So back up to the fault and find the source of
> > that
> > > >>> > address?
> > > >>>
> > > >>> Yes:
> > > >>>
> > > >>> src(0xfffffffff011cdee); // dumpstack+0x10
> > > >>> src(0xfffffffff013d50f); // panic+0x133
> > > >>> src(0xfffffffff0116a3b); // KADDR+0x55
> > > >>> src(0xfffffffff012fe55); // sigsearch+0xc8
> > > >>> src(0xfffffffff012fec9); // mpsinit+0x14
> > > >>> src(0xfffffffff011622a); // main+0x30b
> > > >>> src(0xfffffffff0110204); // ndnr
> > > >>>
> > > >>> this doesn't tell me much more than what I knew already: it panics in
> > > >>> mpsinit, calling KADDR in map.c.
> > > >>>
> > > >>> During my next wandering under Nix, I will try to track back from
> > > >>> where the offending address is taken or with what it is constructed.
> > > >>>
> > > >>> >
> > > >>> > On Tue, Jan 28, 2025 at 9:09 AM <tlaronde@kergis.com> wrote:
> > > >>> >
> > > >>> > > On Tue, Jan 28, 2025 at 07:49:02AM -0800, Paul Lalonde wrote:
> > > >>> > > > Do you have a stack for the assert, from the ktrace?
> > > >>> > > >
> > > >>> > >
> > > >>> > > Yes, and I was wrong: it fails relatively "late" in main.c: at
> > > >>> > > mpsinit.
> > > >>> > >
> > > >>> > > Here is the info (I added a bunch of print() before each
> > function call
> > > >>> > > to know where it stumbled upon an incorrect address):
> > > >>> > >
> > > >>> > > term% nix/test_vmx
> > > >>> > >
> > > >>> > > NIX
> > > >>> > > mmunit...mmuinit: vmstart 0xfffffffff0000000 vmunused
> > 0xfffffffff023d000
> > > >>> > > vmunmapped 0xfffffffff0400000 vmend 0xfffffffff4000000
> > > >>> > > sys->pd 0x108003 0x108023
> > > >>> > > cpu0: mmu l3 pte 0xfffffffff0106ff8 = 107023
> > > >>> > > cpu0: mmu l2 pte 0xfffffffff0107ff8 = 108023
> > > >>> > > cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
> > > >>> > > cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
> > > >>> > > ioinit... multibootmemassert... kbdinit... meminit...asm: addr
> > > >>> > > 0x0000000004000000 end 0x0000000004000000 type 1 size 0
> > > >>> > > cm 0: addr 0x4000000 npage 0
> > > >>> > > 0 0 0
> > > >>> > > npage 0 upage 0 kpage 16384
> > > >>> > > confinit... archinit... mallocinit...base 0xfffffffff023d000 ptr
> > > >>> > > 0xfffffffff023d000 nunits 4047617
> > > >>> > > acpiinit... umeminit... trapinit... printinit... i8259init...
> > procinit...
> > > >>> > > mpsinit...panic: cpu0: map.c:KADDR() passed addr
> > fffffffffffffc00 >=
> > > >>> > > fffffe0000000000
> > > >>> > > panic: cpu0: map.c:KADDR() passed addr fffffffffffffc00 >=
> > fffffe0000000000
> > > >>> > >
> > > >>> > > dumpstack
> > > >>> > > ktrace 9k8cpu 0xfffffffff011cdee 0xfffffffff0105d58
> > > >>> > > estackx 0xfffffffff0106000
> > > >>> > > 0xfffffffff0105c70=0xfffffffff0105da8
> > > >>> > > 0xfffffffff0105c78=0xfffffffff011cb91
> > > >>> > > 0xfffffffff0105c80=0xfffffffff0105c98
> > > >>> > > 0xfffffffff0105c98=0xfffffffff013cff7
> > > >>> > > 0xfffffffff0105cb0=0xfffffffff0105cd0
> > > >>> > > 0xfffffffff0105cc0=0xfffffffff0105ea7
> > > >>> > > 0xfffffffff0105cc8=0xfffffffff0105df3
> > > >>> > > 0xfffffffff0105ce0=0xfffffffff013d14d
> > > >>> > > 0xfffffffff0105d08=0xfffffffff0105d90
> > > >>> > > 0xfffffffff0105d28=0xfffffffff011cdee
> > > >>> > > 0xfffffffff0105d30=0xfffffffff0105da8
> > > >>> > > 0xfffffffff0105d40=0xfffffffff0105d58
> > > >>> > > 0xfffffffff0105d48=0xfffffffff0105da8
> > > >>> > > 0xfffffffff0105d50=0xfffffffff011cdee
> > > >>> > > 0xfffffffff0105d58=0xfffffffff011cb99
> > > >>> > > 0xfffffffff0105d68=0xfffffffff013d50f
> > > >>> > > 0xfffffffff0105d88=0xfffffffff0105ed0
> > > >>> > > 0xfffffffff0105d90=0xfffffffff013cff7
> > > >>> > > 0xfffffffff0105d98=0xfffffffff0105db5
> > > >>> > > 0xfffffffff0105e08=0xfffffffff013d1b8
> > > >>> > > 0xfffffffff0105e10=0xfffffffff0105e00
> > > >>> > > 0xfffffffff0105e20=0xfffffffff0105ea3
> > > >>> > > 0xfffffffff0105e28=0xfffffffff0105e98
> > > >>> > > 0xfffffffff0105e38=0xfffffffff013d1b8
> > > >>> > > 0xfffffffff0105e40=0xfffffffff0105e98
> > > >>> > > 0xfffffffff0105e60=0xfffffffff013d217
> > > >>> > > 0xfffffffff0105e68=0xfffffffff015d9c9
> > > >>> > > 0xfffffffff0105e80=0xfffffffff0105fb8
> > > >>> > > 0xfffffffff0105e90=0xfffffffff015d5d9
> > > >>> > > 0xfffffffff0105ea8=0xfffffffff0105ed0
> > > >>> > > 0xfffffffff0105ec0=0xfffffffff0116a3b
> > > >>> > > 0xfffffffff0105ef8=0xfffffffff012fe55
> > > >>> > > 0xfffffffff0105f08=0xfffffffff01a1afa
> > > >>> > > 0xfffffffff0105f10=0x0000000000000004
> > > >>> > > 0xfffffffff0105f18=0x0000000000000046
> > > >>> > > 0xfffffffff0105f20=0xfffffffff00fffd9
> > > >>> > > 0xfffffffff0105f28=0x0000000000000006
> > > >>> > > 0xfffffffff0105f30=0xfffffffff015d5d9
> > > >>> > > 0xfffffffff0105f38=0xfffffffff0000400
> > > >>> > > 0xfffffffff0105f40=0x0000000000000000
> > > >>> > > 0xfffffffff0105f48=0xfffffffff012fec9
> > > >>> > > 0xfffffffff0105f50=0xfffffffff01a1aff
> > > >>> > > 0xfffffffff0105f58=0x0000000000000208
> > > >>> > > 0xfffffffff0105f60=0x0000000000000124
> > > >>> > > 0xfffffffff0105f68=0xfffffffff01149d0
> > > >>> > > 0xfffffffff0105f70=0x0000000000000006
> > > >>> > > 0xfffffffff0105f78=0xfffffffff0114ba7
> > > >>> > > 0xfffffffff0105f80=0xfffffffff0227510
> > > >>> > > 0xfffffffff0105f88=0xffffffff00000000
> > > >>> > > 0xfffffffff0105f90=0x0000000000000000
> > > >>> > > 0xfffffffff0105f98=0xfffffffff0105fb8
> > > >>> > > 0xfffffffff0105fa0=0x0000000bf0116b0d
> > > >>> > > 0xfffffffff0105fa8=0xfffffffff011622a
> > > >>> > > 0xfffffffff0105fb0=0xffffffff00000400
> > > >>> > > 0xfffffffff0105fb8=0xffffffff00000000
> > > >>> > > 0xfffffffff0105fc0=0x0000000000000000
> > > >>> > > 0xfffffffff0105fc8=0x0000000000000000
> > > >>> > > 0xfffffffff0105fd0=0x0000000000000000
> > > >>> > > 0xfffffffff0105fd8=0x0000000000000000
> > > >>> > > 0xfffffffff0105fe0=0x0000000000000000
> > > >>> > > 0xfffffffff0105fe8=0xfffffffff0110204
> > > >>> > > 0xfffffffff0105ff0=0x000000002badb002
> > > >>> > > 0xfffffffff0105ff8=0x000000000023b000
> > > >>> > > cpu0: exiting
> > > >>> > >
> > > >>> > > >
> > > >>> > > >
> > > >>> > > > On Tue, Jan 28, 2025 at 6:09 AM <tlaronde@kergis.com> wrote:
> > > >>> > > >
> > > >>> > > > > After fixing problems leading to compiler
> > warnings---legitimate
> > > >>> > > > > warnings, but even the too short binary negated unsigned
> > 32bits values
> > > >>> > > > > promoted to 64 bits with leading bits hence 0 as mask were
> > harmless---
> > > >>> > > > > now I want to look at the stumbling block.
> > > >>> > > > >
> > > >>> > > > > For me, under vmx, this is the assert in map.c:17:
> > > >>> > > > >
> > > >>> > > > > assert(pa < KSEG2);
> > > >>> > > > >
> > > >>> > > > > that triggers, and it should come from a call from multiboot.
> > > >>> > > > >
> > > >>> > > > > My first reflex is to start adding printf() instructions to
> > track the
> > > >>> > > > > problem, but is there a better way when dealing with the
> > kernel?
> > > >>> > > > >
> > > >>> > > > > Second question: since, if I'm not mistaken, 9front doesn't
> > use
> > > >>> > > > > multiboot, is vmx usable (i.e. agnostic about) with the
> > multiboot
> > > >>> > > stuff?
> > > >>> > > > > The embedded boot stuff should handle the thing by itself
> > without load
> > > >>> > > > > addresses having to be adjusted because of vmx?
> > > >>> > > > > --
> > > >>> > > > > Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> > > >>> > > > > http://www.kergis.com/
> > > >>> > > > > http://kertex.kergis.com/
> > > >>> > > > > Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95
> > 6006 F40C
> > > >>> > >
> > > >>> > > --
> > > >>> > > Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> > > >>> > > http://www.kergis.com/
> > > >>> > > http://kertex.kergis.com/
> > > >>> > > Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006
> > F40C
> > > >>>
> > > >>> --
> > > >>> Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> > > >>> http://www.kergis.com/
> > > >>> http://kertex.kergis.com/
> > > >>> Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
> > > >
> >
> > --
> > Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
> > http://www.kergis.com/
> > http://kertex.kergis.com/
> > Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
--
Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-Md979b320dcf3aead26a0cf06
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [9fans] Nix/regen: assert triggered; best way to track
2025-01-29 17:37 ` tlaronde
@ 2025-01-30 3:03 ` ron minnich
0 siblings, 0 replies; 15+ messages in thread
From: ron minnich @ 2025-01-30 3:03 UTC (permalink / raw)
To: 9fans
I think it would be pretty easy to follow the pattern of including nix
conditionally in a conf file, so that instead of pc64 you would have nixpc64.
I’m already doing that with my VM threads kernel: it’s an unchanged kernel
with just two additional lines to add a new device and support code.
There’s a lot of power in the conf files that’s worth looking at.
On Wed, Jan 29, 2025 at 10:06 <tlaronde@kergis.com> wrote:
> On Wed, Jan 29, 2025 at 07:14:10AM -0800, ron minnich wrote:
> > I agree with (1), vmx is not ready for nix yet.
> > as for (2), I'll take a slight change: the goal is to remove nix as an
> > independent entity, and subsume it in other plan 9 kernels (I prefer
> 9front
> > at this point)
> >
> > Big picture, NIX could be a build option for 9front, or a branch in
> 9front;
> > something like that.
>
> I agree that there is a large overlap, from a cursory look. But Nix
> introduces also some supplementary things, so integrating Nix would
> mean, it seems, allowing conditional compilation. And I fear some will
> argue that this will lead to spaghetti code.
>
> This is why I asked if someone had already written a "patcher file
> server" so that in an interim period, files that can and should be for
> a large portion shared, could be used without copying them but simply
> by applying a diff against them---this could show, too, what the
> differences are, and that they are minimal (for devices in pc or
> port).
>
> But since 1) is still to be achieved, people have some time to think
> about what they want or accept, or not.
>
>
> T. Laronde
>
> >
> > It makes no sense to keep NIX as its own thing, so much has improved in
> > plan 9 in 14 years. I looked into it and it would turn into a lot of
> > duplicate work, to no good effect.
> >
> >
> > On Wed, Jan 29, 2025 at 5:51 AM <tlaronde@kergis.com> wrote:
> >
> > > It seems to me that the best course, for now, is the following:
> > >
> > > 1) [Using qemu or booting the kernel on baremetal] Try and correct
> > > Nix, in its present state, to achieve a running Nix, with 9front, for
> > > objtype==amd64;
> > >
> > > 2) Once 1) is achieved, start cleaning (then, at this moment, the mps
> > > stuff could be revised) and reorganizing code to clearly segregate
> > > Machine Independent (M.I.) and Machine Dependent (M.D.), so that
> > > porting Nix to other archs be possible.
> > >
> > > And concurrently, during either step, document...
> > >
> > > On Tue, Jan 28, 2025 at 04:02:11PM -0800, Ron Minnich wrote:
> > > > btw, if you
> > > > acid 9pc64
> > > > you can paste this right into acid
> > > > src(0xfffffffff011cdee); // dumpstack+0x10
> > > > src(0xfffffffff013d50f); // panic+0x133
> > > > src(0xfffffffff0116a3b); // KADDR+0x55
> > > > src(0xfffffffff012fe55); // sigsearch+0xc8
> > > > src(0xfffffffff012fec9); // mpsinit+0x14
> > > > src(0xfffffffff011622a); // main+0x30b
> > > > src(0xfffffffff0110204); // ndnr
> > > > and see the source.
> > > >
> > > > Also, the ndnr is a jmk-ism: it means "no deposit, no return"
> > > >
> > > > so, let's see, I can't tell if we went over this before.
> > > > What is KADDR and KADDR2? They relate to TMFM, another jmk-ism I
> > > > believe: Too Many F-ing Megabytes, where too many is "more than 2G"
> --
> > > > why 2G? well ...
> > > >
> > > > basically, amd64, like lots of things (risc-v) uses this one simple
> > > > trick: if you sign-extend a 32-bit pointer, you get something
> anchored
> > > > either at the top 2G (kernel va) or the low 2G (user code).
> > > >
> > > > i.e. 0x80000000 -> 0xffffffff_80000000 -> this is convenient. You can
> > > > use 32-bit pointers for lots of things, and, since the amd64 is a
> > > > pretty half-way 64-bit CPU (lots of 64-bit instructions only
> > > > completely work with RAX), this is helpful.
> > > >
> > > > And it works great until you get CPUs with TMFM. Then you need to
> > > > split memory up:
> > > > physical 0 -> 2 GiB becomes virtual 0xffffffff_80000000
> > > > physical 2 GiB and up becomes ... fffffe0000000000
> > > > Why fffffe0000000000? The first amd64 only had something like 41(?)
> > > > bits of virtual address: there's this giant hole in the middle, and
> > > > kernel virtual HAD to start at that address -- 64 bits - whatever
> gets
> > > > you to 23 bits. [I can't find the actual documents on this, I am out
> > > > of time to look, so you'll need to fill in my likely errors here]
> > > > It's a hardware mandate from opteron land.
> > > >
> > > > OK, so KADDR2 is fffffe0000000000, and that error is saying code
> > > > called KADDR2 with something that's not in KADDR2. That va
> > > > fffffffffffffc00 is in the low 2GiB physical, which is KADDR.
> > > >
> > > > This is a side effect of the real problem: you don't have the table
> it
> > > > wants. So you need to fix that, OR, start using qemu for your
> testing.
> > > >
> > > > I hope I did not mess the details up too much here....
> > > >
> > > > On Tue, Jan 28, 2025 at 1:24 PM ron minnich <rminnich@gmail.com> wrote:
> > > > >
> > > > > I'd be happier to remove the mps dependency actually. the mps is
> long
> > > dead. But that's a bigger story.
> > > > >
> > > > >
> > > > > On Tue, Jan 28, 2025 at 11:24 AM Paul Lalonde <paul.a.lalonde@gmail.com> wrote:
> > > > >>
> > > > >> Ah, that's the code path that sent me to QEMU.
> > > > >> Vmx doesn't have any MP tables, which leads to this fault in
> mpsinit.
> > > > >> Ron provided this minimal one for me, which I think we could learn
> > > from to adapt into vmx. The hacky version of pointing the code
> directly at
> > > something like this baked in didn't excite me.
> > > > >>
> > > > >> 50 43 4D 50                       ; "PCMP"
> > > > >> 00 00                             ; Table Length (placeholder)
> > > > >> 04                                ; Spec Revision
> > > > >> 00                                ; Checksum (placeholder)
> > > > >> 42 4F 43 48 53 43 50 55           ; "BOCHSCPU"
> > > > >> 30 2E 31 20 20 20 20 20 20 20 20  ; "0.1 "
> > > > >> 00 00 00 00                       ; OEM Table Pointer
> > > > >> 00 00                             ; OEM Table Size
> > > > >> 14 00                             ; Entry Count (2 CPUs + 18 = 20, little-endian)
> > > > >> 00 00 E0 FE                       ; Local APIC Address (0xfee00000)
> > > > >> 00 00                             ; Ext Table Length
> > > > >> 00                                ; Ext Table Checksum
> > > > >> 00                                ; Reserved
> > > > >>
> > > > >> On Tue, Jan 28, 2025 at 11:15 AM <tlaronde@kergis.com> wrote:
> > > > >>>
> > > > >>> On Tue, Jan 28, 2025 at 09:18:29AM -0800, Paul Lalonde wrote:
> > > > >>> > ktrace can generate a stack for you from that dump. The line
> > > starting with
> > > > >>> > "ktrace" is the command line (you might change 9k8cpu to the
> path
> > > to the
> > > > >>> > kernel file if you're not in the directory where you built it).
> > > > >>> > Then the following lines up to but not including the "cpu0:
> > > exiting" can be
> > > > >>> > dropped into ktrace's stdin to have it generate a stack trace.
> > > You'll need
> > > > >>> > to add the ^d at the end if you're cut-and-pasting.
> > > > >>> >
> > > > >>> > Though it looks like it's just triggering the page fault trap
> on
> > > that
> > > > >>> > 0xfffffffffffffc00 address, which itself looks like a victim of
> > > > >>> > sign-extension. So back up to the fault and find the source of
> > > that
> > > > >>> > address?
> > > > >>>
> > > > >>> Yes:
> > > > >>>
> > > > >>> src(0xfffffffff011cdee); // dumpstack+0x10
> > > > >>> src(0xfffffffff013d50f); // panic+0x133
> > > > >>> src(0xfffffffff0116a3b); // KADDR+0x55
> > > > >>> src(0xfffffffff012fe55); // sigsearch+0xc8
> > > > >>> src(0xfffffffff012fec9); // mpsinit+0x14
> > > > >>> src(0xfffffffff011622a); // main+0x30b
> > > > >>> src(0xfffffffff0110204); // ndnr
> > > > >>>
> > > > >>> this doesn't tell me much more than what I knew already: it
> panics in
> > > > >>> mpsinit, calling KADDR in map.c.
> > > > >>>
> > > > >>> During my next wandering under Nix, I will try to track back from
> > > > >>> where the offending address is taken or with what it is
> constructed.
> > > > >>> >
> > > > >>> > On Tue, Jan 28, 2025 at 9:09 AM <tlaronde@kergis.com> wrote:
> > > > >>> >
> > > > >>> > > On Tue, Jan 28, 2025 at 07:49:02AM -0800, Paul Lalonde wrote:
> > > > >>> > > > Do you have a stack for the assert, from the ktrace?
> > > > >>> > > >
> > > > >>> > >
> > > > >>> > > Yes, and I was wrong: it fails relatively "late" in main.c:
> at
> > > > >>> > > mpsinit.
> > > > >>> > >
> > > > >>> > > Here is the info (I added a bunch of print() before each
> > > function call
> > > > >>> > > to know where it stumbled upon an incorrect address):
> > > > >>> > >
> > > > >>> > > term% nix/test_vmx
> > > > >>> > >
> > > > >>> > > NIX
> > > > >>> > > mmunit...mmuinit: vmstart 0xfffffffff0000000 vmunused
> > > 0xfffffffff023d000
> > > > >>> > > vmunmapped 0xfffffffff0400000 vmend 0xfffffffff4000000
> > > > >>> > > sys->pd 0x108003 0x108023
> > > > >>> > > cpu0: mmu l3 pte 0xfffffffff0106ff8 = 107023
> > > > >>> > > cpu0: mmu l2 pte 0xfffffffff0107ff8 = 108023
> > > > >>> > > cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
> > > > >>> > > cpu0: mmu l1 pte 0xfffffffff0108c00 = e3
> > > > >>> > > ioinit... multibootmemassert... kbdinit... meminit...asm:
> addr
> > > > >>> > > 0x0000000004000000 end 0x0000000004000000 type 1 size 0
> > > > >>> > > cm 0: addr 0x4000000 npage 0
> > > > >>> > > 0 0 0
> > > > >>> > > npage 0 upage 0 kpage 16384
> > > > >>> > > confinit... archinit... mallocinit...base
> 0xfffffffff023d000 ptr
> > > > >>> > > 0xfffffffff023d000 nunits 4047617
> > > > >>> > > acpiinit... umeminit... trapinit... printinit...
> i8259init...
> > > procinit...
> > > > >>> > > mpsinit...panic: cpu0: map.c:KADDR() passed addr
> > > fffffffffffffc00 >=
> > > > >>> > > fffffe0000000000
> > > > >>> > > panic: cpu0: map.c:KADDR() passed addr fffffffffffffc00 >=
> > > fffffe0000000000
> > > > >>> > >
> > > > >>> > > dumpstack
> > > > >>> > > ktrace 9k8cpu 0xfffffffff011cdee 0xfffffffff0105d58
> > > > >>> > > estackx 0xfffffffff0106000
> > > > >>> > > 0xfffffffff0105c70=0xfffffffff0105da8
> > > > >>> > > 0xfffffffff0105c78=0xfffffffff011cb91
> > > > >>> > > 0xfffffffff0105c80=0xfffffffff0105c98
> > > > >>> > > 0xfffffffff0105c98=0xfffffffff013cff7
> > > > >>> > > 0xfffffffff0105cb0=0xfffffffff0105cd0
> > > > >>> > > 0xfffffffff0105cc0=0xfffffffff0105ea7
> > > > >>> > > 0xfffffffff0105cc8=0xfffffffff0105df3
> > > > >>> > > 0xfffffffff0105ce0=0xfffffffff013d14d
> > > > >>> > > 0xfffffffff0105d08=0xfffffffff0105d90
> > > > >>> > > 0xfffffffff0105d28=0xfffffffff011cdee
> > > > >>> > > 0xfffffffff0105d30=0xfffffffff0105da8
> > > > >>> > > 0xfffffffff0105d40=0xfffffffff0105d58
> > > > >>> > > 0xfffffffff0105d48=0xfffffffff0105da8
> > > > >>> > > 0xfffffffff0105d50=0xfffffffff011cdee
> > > > >>> > > 0xfffffffff0105d58=0xfffffffff011cb99
> > > > >>> > > 0xfffffffff0105d68=0xfffffffff013d50f
> > > > >>> > > 0xfffffffff0105d88=0xfffffffff0105ed0
> > > > >>> > > 0xfffffffff0105d90=0xfffffffff013cff7
> > > > >>> > > 0xfffffffff0105d98=0xfffffffff0105db5
> > > > >>> > > 0xfffffffff0105e08=0xfffffffff013d1b8
> > > > >>> > > 0xfffffffff0105e10=0xfffffffff0105e00
> > > > >>> > > 0xfffffffff0105e20=0xfffffffff0105ea3
> > > > >>> > > 0xfffffffff0105e28=0xfffffffff0105e98
> > > > >>> > > 0xfffffffff0105e38=0xfffffffff013d1b8
> > > > >>> > > 0xfffffffff0105e40=0xfffffffff0105e98
> > > > >>> > > 0xfffffffff0105e60=0xfffffffff013d217
> > > > >>> > > 0xfffffffff0105e68=0xfffffffff015d9c9
> > > > >>> > > 0xfffffffff0105e80=0xfffffffff0105fb8
> > > > >>> > > 0xfffffffff0105e90=0xfffffffff015d5d9
> > > > >>> > > 0xfffffffff0105ea8=0xfffffffff0105ed0
> > > > >>> > > 0xfffffffff0105ec0=0xfffffffff0116a3b
> > > > >>> > > 0xfffffffff0105ef8=0xfffffffff012fe55
> > > > >>> > > 0xfffffffff0105f08=0xfffffffff01a1afa
> > > > >>> > > 0xfffffffff0105f10=0x0000000000000004
> > > > >>> > > 0xfffffffff0105f18=0x0000000000000046
> > > > >>> > > 0xfffffffff0105f20=0xfffffffff00fffd9
> > > > >>> > > 0xfffffffff0105f28=0x0000000000000006
> > > > >>> > > 0xfffffffff0105f30=0xfffffffff015d5d9
> > > > >>> > > 0xfffffffff0105f38=0xfffffffff0000400
> > > > >>> > > 0xfffffffff0105f40=0x0000000000000000
> > > > >>> > > 0xfffffffff0105f48=0xfffffffff012fec9
> > > > >>> > > 0xfffffffff0105f50=0xfffffffff01a1aff
> > > > >>> > > 0xfffffffff0105f58=0x0000000000000208
> > > > >>> > > 0xfffffffff0105f60=0x0000000000000124
> > > > >>> > > 0xfffffffff0105f68=0xfffffffff01149d0
> > > > >>> > > 0xfffffffff0105f70=0x0000000000000006
> > > > >>> > > 0xfffffffff0105f78=0xfffffffff0114ba7
> > > > >>> > > 0xfffffffff0105f80=0xfffffffff0227510
> > > > >>> > > 0xfffffffff0105f88=0xffffffff00000000
> > > > >>> > > 0xfffffffff0105f90=0x0000000000000000
> > > > >>> > > 0xfffffffff0105f98=0xfffffffff0105fb8
> > > > >>> > > 0xfffffffff0105fa0=0x0000000bf0116b0d
> > > > >>> > > 0xfffffffff0105fa8=0xfffffffff011622a
> > > > >>> > > 0xfffffffff0105fb0=0xffffffff00000400
> > > > >>> > > 0xfffffffff0105fb8=0xffffffff00000000
> > > > >>> > > 0xfffffffff0105fc0=0x0000000000000000
> > > > >>> > > 0xfffffffff0105fc8=0x0000000000000000
> > > > >>> > > 0xfffffffff0105fd0=0x0000000000000000
> > > > >>> > > 0xfffffffff0105fd8=0x0000000000000000
> > > > >>> > > 0xfffffffff0105fe0=0x0000000000000000
> > > > >>> > > 0xfffffffff0105fe8=0xfffffffff0110204
> > > > >>> > > 0xfffffffff0105ff0=0x000000002badb002
> > > > >>> > > 0xfffffffff0105ff8=0x000000000023b000
> > > > >>> > > cpu0: exiting
> > > > >>> > >
> > > > >>> > > >
> > > > >>> > > >
> > > > >>> > > > On Tue, Jan 28, 2025 at 6:09 AM <tlaronde@kergis.com> wrote:
> > > > >>> > > > > After fixing problems leading to compiler warnings---legitimate
> > > > >>> > > > > warnings, but even the too short binary negated unsigned 32bits values
> > > > >>> > > > > promoted to 64 bits with leading bits hence 0 as mask were harmless---
> > > > >>> > > > > now I want to look at the stumbling block.
> > > > >>> > > > >
> > > > >>> > > > > For me, under vmx, this is the assert in map.c:17:
> > > > >>> > > > >
> > > > >>> > > > > assert(pa < KSEG2);
> > > > >>> > > > >
> > > > >>> > > > > that triggers, and it should come from a call from multiboot.
> > > > >>> > > > >
> > > > >>> > > > > My first reflex is to start adding printf() instructions to track the
> > > > >>> > > > > problem, but is there a better way when dealing with the kernel?
> > > > >>> > > > >
> > > > >>> > > > > Second question: since, if I'm not mistaken, 9front doesn't use
> > > > >>> > > > > multiboot, is vmx usable (i.e. agnostic about) with the multiboot stuff?
> > > > >>> > > > > The embedded boot stuff should handle the thing by itself without load
> > > > >>> > > > > addresses having to be adjusted because of vmx?
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T8b5b89fcf829819e-Mfb3ad6d99115e7139e994ddb
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
Thread overview: 15+ messages
2025-01-28 10:21 [9fans] Nix/regen: assert triggered; best way to track tlaronde
2025-01-28 15:35 ` Ron Minnich
2025-01-28 15:49 ` Paul Lalonde
2025-01-28 17:07 ` tlaronde
2025-01-28 17:18 ` Paul Lalonde
2025-01-28 18:16 ` tlaronde
2025-01-28 19:23 ` Paul Lalonde
2025-01-28 21:23 ` ron minnich
2025-01-29 0:02 ` Ron Minnich
2025-01-29 1:37 ` ron minnich
2025-01-29 12:34 ` tlaronde
2025-01-29 15:14 ` ron minnich
2025-01-29 17:37 ` tlaronde
2025-01-30 3:03 ` ron minnich
2025-01-28 17:27 ` ori