9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] 9k amd64 kernel and floating point
@ 2025-03-02 20:28 Russ Cox
  2025-03-02 21:48 ` [9fans] " i
  2025-03-03 16:07 ` [9fans] " cinap_lenrek
  0 siblings, 2 replies; 21+ messages in thread
From: Russ Cox @ 2025-03-02 20:28 UTC (permalink / raw)
  To: 9fans

Hi all,

Is anyone running the 9k amd64 kernel? I wrote a simple floating point test
program and would very much like to know if it works for you. The program
is at https://swtch.com/tmp/fpnote2.c.

6c fpnote2.c
6l fpnote2.6
6.out

Successful output ends by printing 49999995000001 49999995000001.

Unsuccessful output looks like this:

term% 6.fpnote2
start counting
.postnote
postnote done
floatnote mynote
out of sync 2970158019 140737488346705

The 'out of sync' message means the integer registers and floating-point
registers, which are counting the same sum, have gotten out of sync. It
seems like the floating-point registers are being lost shortly after a note
arrives and is handled by the process. Without the note, the program runs
correctly.
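
For readers without the source at hand, the check described above can be sketched roughly as follows. This is a minimal portable C illustration, not the contents of fpnote2.c: the name countboth and its interface are made up here, and the real program also arranges for a note to arrive while the count is running.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from fpnote2.c): add up 0..n-1 twice,
 * once with integer arithmetic and once with floating point.
 * Returns -1 if the two sums agree at every step, otherwise the
 * index at which they first differ.  Final sums go to *isum, *fsum.
 */
int64_t
countboth(int64_t n, int64_t *isum, double *fsum)
{
	int64_t i, is;
	double fs;

	assert(n >= 0);
	is = 0;
	fs = 0.0;
	for(i = 0; i < n; i++){
		is += i;
		fs += (double)i;
		/* both values are exact while the sum stays below 2^53 */
		if((double)is != fs)
			break;
	}
	*isum = is;
	*fsum = fs;
	return i < n ? i : -1;
}
```

If floating-point state is preserved correctly, countboth(10000000, &a, &b) returns -1 and leaves the same total in a and b. The sketch makes no attempt to reproduce the exact totals the real program prints.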

I did find one bug in what seems to be the latest copy of
/sys/src/9k/k10/fpu.c
<https://github.com/0intro/9legacy/blob/6df3ae5f452342095aaafbdaef1bd6c9342dd6af/sys/src/9k/k10/fpu.c#L166>.
In sysrforkchild, the child->fpusave pointer is set to point at up->fxsave,
which is the parent's FPU save buffer. This will get corrected during exec,
which will clear fpusave and set fpustate to Init. Then the first FPU usage
will set fpusave correctly. The sharing here is certainly a bug, but it's
not the problem: correcting that line to say child->fxsave does not make my
test program start working (as expected; in my program the child does no
floating point).
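
The aliasing can be shown with a stripped-down model. The field names fxsave and fpusave come from the description above; the struct layout, the 512-byte buffer and the two function names are assumptions made for illustration and are not copied from fpu.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for the kernel's Proc. */
typedef struct Proc Proc;
struct Proc {
	unsigned char fxsave[512];	/* this process's own FPU save buffer */
	void *fpusave;			/* where FPU state is saved and restored */
};

/* As described: the child ends up pointing at the parent's buffer. */
void
forkchild_buggy(Proc *up, Proc *child)
{
	assert(up != child);
	memmove(child->fxsave, up->fxsave, sizeof child->fxsave);
	child->fpusave = up->fxsave;	/* shared with the parent */
}

/* With the one-line correction: the child points at its own buffer. */
void
forkchild_fixed(Proc *up, Proc *child)
{
	assert(up != child);
	memmove(child->fxsave, up->fxsave, sizeof child->fxsave);
	child->fpusave = child->fxsave;
}
```

In the buggy variant, any FPU save done on the child's behalf before exec resets the pointer would overwrite the parent's saved registers.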

I'm seeing the failures using QEMU, both on a non-x86 Mac using -accel tcg
and on an AMD Linux system using -accel kvm. Those are different enough to
make me lean toward thinking it's a real hardware bug, but it could still
possibly be a QEMU problem. I'd be very interested to know how real
hardware fares.

Thanks!

Best,
Russ

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M3e9849e499c6ce1204a99bf4
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* [9fans] Re: 9k amd64 kernel and floating point
  2025-03-02 20:28 [9fans] 9k amd64 kernel and floating point Russ Cox
@ 2025-03-02 21:48 ` i
  2025-03-03  0:14   ` i
  2025-03-03 16:07 ` [9fans] " cinap_lenrek
  1 sibling, 1 reply; 21+ messages in thread
From: i @ 2025-03-02 21:48 UTC (permalink / raw)
  To: 9fans

I'm getting 

start counting
.................postnote
....postnote done
floatnote mynote
...............................................................................49999995000001 49999995000001

on this 9k http://www.collyer.net/who/geoff/9/ but geoff might have fixed some stuff.

I'll probably boot 9legacy 9k tomorrow if nobody answers until then. I'm running on OpenBSD vmd.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-Mb9795b4dfca534b87cfa4a01
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* [9fans] Re: 9k amd64 kernel and floating point
  2025-03-02 21:48 ` [9fans] " i
@ 2025-03-03  0:14   ` i
  2025-03-03 10:52     ` Alyssa M via 9fans
  0 siblings, 1 reply; 21+ messages in thread
From: i @ 2025-03-03  0:14 UTC (permalink / raw)
  To: 9fans

cpu% fpnote2
start counting
...postnote
.postnote done
floatnote mynote
................................................................................................49999995000001 49999995000001

with 9legacy 9k and my OpenBSD vmd patches.
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-Md8d53240ad39ce3f33597c6e
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* [9fans] Re: 9k amd64 kernel and floating point
  2025-03-03  0:14   ` i
@ 2025-03-03 10:52     ` Alyssa M via 9fans
  0 siblings, 0 replies; 21+ messages in thread
From: Alyssa M via 9fans @ 2025-03-03 10:52 UTC (permalink / raw)
  To: 9fans

I'm seeing a similar but intermittent failure with arm32 on the Raspberry Pi, running Richard Miller's build.
Just occasionally I see the "out of sync" message - always so far with apparently equal number pairs reported (though different each time). It might be unrelated.

term% fpnote2
start counting
.postnote
postnote done
floatnote mynote
out of sync 687926779 687926779
term%

and another with:

.......floatnote interrupt
.floatnote interrupt
out of sync 39461826279152 39461826279152

when I hammered on the DEL key.

One time I've also seen it get stuck in a loop producing:

term% fpnote2
start counting
.postnote
postnote done
floatnote mynote
floatnote sys: trap: fault read va=0x11104 pc=0x1604
floatnote sys: trap: fault read va=0x11104 pc=0x1604
floatnote sys: trap: fault read va=0x11104 pc=0x1604
floatnote sys: trap: fault read va=0x11104 pc=0x1604
....
Hitting DEL seemed to snap it out of it, and it completed successfully after a while...

I tried replacing the print() call in the note handler with write()s, because I was suspicious that it might do some illicit FP, but that had no effect. The problem still happens.

7c on the arm64 with 9front seems not to show this problem.
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M78599f447066a2e3cef0aa28
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-02 20:28 [9fans] 9k amd64 kernel and floating point Russ Cox
  2025-03-02 21:48 ` [9fans] " i
@ 2025-03-03 16:07 ` cinap_lenrek
  2025-03-03 16:23   ` hahahahacker2009
  2025-03-05 13:41   ` Russ Cox
  1 sibling, 2 replies; 21+ messages in thread
From: cinap_lenrek @ 2025-03-03 16:07 UTC (permalink / raw)
  To: 9fans

good day.

i've taken a look at the patch:

https://github.com/rsc/plan9/commit/3715bf9b86a86ed6a3a857cabfc7dff5d70b409b

i spent some time yesterday trying to implement the
same behaviour in 9front kernels and it is not too
difficult, but the semantics would need to be clarified
and i'm not totally convinced this is the right approach.

one issue i can see with this is that now it is unclear what
/proc/$pid/fpregs is supposed to be once a process executes
the note handler.

as i understand the patch above, fpregs would reflect the registers
before executing the handler, not the one being in use. was this
intended? i think this could lead to confusion if code
uses vector instructions and the debugger doesn't reflect the actual
register state of what's being executed.

maybe the solution could be to have 2 register files then:

fpregs <- current execution
ofpregs <- saved at trap/note

also, do we want the note-handler to have a clean (FPinit) fpu
context or should it get a copy? and why?

do we want to backport this to all architectures, or just
the ones that combine fp and vector instructions?

it would be a good idea to document these changes of behaviour
somewhere and have a rationale.

and last: have the go developers considered alternatives
such as NSAVE/NRSTR for their "signal" handling?

--
cinap

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M3ba57043ade5e1039c47ffe3
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-03 16:07 ` [9fans] " cinap_lenrek
@ 2025-03-03 16:23   ` hahahahacker2009
  2025-03-04 13:23     ` Alyssa M via 9fans
  2025-03-05 13:41   ` Russ Cox
  1 sibling, 1 reply; 21+ messages in thread
From: hahahahacker2009 @ 2025-03-03 16:23 UTC (permalink / raw)
  To: 9fans

I don't think that vlong on arm32 Raspberry Pi works. And the code below uses vlong.
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M1808d84304068c0c09f6d580
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-03 16:23   ` hahahahacker2009
@ 2025-03-04 13:23     ` Alyssa M via 9fans
  0 siblings, 0 replies; 21+ messages in thread
From: Alyssa M via 9fans @ 2025-03-04 13:23 UTC (permalink / raw)
  To: 9fans

Well it's a 32bit processor, so vlong will be emulated partly in the runtime library and partly by the compiler. If there's a bug there then an example would be a contribution to science. :) I would still expect such a bug to produce a reliable wrong answer or a crash - and not be perturbed by a note handler, though.
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M6d76e156216d9464b557f9a9
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-03 16:07 ` [9fans] " cinap_lenrek
  2025-03-03 16:23   ` hahahahacker2009
@ 2025-03-05 13:41   ` Russ Cox
  2025-03-05 17:38     ` cinap_lenrek
  2025-03-05 19:40     ` cinap_lenrek
  1 sibling, 2 replies; 21+ messages in thread
From: Russ Cox @ 2025-03-05 13:41 UTC (permalink / raw)
  To: 9fans

What I thought was a floating-point problem turned out to be an integer
problem: noted(NCONT) was smashing a few of the integer registers on its
way back to the original note context. Nothing to do with floats at all.
https://github.com/rsc/plan9/commit/aa00f938f6b3c6a5b4502c25605666f479a22c16

On Mon, Mar 3, 2025 at 11:17 AM <cinap_lenrek@felloff.net> wrote:

> one issue i can see with this is that now it is unclear what
> /proc/$pid/fpregs is supposed to be once a process executes
> the note handler.
>

I was focusing on making something work at all, but I agree that's a detail
to get right. I believe that /proc/$pid/regs is the note registers during
the note handler, and so .../fpregs should be the same. I don't think we
need fpregs and ofpregs, since we don't have regs and oregs.

> also, do we want the note-handler to have a clean (FPinit) fpu
> context or should it get a copy? and why?
>

I chose to give it a copy because systems might initialize certain
registers to certain values at startup and then assume those are always
set. Copying preserves those decisions. The integer registers are also
copied. To me it fits with the simple elegance of fork compared to the
awkwardness of StartProcess. (Other operating system designers may
disagree, but Plan 9 is a system in the fork camp.)


> do we want to backport this to all architectures, or just
> the ones that combine fp and vector instructions?
>

It would make sense to me to allow on all architectures for consistency,
but I'm personally working on this as a Go developer, not a Plan 9
developer. I'm just trying to minimize the irregularities that Go takes on
to support Plan 9. Thanks for looking into making the same fix on 9front. I
think my /sys/src/9k fix is a little cleaner, for what it's worth. In the
/sys/src/9 fix I probably should have used two state fields instead of one
shifty field.

> it would be a good idea to document these changes of behaviour
> somewhere and have a rationale.
>

Definitely. My tree is a personal work space, not a real Plan 9
distribution. I don't think you need much rationale though. This was an
irregular limitation that has been removed. Vector instructions have become
very important and will only become more so. Plan 9 couldn't even use them
in memmove because memmove might be called in a note handler. To me at
least, that's clearly wrong. The limitation was bearable only because it
didn't matter much. Now it matters a significant amount. I notice that the
9front kernel allows floating point during kernel execution, presumably for
the same reason.

Updated the docs in my tree:
https://github.com/rsc/plan9/commit/dd95b25897369ff2575b2ad744e18954c4620464

> and last: have the go developers considered alternatives
> such as NSAVE/NRSTR for their "signal" handling?
>

I've never worked on the Plan 9 port of Go before, so I can't speak for
what the port's developers did, but I had completely forgotten NSAVE/NRSTR
existed. I just looked at them anew, and I don't think they would help.
noted(NRSTR) puts the onus of saving registers on the user process, which
is maybe not great but fine for the CPU registers in the ureg struct, but
the FPU registers aren't there. You'd need to make a separate space for
them, and then worry about all the nuances of different FP save routines on
different systems, detecting whether they need to be saved at all, and so
on. The kernel already has to get all that code right for context switches.
Getting it right for note handlers seems like a reasonably small expansion
of that role. Other operating systems do that just fine. And as I noted
(ha!) earlier, this change benefits C as much as it benefits Go.

Best,
Russ

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-Mb1f1ebaf3409dab87046da32
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-05 13:41   ` Russ Cox
@ 2025-03-05 17:38     ` cinap_lenrek
  2025-03-07  1:08       ` Anthony Martin
  2025-03-05 19:40     ` cinap_lenrek
  1 sibling, 1 reply; 21+ messages in thread
From: cinap_lenrek @ 2025-03-05 17:38 UTC (permalink / raw)
  To: 9fans

Thank you for your answer.

> I believe that /proc/$pid/regs is the note registers during
> the note handler, and so .../fpregs should be the same. I don't think we
> need fpregs and ofpregs, since we don't have regs and oregs.

I think that is mistaken. /proc/n/regs gives you the registers
of the process when it entered kernel mode (Proc.dbgreg)
regardless of if the process is currently executing a note
handler or not.

Which makes a lot of sense, as this is the basis for the debugger
to work (you place BRK instructions and the process traps from
usermode to kernel, saves the user registers in Proc.dbgreg
and wakes the debugger to take a look).

The original user registers before we enter the note handler
are actually copied to the user-mode stack (by notify), and
the kernel remembers a pointer to it (Proc.ureg) for later
when the handler does noted() syscall.

In summary, we have:

Proc.dbgreg             <- pointer to Ureg on the kernel stack (for debugger)
Proc.ureg               <- pointer to saved Ureg on user stack (for noted)

Also, the user-process is free to modify the registers before
calling noted().

There was never a Ureg* equivalent for the fpu state (being
dumped on the user stack). However, i believe, as the fpu is being
disabled during execution of the note handler, you could imagine
a handler using fpregs file to read/modify the fpu state.

So the conflict here is that we have two valid uses here.
We want the current fpu state (for debuggers) and for note handlers,
we might need to get to the saved fpu state of the interrupted
program. That is why i proposed the ofpregs file (for archs that
provide a separate fpu context for note handlers). Kind of prioritizing
the debugger here, as it does not need to be aware about if
the process is currently executing in a note handler or not,
it will always be the "regs" and "fpregs" files reflecting the
current state of the process. For note handler, you'd open
"ofpregs" or "fpregs" files depending on the kernel support.

> It would make sense to me to allow on all architectures for consistency,
> but I'm personally working on this as a Go developer, not a Plan 9
> developer. I'm just trying to minimize the irregularities that Go takes on
> to support Plan 9.

That was mainly why i was asking about a rationale. Who wants
to touch for example the MIPS fpu emulation code for this?
The real motivation here is really the integer vector instructions.
Let's make that explicit.

As you said, the 9front kernel does a similar thing within the
kernel itself for amd64 and arm64, allowing the use of vector
instructions even from interrupt handlers. Tho this change was
much easier to do, as it is not visible to userspace.

>> also, do we want the note-handler to have a clean (FPinit) fpu
>> context or should it get a copy? and why?
> I chose to give it a copy because systems might initialize certain
> registers to certain values at startup and then assume those are always
> set. Copying preserves those decisions. The integer registers are also
> copied. To me it fits with the simple elegance of fork compared to the
> awkwardness of StartProcess. (Other operating system designers may
> disagree, but Plan 9 is a system in the fork camp.)

ok, fair enough.

>> such as NSAVE/NRSTR for their "signal" handling?
> I've never worked on the Plan 9 port of Go before, so I can't speak for
> what the port's developers did, but I had completely forgotten NSAVE/NRSTR
> existed. I just looked at them anew, and I don't think they would help.
> noted(NRSTR) puts the onus of saving registers on the user process, which
> is maybe not great but fine for the CPU registers in the ureg struct, but
> the FPU registers aren't there.

Yes, it allows you to handle the saving logic from usermode and lets
you have also re-entrant handlers (this was for emulating unix signals).

> You'd need to make a separate space for them, and then worry about all the
> nuances of different FP save routines on
> different systems, detecting whether they need to be saved at all, and so
> on.

Well, your compiler emits simd instructions already, all you'd need is to
wrap your handler with some code that saves the registers and restores
them afterwards. You already know what registers your generated code is
going to use, no?

> The kernel already has to get all that code right for context switches.
> Getting it right for note handlers seems like a reasonably small expansion
> of that role. Other operating systems do that just fine.

Sure, you need to reserve some space either in the kernel, or in
userspace. Note also that shoving another FPsave struct into the Proc
structure increases its size by ~1K. So everyone pays that price
for go programs using simd in note handlers. (Tho this can be avoided).

> And as I noted (ha!) earlier, this change benefits C as much as it benefits Go.

Yes please, that is what i found missing in all these go bug reports
and commit messages. There needed to be a discussion about the
pros and cons and some convincing. I want to avoid another nsec() scenario
where new features get hastily introduced for go, just for go later to abandon it.
Go has a lot more options here for solving the problem. Anything we add to the
kernel is going to stick around forever... If we go there, let's do it
properly.

--
cinap

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M50476cb6a8468035fa4eb478
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-05 13:41   ` Russ Cox
  2025-03-05 17:38     ` cinap_lenrek
@ 2025-03-05 19:40     ` cinap_lenrek
  2025-04-07 18:41       ` ori
  1 sibling, 1 reply; 21+ messages in thread
From: cinap_lenrek @ 2025-03-05 19:40 UTC (permalink / raw)
  To: 9fans

just for reference the work-in-progress implementation for this on 9front amd64 kernel:

https://felloff.net/usr/cinap_lenrek/pc64fpnote.patch

needs some work around the devproc access to the fpsave structs
and maybe find an alternative to the #ifdef code... but otherwise
the handler part was easy to implement.

and the fpnote.c testsuite (thanks for that btw :-)) output from my ryzen:

cpu% ./6.out
# nop
# float1
# noteonly
floatnote mynote
# beforenote
floatnote mynote
# afternote
floatnote mynote
# beforeafter
floatnote mynote
# simul
start counting
5e+13
PASS
postnote
cpu% postnote done

--
cinap

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M3e58a7cb7e234baa24a5de28
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-05 17:38     ` cinap_lenrek
@ 2025-03-07  1:08       ` Anthony Martin
  2025-03-07 18:04         ` Russ Cox
  2025-03-07 18:44         ` cinap_lenrek
  0 siblings, 2 replies; 21+ messages in thread
From: Anthony Martin @ 2025-03-07  1:08 UTC (permalink / raw)
  To: 9fans

cinap_lenrek@felloff.net once said:
> There was never a Ureg* equivalent for the fpu state (being dumped on
> the user stack). However, i believe, as the fpu is being disabled
> during execution of the note handler, you could imagine a handler
> using fpregs file to read/modify the fpu state.
>
> So the conflict here is that we have two valid uses here. We want the
> current fpu state (for debuggers) and for note handlers, we might need
> to get to the saved fpu state of the interrupted program. That is why
> i proposed the ofpregs file (for archs that provide a separate fpu
> context for note handlers). Kind of prioritizing the debugger here, as
> it does not need to be aware about if the process is currently
> executing in a note handler or not, it will always be the "regs" and
> "fpregs" files reflecting the current state of the process. For note
> handler, you'd open "ofpregs" or "fpregs" files depending on the
> kernel support.

If we're going down this route, we should copy the FPsave
to the stack in notify as we do the Ureg. Then we wouldn't
need the ofpregs file at all. This will require some thinking
about backwards compatibility. There needs to be a way
to signal that you have additional data after the Ureg.

"I agree that it is really sad that we have to save/restore FP on
signals, but I think it's unavoidable." - torvalds, 2002

> I want to avoid another nsec() scenario where new features get hastily
> introduced for go, just for go later to abandon it. Go has a lot more
> options here for solving the problem. Anything we add to the kernel is
> going to stick around forever... If we go there, let's do it properly.

What did Go abandon? It looks like they're still using the nsec
and tsemacquire system calls over a decade later. Did I miss
something?

Cheers,
  Anthony

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M7101db7be15ad0c148f61a4e
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-07  1:08       ` Anthony Martin
@ 2025-03-07 18:04         ` Russ Cox
  2025-03-07 18:44         ` cinap_lenrek
  1 sibling, 0 replies; 21+ messages in thread
From: Russ Cox @ 2025-03-07 18:04 UTC (permalink / raw)
  To: 9fans

Hi Cinap,

I see what you are saying about wanting /proc/n/fpregs for both debuggers
vs note handlers themselves. I think debuggers are more likely, so fpregs
should be the current FP registers (if you're in a note handler, it's the
handler's registers). I would suggest /proc/n/notefpregs for the fp
registers at the time a note arrived, and that the file returns a read
error if a note is not being handled.
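
The two-file behaviour suggested here can be sketched as a pair of read routines. Everything in the sketch is hypothetical: the Proc fields, the FPsize constant and the function names are invented for illustration and do not come from any kernel tree.

```c
#include <assert.h>
#include <string.h>

enum { FPsize = 512 };	/* assumed save-area size, for illustration only */

/* Hypothetical per-process state; not the real Proc structure. */
typedef struct Proc Proc;
struct Proc {
	int innote;				/* nonzero while a note handler runs */
	unsigned char fpsave[FPsize];		/* current FP registers */
	unsigned char notefpsave[FPsize];	/* FP registers when the note arrived */
};

/* Read of "fpregs": always the registers currently in use. */
long
readfpregs(Proc *p, void *buf, long n)
{
	assert(p != NULL && buf != NULL);
	if(n < 0)
		return -1;
	if(n > FPsize)
		n = FPsize;
	memmove(buf, p->fpsave, n);
	return n;
}

/* Read of "notefpregs": fails (-1) unless a note is being handled. */
long
readnotefpregs(Proc *p, void *buf, long n)
{
	assert(p != NULL && buf != NULL);
	if(n < 0 || !p->innote)
		return -1;
	if(n > FPsize)
		n = FPsize;
	memmove(buf, p->notefpsave, n);
	return n;
}
```

A debugger would only ever call the first routine; a note handler that cares about the interrupted program's FP state would use the second.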

As for pushing all the register save/restore logic into Go, that still
doesn't make any sense to me. Again, C programs benefit from this just as
much as Go programs do. I don't see why every Plan 9 C program should have
to link in a copy of the same FP saving logic when the kernel can just do
it in one place.

I only did 386 and amd64, but I think it would make sense (and should not
be too hard) to apply the change to all architectures. Programs that do
some floating point in a note handler shouldn't only run on a subset of
systems. I will leave the other architectures to the people who maintain
Plan 9.

Best,
Russ

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M0f06f2e1648bd6a697d9d11a
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-07  1:08       ` Anthony Martin
  2025-03-07 18:04         ` Russ Cox
@ 2025-03-07 18:44         ` cinap_lenrek
  2025-03-10 15:00           ` Russ Cox
  1 sibling, 1 reply; 21+ messages in thread
From: cinap_lenrek @ 2025-03-07 18:44 UTC (permalink / raw)
  To: 9fans

> If we're going down this route, we should copy the FPsave
> to the stack in notify as we do the Ureg. Then we wouldn't
> need the ofpregs file at all. 

yeah. i'm not fond of the ofpregs-file solution either.

> This will require some thinking
> about backwards compatibility. There needs to be a way
> to signal that you have additional data after the Ureg.

i suppose one could just pass a pointer as a second argument
to the note handler. new code using the second argument would
crash on an older kernel, but old userspace code would just ignore
the extra argument on a new kernel. libthread programs would
need to be checked if they have enough stack space for the
extra fpu regs.

a cleaner way might be to introduce a new syscall to
set the note handler that is fpu-context aware?

> What did Go abandon? It looks like they're still using the nsec
> and tsemacquire system calls over a decade later. Did I miss
> something?

i apologize, you were right, it was not abandoned.
there was confusion about the calling convention of the syscall.

//go:nosplit
func nanotime1() int64 {
        var scratch int64
        ns := nsec(&scratch)
        // TODO(aram): remove hack after I fix _nsec in the pc64 kernel.
        if ns == 0 {
                return scratch
        }
        return ns
}

--
cinap

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M45ef9ecbdcbbc86b7e4a1731
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-07 18:44         ` cinap_lenrek
@ 2025-03-10 15:00           ` Russ Cox
  2025-03-11  3:35             ` Alyssa M via 9fans
  0 siblings, 1 reply; 21+ messages in thread
From: Russ Cox @ 2025-03-10 15:00 UTC (permalink / raw)
  To: 9fans

Hi again,

I'd forgotten about libthread and stack space for Uregs. That's
unfortunate, especially because Intel keeps growing the size of the FP
regs. I'd rather not hard-code the expected size in binaries and then have
to deal with it growing again. It is starting to sound like a notefpregs
file is the right answer. The vast majority of note handlers won't care and
won't need updating.

Speaking of nsec, there is a problem with that too. I'll start a new thread.

Best,
Russ

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M3734ba96550944a26e350b6e
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-10 15:00           ` Russ Cox
@ 2025-03-11  3:35             ` Alyssa M via 9fans
  2025-03-15 16:14               ` Alyssa M via 9fans
  0 siblings, 1 reply; 21+ messages in thread
From: Alyssa M via 9fans @ 2025-03-11  3:35 UTC (permalink / raw)
  To: 9fans

Is there ever a need for more than one active Ureg+fpreg per process? If not, maybe it could be pre-allocated at TOS.
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M3d6119de5a43ce683af5a97c
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-11  3:35             ` Alyssa M via 9fans
@ 2025-03-15 16:14               ` Alyssa M via 9fans
  2025-03-15 21:33                 ` Charles Forsyth
  2025-03-22 19:40                 ` Russ Cox
  0 siblings, 2 replies; 21+ messages in thread
From: Alyssa M via 9fans @ 2025-03-15 16:14 UTC (permalink / raw)
  To: 9fans

Well, I tried this on my partly-implemented Plan 9 emulator. I sandwiched Ureg+fpreg between argv and the TOS structure. So note handlers get a fixed Ureg address that happens to have the fpregs after it. noted(2) is altered to use the fixed address for Ureg. Plan 9 binaries don't have to change, and libthread shouldn't blow any stacks, whatever Intel does (or indeed ARM in this case).
The fpnote2 binary I compiled on plan 9 runs on my emulator, as does a test program that does floating point in a note handler.
(I have to admit I'm still wondering what on Earth people are doing with floating point in note handlers.)
One of these days I'll try to get set up to tinker with the Plan 9 kernel, but right now I don't want to risk breaking the installation on my little Raspberry Pi...
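
A rough picture of the layout described above, as a sketch only: the structure sizes are placeholders (the real Ureg, FP save area and Tos differ per architecture), the names Notesave and notesaveaddr are invented, the Tos is assumed to sit at the very top of the stack, and alignment is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder sizes; the real structures are per-architecture. */
typedef struct { unsigned char b[128]; } Ureg;
typedef struct { unsigned char b[512]; } Fpsave;
typedef struct { unsigned char b[64]; } Tos;

/* Ureg comes first, so a handler that only knows Ureg is unaffected. */
typedef struct {
	Ureg ureg;
	Fpsave fp;
} Notesave;

/*
 * Fixed address of the save block, given the address just past the
 * end of the user stack: Tos on top, the save block directly below
 * it, argv below that.
 */
uintptr_t
notesaveaddr(uintptr_t stacktop)
{
	assert(stacktop >= sizeof(Tos) + sizeof(Notesave));
	return stacktop - sizeof(Tos) - sizeof(Notesave);
}
```

Because the block lives at a fixed place rather than on whichever stack was active when the note arrived, small thread stacks never have to hold it.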
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-Me372a1c0bea9d981f7233b3d
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-15 16:14               ` Alyssa M via 9fans
@ 2025-03-15 21:33                 ` Charles Forsyth
  2025-03-15 21:34                   ` Charles Forsyth
  2025-03-22 19:40                 ` Russ Cox
  1 sibling, 1 reply; 21+ messages in thread
From: Charles Forsyth @ 2025-03-15 21:33 UTC (permalink / raw)
  To: 9fans

>
> (I have to admit I'm still wondering what on Earth people are doing with
> floating point in note handlers.)


i only ever used them to resume a process/coroutine/whatever. to be fair,
it's trickier since "fp" came to include integer vector instructions and
block moves

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M18d6aeb8dd012dd2a2d91c1a
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-15 21:33                 ` Charles Forsyth
@ 2025-03-15 21:34                   ` Charles Forsyth
  0 siblings, 0 replies; 21+ messages in thread
From: Charles Forsyth @ 2025-03-15 21:34 UTC (permalink / raw)
  To: 9fans

it's funny how old the problem of process trap handling is in computing.


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M2ad633cf79e97d0cad97e0b3
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-15 16:14               ` Alyssa M via 9fans
  2025-03-15 21:33                 ` Charles Forsyth
@ 2025-03-22 19:40                 ` Russ Cox
  2025-03-22 21:43                   ` Alyssa M via 9fans
  1 sibling, 1 reply; 21+ messages in thread
From: Russ Cox @ 2025-03-22 19:40 UTC (permalink / raw)
  To: 9fans

On Sat, Mar 15, 2025 at 12:58 PM Alyssa M via 9fans <9fans@9fans.net> wrote:

> (I have to admit I'm still wondering what on Earth people are doing with
> floating point in note handlers.)
>

Are you saying I shouldn't be writing my note handlers in Javascript?

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-M67870f1315b412e0ac50f162
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-22 19:40                 ` Russ Cox
@ 2025-03-22 21:43                   ` Alyssa M via 9fans
  0 siblings, 0 replies; 21+ messages in thread
From: Alyssa M via 9fans @ 2025-03-22 21:43 UTC (permalink / raw)
  To: 9fans

Lol! Well I'm sure someone will! Or will want to... :)
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-Md04eee609e574cc8fdae9845
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] 9k amd64 kernel and floating point
  2025-03-05 19:40     ` cinap_lenrek
@ 2025-04-07 18:41       ` ori
  0 siblings, 0 replies; 21+ messages in thread
From: ori @ 2025-04-07 18:41 UTC (permalink / raw)
  To: 9fans

To update those not closely watching commits: this got committed over the weekend.

Quoth cinap_lenrek@felloff.net:
> just for reference the work-in-progress implementation for this on 9front amd64 kernel:
> 
> https://felloff.net/usr/cinap_lenrek/pc64fpnote.patch
> 
> needs some work around the devproc access to the fpsave structs
> and maybe find an alternative to the #ifdef code... but otherwise
> the handler part was easy to implement.
> 
> and the fpnote.c testsuite (thanks for that btw :-)) output from my ryzen:
> 
> cpu% ./6.out
> # nop
> # float1
> # noteonly
> floatnote mynote
> # beforenote
> floatnote mynote
> # afternote
> floatnote mynote
> # beforeafter
> floatnote mynote
> # simul
> start counting
> 5e+13
> PASS
> postnote
> cpu% postnote done
> 
> --
> cinap

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Taf6b900592afc500-Mfef74e2bec47c170f0a6acb6
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

Thread overview: 21+ messages
2025-03-02 20:28 [9fans] 9k amd64 kernel and floating point Russ Cox
2025-03-02 21:48 ` [9fans] " i
2025-03-03  0:14   ` i
2025-03-03 10:52     ` Alyssa M via 9fans
2025-03-03 16:07 ` [9fans] " cinap_lenrek
2025-03-03 16:23   ` hahahahacker2009
2025-03-04 13:23     ` Alyssa M via 9fans
2025-03-05 13:41   ` Russ Cox
2025-03-05 17:38     ` cinap_lenrek
2025-03-07  1:08       ` Anthony Martin
2025-03-07 18:04         ` Russ Cox
2025-03-07 18:44         ` cinap_lenrek
2025-03-10 15:00           ` Russ Cox
2025-03-11  3:35             ` Alyssa M via 9fans
2025-03-15 16:14               ` Alyssa M via 9fans
2025-03-15 21:33                 ` Charles Forsyth
2025-03-15 21:34                   ` Charles Forsyth
2025-03-22 19:40                 ` Russ Cox
2025-03-22 21:43                   ` Alyssa M via 9fans
2025-03-05 19:40     ` cinap_lenrek
2025-04-07 18:41       ` ori
