9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] syscall silently kill processes
@ 2022-06-17  9:37 andrey100100100
  2022-06-17 13:46 ` Thaddeus Woskowiak
  0 siblings, 1 reply; 49+ messages in thread
From: andrey100100100 @ 2022-06-17  9:37 UTC (permalink / raw)
  To: 9fans

Hi all!

I see strange behavior of the 'read' syscall with an 'alarm' note in the
following simple program (the IP address and port do not matter):

-----------------------------------------------------------------------
#include <u.h>
#include <libc.h>
#include <thread.h>

static int
handler_alarm(void *, char *msg)
{
        if(strstr(msg, "alarm"))
                return 1;

        return 0;
}

static void
proc_udp(void *)
{
        char resp[512];
        char req[] = "request";
        int fd;

        threadnotify(handler_alarm, 1);

        if((fd = dial("udp!185.157.221.201!5678", nil, nil, nil)) >= 0){
                if(write(fd, req, strlen(req)) == strlen(req)){
                        fprint(1, "start\n");
                        alarm(2000);
                        read(fd, resp, sizeof(resp));
                        alarm(0);
                        fprint(1, "end\n");
                }
                close(fd);
        }

        threadexits(nil);
}

int mainstacksize = 5242880;

void
threadmain(int argc, char *argv[])
{
        for(int i = 0; i < 80; i++){
                proccreate(proc_udp, nil, 10240);
        }

        sleep(5000);
        threadexitsall(nil);
}
-----------------------------------------------------------------------


cpu% 6.out | grep end | wc -l
     33

sometimes a little more or less,
but

cpu% 6.out | grep start | wc -l
     80

always.

Tested on Miller's RPi and 9front (amd64 & RPi 2).


Why does read() kill the process?
Why not always?
Why is the number of 'ended' processes around 33?
Is this normal behavior?
How can I fix the program so that the processes are not lost?


Can someone point me in the right direction?


Thanks!
Andrey

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M6b9bfae581b00133c66b93c2
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] syscall silently kill processes
  2022-06-17  9:37 [9fans] syscall silently kill processes andrey100100100
@ 2022-06-17 13:46 ` Thaddeus Woskowiak
  2022-06-17 14:11   ` Jacob Moody
  0 siblings, 1 reply; 49+ messages in thread
From: Thaddeus Woskowiak @ 2022-06-17 13:46 UTC (permalink / raw)
  To: 9fans

I believe threadnotify() should be called from threadmain() to
properly register the handler in the rendez group.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M799a747eed5b007fc4d07fbe

* Re: [9fans] syscall silently kill processes
  2022-06-17 13:46 ` Thaddeus Woskowiak
@ 2022-06-17 14:11   ` Jacob Moody
  2022-06-17 14:39     ` Thaddeus Woskowiak
  2022-06-17 15:06     ` andrey100100100
  0 siblings, 2 replies; 49+ messages in thread
From: Jacob Moody @ 2022-06-17 14:11 UTC (permalink / raw)
  To: 9fans

On 6/17/22 07:46, Thaddeus Woskowiak wrote:
> I believe threadnotify() should be called from threadmain() to
> properly register the handler in the rendez group

This is incorrect, according to thread(2):

"The thread library depends on all procs
being in the same rendezvous group"

The issue here is that your note handler has to call noted;
you are returning from the handler without actually resuming the program.
You need to call either noted(NCONT) to resume execution or noted(NDFLT)
to stop execution.

An excerpt from notify(2):

"A notification handler must finish either by exiting the
program or by calling noted; if the handler returns the
behavior is undefined and probably erroneous."

So you are indeed observing undefined behavior.


Hope this helps,
moody

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mfced9ffce2a92c38458048ad

* Re: [9fans] syscall silently kill processes
  2022-06-17 14:11   ` Jacob Moody
@ 2022-06-17 14:39     ` Thaddeus Woskowiak
  2022-06-17 15:06     ` andrey100100100
  1 sibling, 0 replies; 49+ messages in thread
From: Thaddeus Woskowiak @ 2022-06-17 14:39 UTC (permalink / raw)
  To: 9fans

On Fri, Jun 17, 2022 at 10:13 AM Jacob Moody <moody@posixcafe.org> wrote:
>
> On 6/17/22 07:46, Thaddeus Woskowiak wrote:
> > I believe threadnotify() should be called from threadmain() to
> > properly register the handler in the rendez group
>
> This is incorrect, according to thread(2):
>
> "The thread library depends on all procs
> being in the same rendezvous group"
>

Doh! Thanks for the info moody.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M0d5aad458aafef3bcbf5c79c

* Re: [9fans] syscall silently kill processes
  2022-06-17 14:11   ` Jacob Moody
  2022-06-17 14:39     ` Thaddeus Woskowiak
@ 2022-06-17 15:06     ` andrey100100100
  2022-06-17 16:08       ` Skip Tavakkolian
  2022-06-17 16:11       ` Jacob Moody
  1 sibling, 2 replies; 49+ messages in thread
From: andrey100100100 @ 2022-06-17 15:06 UTC (permalink / raw)
  To: 9fans

On Fri, 17/06/2022 at 08:11 -0600, Jacob Moody wrote:
> On 6/17/22 07:46, Thaddeus Woskowiak wrote:
> > I believe threadnotify() should be called from threadmain() to
> > properly register the handler in the rendez group
> 
> This is incorrect, according to thread(2):
> 
> "The thread library depends on all procs
> being in the same rendezvous group"


From sleep(2):

    Alarm causes an alarm note (see notify(2)) to be sent to the
    invoking process after the number of milliseconds given by
    the argument.

That means the note is sent only to the invoking process, NOT to the process group.

> 
> The issue here is that your note handler has to call noted,
> you are returning from the handler without actually resuming the
> program.
> You either need to call noted(NCONT) to resume execution or
> noted(NDFLT)
> to stop execution.
> 
> An excerpt from notify(2):
> 
> "A notification handler must finish either by exiting the
> program or by calling noted; if the handler returns the
> behavior is undefined and probably erroneous."
> 
> So you are indeed observing undefined behavior.
> 

With:

------------------------------------
static int
handler_alarm(void *, char *msg)
{
        if(strstr(msg, "alarm")){
                noted(NCONT);
                return 1;
        }

        return 0;
}
------------------------------------

the result is the same:

cpu% 6.out | grep end | wc -l
     33


And noted(NCONT) may be needed when a process receives several (2 or more)
notes at once.

Maybe something is wrong with interrupting an incomplete system call?


> 
> Hope this helps,
> moody
> 


Regards,
Andrej





------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M4fa69df14eff60273727c92b

* Re: [9fans] syscall silently kill processes
  2022-06-17 15:06     ` andrey100100100
@ 2022-06-17 16:08       ` Skip Tavakkolian
  2022-06-17 16:11         ` Skip Tavakkolian
  2022-06-17 16:11       ` Jacob Moody
  1 sibling, 1 reply; 49+ messages in thread
From: Skip Tavakkolian @ 2022-06-17 16:08 UTC (permalink / raw)
  To: 9fans

interesting catch. it seems to be a tunable limit.

% grep NFN /sys/src/libthread/note.c
#define NFN 33
static int (*onnote[NFN])(void*, char*);
static int onnotepid[NFN];
for(i=0; i<NFN; i++)
return i<NFN;
for(i=0; i<NFN; i++){
if(i==NFN){

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mb2e0af1ac4067b7f4649d000

* Re: [9fans] syscall silently kill processes
  2022-06-17 15:06     ` andrey100100100
  2022-06-17 16:08       ` Skip Tavakkolian
@ 2022-06-17 16:11       ` Jacob Moody
  2022-06-17 18:48         ` andrey100100100
  1 sibling, 1 reply; 49+ messages in thread
From: Jacob Moody @ 2022-06-17 16:11 UTC (permalink / raw)
  To: 9fans

On 6/17/22 09:06, andrey100100100@gmail.com wrote:
> On Fri, 17/06/2022 at 08:11 -0600, Jacob Moody wrote:
>> On 6/17/22 07:46, Thaddeus Woskowiak wrote:
>>> I believe threadnotify() should be called from threadmain() to
>>> properly register the handler in the rendez group
>>
>> This is incorrect, according to thread(2):
>>
>> "The thread library depends on all procs
>> being in the same rendezvous group"
> 
> 
> From sleep(2):
> 
>     Alarm causes an alarm note (see notify(2)) to be sent to the
>     invoking process after the number of milliseconds given by
>     the argument.
> 
> Mean to be sent only to the invoking process, NOT to the process group.

Yes, this is correct; if I implied otherwise, I apologize. My point in
quoting that excerpt was that groups likely had nothing to do with this.

>>
>> The issue here is that your note handler has to call noted,
>> you are returning from the handler without actually resuming the
>> program.
>> You either need to call noted(NCONT) to resume execution or
>> noted(NDFLT)
>> to stop execution.
>>
>> An excerpt from notify(2):
>>
>> "A notification handler must finish either by exiting the
>> program or by calling noted; if the handler returns the
>> behavior is undefined and probably erroneous."
>>
>> So you are indeed observing undefined behavior.
>>
> 
> With:
> 
> ------------------------------------
> static int
> handler_alarm(void *, char *msg)
> {
>         if(strstr(msg, "alarm")){
>                 noted(NCONT);
>                 return 1;
>         }
> 
>         return 0;
> }
> ------------------------------------
> result the same:
> 
> cpu% 6.out | grep end | wc -l
>      33
> 
> 
> And noted(NCONT) may be needed, when process recieved many (2 and more)
> notes at once.
> 
> May be something wrong  with interrupted an incomplete  system call?

You _always_ should call either noted(NCONT) or noted(NDFLT).
But you are correct in that this wasn't the exact issue. I poked
around with the code a bit. I rewrote it to just use
fork(), and I got all 80 "end" messages. So I suspected
libthread had some arbitrary limit:

#define NFN             33
#define ERRLEN  48
typedef struct Note Note;
struct Note
{
        Lock            inuse;
        Proc            *proc;          /* recipient */
        char            s[ERRMAX];      /* arg2 */
};

static Note     notes[128];
static Note     *enotes = notes+nelem(notes);
static int              (*onnote[NFN])(void*, char*);
static int              onnotepid[NFN];
static Lock     onnotelock;

int
threadnotify(int (*f)(void*, char*), int in)
{
        int i, topid;
        int (*from)(void*, char*), (*to)(void*, char*);

        if(in){
                from = nil;
                to = f;
                topid = _threadgetproc()->pid;
        }else{
                from = f;
                to = nil;
                topid = 0;
        }
        lock(&onnotelock);
        for(i=0; i<NFN; i++)
                if(onnote[i]==from){
                        onnote[i] = to;
                        onnotepid[i] = topid;
                        break;
                }
        unlock(&onnotelock);
        return i<NFN;
}

That

#define NFN 33

seems like the culprit. Looks like if you checked
the return value of threadnotify you would have seen
your notes handler was not registered.

Now as to why this limit is so low, I am not sure. Perhaps
it should be bumped up.
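
The registration loop quoted above can be modeled outside Plan 9. What follows is a minimal sketch in portable C, not the libthread source: Table, model_notify, sample_handler and register_many are names made up for the illustration, and only the NFN value and the slot-swapping loop follow the excerpt. It suggests why 80 procs sharing one table would get only 33 successful registrations, and that the failures are visible in the return value.

```c
#include <stddef.h>

#define NFN 33  /* same value as in the quoted note.c */

typedef int (*Handler)(void*, char*);

/* hypothetical container for the handler slots */
typedef struct Table Table;
struct Table
{
        Handler onnote[NFN];
};

/*
 * Hypothetical stand-in for threadnotify: with in != 0 it stores f in
 * the first free slot, otherwise it clears the first slot holding f.
 * Returns 1 on success and 0 when no matching slot exists.
 */
int
model_notify(Table *t, Handler f, int in)
{
        int i;
        Handler from, to;

        if(in){
                from = NULL;
                to = f;
        }else{
                from = f;
                to = NULL;
        }
        for(i = 0; i < NFN; i++)
                if(t->onnote[i] == from){
                        t->onnote[i] = to;
                        break;
                }
        return i < NFN;
}

/* sample handler with the same shape as handler_alarm */
int
sample_handler(void *v, char *msg)
{
        (void)v;
        return msg != NULL;
}

/* register the handler n times in one fresh table and count the successes */
int
register_many(int n)
{
        Table t = { { NULL } };
        int i, ok;

        ok = 0;
        for(i = 0; i < n; i++)
                ok += model_notify(&t, sample_handler, 1);
        return ok;
}
```

In this model register_many(80) counts 33 successes; the remaining 47 calls return 0, which is the condition the original program would have caught by testing the result of threadnotify.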


Thanks,
moody




------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M687ef3adb4df6c21a188e7e1

* Re: [9fans] syscall silently kill processes
  2022-06-17 16:08       ` Skip Tavakkolian
@ 2022-06-17 16:11         ` Skip Tavakkolian
  2022-06-17 16:16           ` Skip Tavakkolian
  0 siblings, 1 reply; 49+ messages in thread
From: Skip Tavakkolian @ 2022-06-17 16:11 UTC (permalink / raw)
  To: 9fans

it's worth grepping for persistent magic constants:

% grep 33 /sys/src/libthread/*.[ch]
/sys/src/libthread/note.c:#define NFN 33

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M2b6a9ca6ba8b315c113a43e9

* Re: [9fans] syscall silently kill processes
  2022-06-17 16:11         ` Skip Tavakkolian
@ 2022-06-17 16:16           ` Skip Tavakkolian
  2022-06-17 17:42             ` adr
  0 siblings, 1 reply; 49+ messages in thread
From: Skip Tavakkolian @ 2022-06-17 16:16 UTC (permalink / raw)
  To: 9fans

Thanks to Douglas Adams, I think '42' might be a more obvious magic
number for a clue:

% 8c udpflood.c && 8l -o udpflood udpflood.8 && ./udpflood | grep end | wc -l
     42
% grep 42 /sys/src/libthread/note.c
#define NFN 42

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M5b827bf9eba38f893c1f67bb

* Re: [9fans] syscall silently kill processes
  2022-06-17 16:16           ` Skip Tavakkolian
@ 2022-06-17 17:42             ` adr
  0 siblings, 0 replies; 49+ messages in thread
From: adr @ 2022-06-17 17:42 UTC (permalink / raw)
  To: 9fans

On Fri, 17 Jun 2022, Skip Tavakkolian wrote:
> Thanks to Douglas Adams, I think '42' might be a more obvious magic
> number for a clue:
>
> % 8c udpflood.c && 8l -o udpflood udpflood.8 && ./udpflood | grep end | wc -l
>     42
> % grep 42 /sys/src/libthread/note.c
> #define NFN 42

I don't understand: why does it work when the read statement is
commented out? Why doesn't it work even with all the notification
handling removed?  Maybe I have it all wrong, but I think there is more to
this.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M346082debc6a2d5b01267879

* Re: [9fans] syscall silently kill processes
  2022-06-17 16:11       ` Jacob Moody
@ 2022-06-17 18:48         ` andrey100100100
  2022-06-17 19:28           ` Jacob Moody
  2022-06-17 21:15           ` adr
  0 siblings, 2 replies; 49+ messages in thread
From: andrey100100100 @ 2022-06-17 18:48 UTC (permalink / raw)
  To: 9fans

On Fri, 17/06/2022 at 10:11 -0600, Jacob Moody wrote:
> On 6/17/22 09:06, andrey100100100@gmail.com wrote:
> > On Fri, 17/06/2022 at 08:11 -0600, Jacob Moody wrote:
> > > On 6/17/22 07:46, Thaddeus Woskowiak wrote:
> > > > I believe threadnotify() should be called from threadmain() to
> > > > properly register the handler in the rendez group
> > > 
> > > This is incorrect, according to thread(2):
> > > 
> > > "The thread library depends on all procs
> > > being in the same rendezvous group"
> > 
> > 
> > From sleep(2):
> > 
> >     Alarm causes an alarm note (see notify(2)) to be sent to the
> >     invoking process after the number of milliseconds given by
> >     the argument.
> > 
> > Mean to be sent only to the invoking process, NOT to the process
> > group.
> 
> Yes this is correct, If I implied otherwise I apologize. My point
> with
> pointing out that excerpt is that groups likely had nothing to do
> with this.
> 
> > > 
> > > The issue here is that your note handler has to call noted,
> > > you are returning from the handler without actually resuming the
> > > program.
> > > You either need to call noted(NCONT) to resume execution or
> > > noted(NDFLT)
> > > to stop execution.
> > > 
> > > An excerpt from notify(2):
> > > 
> > > "A notification handler must finish either by exiting the
> > > program or by calling noted; if the handler returns the
> > > behavior is undefined and probably erroneous."
> > > 
> > > So you are indeed observing undefined behavior.
> > > 
> > 
> > With:
> > 
> > ------------------------------------
> > static int
> > handler_alarm(void *, char *msg)
> > {
> >         if(strstr(msg, "alarm")){
> >                 noted(NCONT);
> >                 return 1;
> >         }
> > 
> >         return 0;
> > }
> > ------------------------------------
> > result the same:
> > 
> > cpu% 6.out | grep end | wc -l
> >      33
> > 
> > 
> > And noted(NCONT) may be needed, when process recieved many (2 and
> > more)
> > notes at once.
> > 
> > May be something wrong  with interrupted an incomplete  system
> > call?
> 
> You _always_ should call either noted(NCONT) or noted(NDFLT).

But from atnotify(2) (section 'Atnotify'):

                                                  When the system
          posts a note to the process, each handler registered with
          atnotify is called with arguments as described above until
          one of the handlers returns non-zero.  Then noted is called
          with argument NCONT.  If no registered function returns
          non-zero, atnotify calls noted with argument NDFLT.

from /sys/src/libc/port/atnotify.c :

--------------------------------
static
void
notifier(void *v, char *s)
{
        int i;

        for(i=0; i<NFN; i++)
                if(onnot[i] && ((*onnot[i])(v, s))){
                        noted(NCONT);
                        return;
                }
        noted(NDFLT);
}
--------------------------------

So it seems that a noted() call is not needed in user code.
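
The dispatch rule in the manual excerpt and in notifier() can be tried out in portable C. This is only a sketch under assumptions: MODEL_NCONT and MODEL_NDFLT are stand-ins for the effect of noted(NCONT) and noted(NDFLT), and model_dispatch, dispatch_registered and dispatch_unregistered are hypothetical names, not libc functions. It shows that the library chooses between the two outcomes itself, and that a proc without a registered handler ends up on the NDFLT path.

```c
#include <stddef.h>
#include <string.h>

#define NFN 33  /* table size, as in the quoted atnotify.c */

/* stand-ins for what noted(NCONT) and noted(NDFLT) would do */
enum { MODEL_NCONT = 0, MODEL_NDFLT = 1 };

typedef int (*Handler)(void*, char*);

/* same test as handler_alarm in the thread */
int
alarm_handler(void *v, char *msg)
{
        (void)v;
        return strstr(msg, "alarm") != NULL;
}

/* walk the table like the quoted notifier() and report its choice */
int
model_dispatch(Handler *tab, char *msg)
{
        int i;

        for(i = 0; i < NFN; i++)
                if(tab[i] && (*tab[i])(NULL, msg))
                        return MODEL_NCONT;
        return MODEL_NDFLT;
}

/* a proc whose handler made it into the table */
int
dispatch_registered(char *msg)
{
        Handler tab[NFN] = { NULL };

        tab[0] = alarm_handler;
        return model_dispatch(tab, msg);
}

/* a proc whose registration failed, so the table holds nothing for it */
int
dispatch_unregistered(char *msg)
{
        Handler tab[NFN] = { NULL };

        return model_dispatch(tab, msg);
}
```

With the handler present, an "alarm" note yields MODEL_NCONT; without it, the same note yields MODEL_NDFLT, which in the real system corresponds to the default action that stops the process.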



> But you are correct in that this wasn't the exact issue. I poked
> around with the code a bit. I rewrote it to just use
> fork(), and I got all 80 "end" messages. 

Yes, with fork() it works:

------------------------------------------------------------------
#include <u.h>
#include <libc.h>

static int
handler_alarm(void *, char *msg)
{
        if(strstr(msg, "alarm"))
                return 1;

        return 0;
}

static void
proc_udp(void *)
{
        char resp[512];
        char req[] = "request";
        int fd;

        atnotify(handler_alarm, 1);

        if((fd = dial("udp!185.157.221.201!5678", nil, nil, nil)) >= 0){
                if(write(fd, req, strlen(req)) == strlen(req)){
                        fprint(1, "start\n");
                        alarm(2000);
                        read(fd, resp, sizeof(resp));
                        alarm(0);
                        fprint(1, "end\n");
                }
                close(fd);
        }

}

void
main(int argc, char *argv[])
{
        for(int i = 0; i < 80; i++){
                switch(fork()){
                case -1:
                        sysfatal("fork: %r");
                case 0:
                        proc_udp(nil);
                        exits(nil);
                }
        }

        sleep(5000);
        exits(nil);
}
------------------------------------------------------------------

cpu% 6.out | grep end | wc -l
     80

But with rfork(RFPROC|RFMEM|RFNOWAIT) (the same as in proccreate)
it does not:

cpu% 6.out | grep end | wc -l
      6

strange...

> So I suspected
> libthread had some arbitrary limit:
> 
> #define NFN             33
> #define ERRLEN  48
> typedef struct Note Note;
> struct Note
> {
>         Lock            inuse;
>         Proc            *proc;          /* recipient */
>         char            s[ERRMAX];      /* arg2 */
> };
> 
> static Note     notes[128];
> static Note     *enotes = notes+nelem(notes);
> static int              (*onnote[NFN])(void*, char*);
> static int              onnotepid[NFN];
> static Lock     onnotelock;
> 
> int
> threadnotify(int (*f)(void*, char*), int in)
> {
>         int i, topid;
>         int (*from)(void*, char*), (*to)(void*, char*);
> 
>         if(in){
>                 from = nil;
>                 to = f;
>                 topid = _threadgetproc()->pid;
>         }else{
>                 from = f;
>                 to = nil;
>                 topid = 0;
>         }
>         lock(&onnotelock);
>         for(i=0; i<NFN; i++)
>                 if(onnote[i]==from){
>                         onnote[i] = to;
>                         onnotepid[i] = topid;
>                         break;
>                 }
>         unlock(&onnotelock);
>         return i<NFN;
> }
> 
> That
> 
> #define NFN 33
> 
> seems like the culprit. Looks like if you checked
> the return value of threadnotify you would have seen
> your notes handler was not registered.

A very important note about the return value of threadnotify.
Thanks. My mistake.
That is why it seemed to me that the processes were dying silently.

> 
> Now as to why this limit is so low, I am not sure. Perhaps
> it should be bumped up.
> 

Funny, /sys/src/libc/port/atnotify.c:4 has the same limit:

#define NFN     33


But when fork() is used, this limit has no effect.
Maybe because RFMEM is not set, so there can be 33 handlers per child.
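
That guess can be illustrated with a small model in portable C. It is a sketch under assumptions, not a reproduction of rfork: copying a struct stands in for a child that gets its own data segment after fork(), one shared struct stands in for procs created with RFMEM, and Table, add_handler, count_shared and count_private are hypothetical names. It does not try to explain the count of 6 seen with plain rfork above.

```c
#include <stddef.h>

#define NFN 33  /* same limit as in atnotify.c and note.c */

typedef int (*Handler)(void*, char*);

typedef struct Table Table;
struct Table
{
        Handler slot[NFN];
};

/* sample handler, only its address matters here */
int
sample_handler(void *v, char *msg)
{
        (void)v;
        return msg != NULL;
}

/* put f in the first free slot; return 1 on success, 0 if the table is full */
int
add_handler(Table *t, Handler f)
{
        int i;

        for(i = 0; i < NFN; i++)
                if(t->slot[i] == NULL){
                        t->slot[i] = f;
                        return 1;
                }
        return 0;
}

/* RFMEM-like case: every proc registers in the same table */
int
count_shared(int nproc)
{
        Table shared = { { NULL } };
        int i, ok;

        ok = 0;
        for(i = 0; i < nproc; i++)
                ok += add_handler(&shared, sample_handler);
        return ok;
}

/* fork-like case: every child registers in its own copy of the parent's table */
int
count_private(int nproc)
{
        Table parent = { { NULL } };
        Table child;
        int i, ok;

        ok = 0;
        for(i = 0; i < nproc; i++){
                child = parent;         /* the child's private copy */
                ok += add_handler(&child, sample_handler);
        }
        return ok;
}
```

In the model, 80 procs sharing one table register 33 handlers, while 80 children with private copies register all 80, which matches the counts reported for the libthread and fork() versions.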

> 
> Thanks,
> moody
> 
> 


I did not find any way to interrupt a stalled system call
in a program other than sending a signal to the process.

Maybe there is a better way...


Many thanks to Jacob Moody and Skip Tavakkolian for showing me the light.


Regards,
Andrej



------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Md650ba9f9fcfad846fda95d8

* Re: [9fans] syscall silently kill processes
  2022-06-17 18:48         ` andrey100100100
@ 2022-06-17 19:28           ` Jacob Moody
  2022-06-17 21:15           ` adr
  1 sibling, 0 replies; 49+ messages in thread
From: Jacob Moody @ 2022-06-17 19:28 UTC (permalink / raw)
  To: 9fans

On 6/17/22 12:48, andrey100100100@gmail.com wrote:
> On Fri, 17/06/2022 at 10:11 -0600, Jacob Moody wrote:
>> On 6/17/22 09:06, andrey100100100@gmail.com wrote:
>>> On Fri, 17/06/2022 at 08:11 -0600, Jacob Moody wrote:
>>>> On 6/17/22 07:46, Thaddeus Woskowiak wrote:
>>>>> I believe threadnotify() should be called from threadmain() to
>>>>> properly register the handler in the rendez group
>>>>
>>>> This is incorrect, according to thread(2):
>>>>
>>>> "The thread library depends on all procs
>>>> being in the same rendezvous group"
>>>
>>>
>>> From sleep(2):
>>>
>>>     Alarm causes an alarm note (see notify(2)) to be sent to the
>>>     invoking process after the number of milliseconds given by
>>>     the argument.
>>>
>>> Mean to be sent only to the invoking process, NOT to the process
>>> group.
>>
>> Yes this is correct, If I implied otherwise I apologize. My point
>> with
>> pointing out that excerpt is that groups likely had nothing to do
>> with this.
>>
>>>>
>>>> The issue here is that your note handler has to call noted,
>>>> you are returning from the handler without actually resuming the
>>>> program.
>>>> You either need to call noted(NCONT) to resume execution or
>>>> noted(NDFLT)
>>>> to stop execution.
>>>>
>>>> An excerpt from notify(2):
>>>>
>>>> "A notification handler must finish either by exiting the
>>>> program or by calling noted; if the handler returns the
>>>> behavior is undefined and probably erroneous."
>>>>
>>>> So you are indeed observing undefined behavior.
>>>>
>>>
>>> With:
>>>
>>> ------------------------------------
>>> static int
>>> handler_alarm(void *, char *msg)
>>> {
>>>         if(strstr(msg, "alarm")){
>>>                 noted(NCONT);
>>>                 return 1;
>>>         }
>>>
>>>         return 0;
>>> }
>>> ------------------------------------
>>> result the same:
>>>
>>> cpu% 6.out | grep end | wc -l
>>>      33
>>>
>>>
>>> And noted(NCONT) may be needed, when process recieved many (2 and
>>> more)
>>> notes at once.
>>>
>>> May be something wrong  with interrupted an incomplete  system
>>> call?
>>
>> You _always_ should call either noted(NCONT) or noted(NDFLT).
> 
> But from atnotify(2) (section 'Atnotify'):
> 
>                                                   When the system
>           posts a note to the process, each handler registered with
>           atnotify is called with arguments as described above until
>           one of the handlers returns non-zero.  Then noted is called
>           with argument NCONT.  If no registered function returns
>           non-zero, atnotify calls noted with argument NDFLT.
> 
> from /sys/src/libc/port/atnotify.c :
> 
> --------------------------------
> static
> void
> notifier(void *v, char *s)
> {
>         int i;
> 
>         for(i=0; i<NFN; i++)
>                 if(onnot[i] && ((*onnot[i])(v, s))){
>                         noted(NCONT);
>                         return;
>                 }
>         noted(NDFLT);
> }
> --------------------------------
> 
> Seems like noted() call not needed in user code.
> 

Oh look at that, my apologies. That's the whole difference
between just notify() and atnotify(), how did I miss that.

>> But you are correct in that this wasn't the exact issue. I poked
>> around with the code a bit. I rewrote it to just use
>> fork(), and I got all 80 "end" messages. 
> 
> Yes, with fork() is working:
> 
> ------------------------------------------------------------------
> #include <u.h>
> #include <libc.h>
> 
> static int
> handler_alarm(void *, char *msg)
> {
>         if(strstr(msg, "alarm"))
>                 return 1;
> 
>         return 0;
> }
> 
> static void
> proc_udp(void *)
> {
>         char resp[512];
>         char req[] = "request";
>         int fd;
> 
>         atnotify(handler_alarm, 1);
> 
>         if((fd = dial("udp!185.157.221.201!5678", nil, nil, nil)) >= 0){
>                 if(write(fd, req, strlen(req)) == strlen(req)){
>                         fprint(1, "start\n");
>                         alarm(2000);
>                         read(fd, resp, sizeof(resp));
>                         alarm(0);
>                         fprint(1, "end\n");
>                 }
>                 close(fd);
>         }
> 
> }
> 
> void
> main(int argc, char *argv[])
> {
>         for(int i = 0; i < 80; i++){
>                 switch(fork()){
>                 case -1:
>                         sysfatal("fork: %r");
>                 case 0:
>                         proc_udp(nil);
>                         exits(nil);
>                 }
>         }
> 
>         sleep(5000);
>         exits(nil);
> }
> ------------------------------------------------------------------
> 
> cpu% 6.out | grep end | wc -l
>      80
> 
> But with rfork(RFPROC|RFMEM|RFNOWAIT) (the same, how in proccreate)
> not:
> 
> cpu% 6.out | grep end | wc -l
>       6
> 
> strange...
> 
>> So I suspected
>> libthread had some arbitrary limit:
>>
>> #define NFN             33
>> #define ERRLEN  48
>> typedef struct Note Note;
>> struct Note
>> {
>>         Lock            inuse;
>>         Proc            *proc;          /* recipient */
>>         char            s[ERRMAX];      /* arg2 */
>> };
>>
>> static Note     notes[128];
>> static Note     *enotes = notes+nelem(notes);
>> static int              (*onnote[NFN])(void*, char*);
>> static int              onnotepid[NFN];
>> static Lock     onnotelock;
>>
>> int
>> threadnotify(int (*f)(void*, char*), int in)
>> {
>>         int i, topid;
>>         int (*from)(void*, char*), (*to)(void*, char*);
>>
>>         if(in){
>>                 from = nil;
>>                 to = f;
>>                 topid = _threadgetproc()->pid;
>>         }else{
>>                 from = f;
>>                 to = nil;
>>                 topid = 0;
>>         }
>>         lock(&onnotelock);
>>         for(i=0; i<NFN; i++)
>>                 if(onnote[i]==from){
>>                         onnote[i] = to;
>>                         onnotepid[i] = topid;
>>                         break;
>>                 }
>>         unlock(&onnotelock);
>>         return i<NFN;
>> }
>>
>> That
>>
>> #define NFN 33
>>
>> seems like the culprit. Looks like if you checked
>> the return value of threadnotify you would have seen
>> your notes handler was not registered.
> 
> Very important note about the return value of threadnotify.
> Thanks. My mistake.
> So it seemed to me that the processes silently fall.
> 
>>
>> Now as to why this limit is so low, I am not sure. Perhaps
>> it should be bumped up.
>>
> 
> Funny, in /sys/src/libc/port/atnotify.c:4 the same limit:
> 
> #define NFN     33
> 
> 
> But when fork() is used this limit has no effect.
> Maybe because RFMEM is not set, so there can be 33 handlers per child.

This is exactly it, when you fork with RFMEM you share that global function
array. Thus all of your children are competing for one of those 33 slots.
Without it, it is exactly as you say, each child gets its own array.

Thanks for doing some digging in to the code in libc. I would have completely missed that.
Either way it makes sense to me that these limits should be bumped.
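
To make the slot contention concrete, here is a small portable C model.
It is only a sketch and not the libthread source: the names reg_handler
and count_registered are made up for illustration, plain ints stand in
for real process ids, and there is no locking because the model runs in
a single thread. The one global table plays the role of the memory that
all procs share under RFMEM.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { NFN = 33 };      /* same slot count as the NFN quoted above */

typedef int (*Handler)(void*, char*);

/* One table for the whole program, standing in for memory shared via RFMEM. */
Handler onnote[NFN];
int onnotepid[NFN];

/* Claim the first free slot for pid; return 1 on success, 0 if the table is full. */
int
reg_handler(Handler f, int pid)
{
        int i;

        assert(f != NULL);
        for(i = 0; i < NFN; i++)
                if(onnote[i] == NULL){
                        onnote[i] = f;
                        onnotepid[i] = pid;
                        return 1;
                }
        return 0;
}

/* Register the same handler once for each of nprocs simulated procs. */
int
count_registered(Handler f, int nprocs)
{
        int pid, ok;

        ok = 0;
        for(pid = 1; pid <= nprocs; pid++)
                ok += reg_handler(f, pid);
        return ok;
}

/* Same shape as the handler in the test program. */
int
handler_alarm(void *v, char *msg)
{
        (void)v;
        return strstr(msg, "alarm") != NULL;
}
```

In this model count_registered(handler_alarm, 80) on a fresh table comes
back as 33, and every later reg_handler() call returns 0. After a plain
fork() each child would instead start with its own copy of the table.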

>>
>> Thanks,
>> moody
>>
>>
> 
> 
> I did not find other way to interrupt the stalled system call
> in a program, other than to send a signal to the process.
> 
> Maybe there is a better way...

You are correct in thinking alarm() is how you'd interrupt a system
call. That is to say, I am not aware of a better way of doing it right now.

But I think you're correct in wanting something a bit better.
I've never looked at code using alarm lovingly. Perhaps it's
time to kick some ideas around for a better way of signaling
io timeouts other than alarm(). To me, it makes sense to tie
them to the fd itself rather than the process as a whole.
Perhaps through the /fd/*ctl interface?

> 
> Many thanks Jacob Moody and Skip Tavakkolian for showing me the light.

Thank you for using the system and taking some time to report and dig
in to bugs :D

I also wanted to tell you that 9front does have its own mailing
list for bugs and patches. If you are actively working with the
system you might find the discussions there useful.

http://lists.9front.org/


Thanks,
moody

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M5173390655895122a91157bf

* Re: [9fans] syscall silently kill processes
  2022-06-17 18:48         ` andrey100100100
  2022-06-17 19:28           ` Jacob Moody
@ 2022-06-17 21:15           ` adr
  2022-06-18  6:40             ` andrey100100100
  1 sibling, 1 reply; 49+ messages in thread
From: adr @ 2022-06-17 21:15 UTC (permalink / raw)
  To: 9fans

On Fri, 17 Jun 2022, andrey100100100@gmail.com wrote:
> Seems like noted() call not needed in user code.

noted() is only needed when using the notify syscall directly. When
using atnotify() (or threadnotify()) you don't need it, as notify(2)
says and as you did correctly in your first example. threadnotify()
doesn't kill your process if there is no free space in onnote[] and
onnotepid[]; the handler is simply not registered, that's all. alarm()
should send the note to the process, and the first handler registered
for the note "alarm" should be executed. Your handler checked for
the note and returned non-zero, so the process should continue. When
read is interrupted, it should return an error; the process should
not be killed. Here is the issue. Comment out the read statement and
there will be the same number of "end"s as "start"s.

Note that you could register the handler in threadmain and avoid
this issue completely, but as I said before, something seems wrong
to me here.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Madcf140195c52ad821869376

* Re: [9fans] syscall silently kill processes
  2022-06-17 21:15           ` adr
@ 2022-06-18  6:40             ` andrey100100100
  2022-06-18  8:37               ` adr
  0 siblings, 1 reply; 49+ messages in thread
From: andrey100100100 @ 2022-06-18  6:40 UTC (permalink / raw)
  To: 9fans

On Fri, 17/06/2022 at 21:15 +0000, adr wrote:
> On Fri, 17 Jun 2022, andrey100100100@gmail.com wrote:
> > Seems like noted() call not needed in user code.
> 
> noted() is only needed when using the syscall notify, when using
> atnotify() (or threadnotify) you don't need it, as it is said in
> notify(2) and you did correctly in your first example. threadnotify
> doesn't kill your process if there is no space free in onnote[],
> onnotepid[], the handler is not registered, that's all. alarm()
> should send the note to the process and the first handler registered
> with the note "alarm" should be executed. Your handler checked for
> the note and returned non zero, the process should continue. When
> read is interrupted, it should return an error, the process should
> not be killed. Here is the issue. Comment the read statement and
> there will be the same number of "end"s as "start"s.
> 

A clearer example, which demonstrates the crux of the problem:

---------------------------------------------
#include <u.h>
#include <libc.h>
#include <thread.h>

static int
handler_alarm(void *, char *msg)
{
        if(strstr(msg, "alarm")){
                return 1;
        }

        return 0;
}

static void
proc_func(void *)
{
        if(threadnotify(handler_alarm, 1) == 0){
                fprint(1, "handler not registered\n");
        }

        alarm(2000);
        fprint(1, "start\n");
        sleep(10000);
        fprint(1, "end\n");
        alarm(0);

        threadexits(nil);
}

int mainstacksize = 5242880;

void
threadmain(int argc, char *argv[])
{
        for(int i = 0; i < 80; i++){
                proccreate(proc_func, nil, 10240);
        }

        sleep(5000);
        threadexitsall(nil);
}
---------------------------------------------

cpu% 6.out | grep end | wc -l
     33


The problem is in the unregistered handlers.



> Note that you could register the handler in threadmain and avoid
> completely this issue, but as I said before, something seems wrong
> to me here.
> 

I don't understand how a handler in threadmain would solve the problem.
I need 'alarm' on a per-process basis.






Regards,
Andrej

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M2bf5df4e0184bdff80c9eee9

* Re: [9fans] syscall silently kill processes
  2022-06-18  6:40             ` andrey100100100
@ 2022-06-18  8:37               ` adr
  2022-06-18  9:22                 ` adr
  2022-06-18 16:57                 ` andrey100100100
  0 siblings, 2 replies; 49+ messages in thread
From: adr @ 2022-06-18  8:37 UTC (permalink / raw)
  To: 9fans

On Sat, 18 Jun 2022, andrey100100100@gmail.com wrote:

> ---------------------------------------------
>
> cpu% 6.out | grep end | wc -l
>     33
>
>
> Problem in unregistered handlers.

But unregistered handlers shouldn't be a problem. The process is
being killed when alarm sends the note. That's why the code worked
after removing the read statement: the alarm is cancelled and the note
is not sent before the process ends. I just don't see why the process
is being killed. The documentation describes different behavior. To
me it smells like bug barbecue (corrupted onnote?). Maybe I got
something wrong, bear with me.

>> Note that you could register the handler in threadmain and avoid
>> completely this issue, but as I said before, something seems wrong
>> to me here.
>
> I'm don't understand how handler in threadmain would solve the problem.
> I need in 'alarm' on per process basis.

You need alarm() in every process, but you don't need to register the
same handler 80 times!

adr.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M62a1f2e8578fcd812c35b4b5

* Re: [9fans] syscall silently kill processes
  2022-06-18  8:37               ` adr
@ 2022-06-18  9:22                 ` adr
  2022-06-18 12:53                   ` Jacob Moody
  2022-06-18 22:22                   ` andrey100100100
  2022-06-18 16:57                 ` andrey100100100
  1 sibling, 2 replies; 49+ messages in thread
From: adr @ 2022-06-18  9:22 UTC (permalink / raw)
  To: 9fans

On Sat, 18 Jun 2022, adr wrote:

> On Sat, 18 Jun 2022, andrey100100100@gmail.com wrote:
>
>> ---------------------------------------------
>> 
>> cpu% 6.out | grep end | wc -l
>>     33
>> 
>> 
>> Problem in unregistered handlers.
>
> But unregistered handlers shouldn't be a problem. The process is
> been killed when alarm sends the note. That's why the code worked
> removing the read statement, the alarm is set off and the note is
> not sent before the process ends. I just don't see why the process
> is been killed. The documentation describes another behavior. To
> me it smells like bug barbecue (corrupted onnote?). Maybe I got
> something wrong, bear with me.
>
>>> Note that you could register the handler in threadmain and avoid
>>> completely this issue, but as I said before, something seems wrong
>>> to me here.
>> 
>> I'm don't understand how handler in threadmain would solve the problem.
>> I need in 'alarm' on per process basis.
>
> You need alarm() in every process, but you don't need to register the
> same handler 80 times!
>
> adr.

I think there is some confusion here, so I'll explain myself a
little more.

Let's change your last example to not use libthread:

#include <u.h>
#include <libc.h>

int
handler_alarm(void *, char *msg)
{
         if(strstr(msg, "alarm")){
                 return 1;
         }

         return 0;
}

int
test(void)
{
         if(atnotify(handler_alarm, 1) == 0){
                 fprint(1, "handler not registered\n");
         }

         alarm(10);
         fprint(1, "start\n");
         sleep(40);
         fprint(1, "end\n");
         alarm(0);

         return 0;
}

void
main()
{
         for(int i = 0; i < 80; i++){
                 test();
         }

         exits(nil);
}

You see, after the NFNth iteration of test(), onnot[NFN] in atnotify
will be full and the handlers won't be registered, but the code will
work without any problem. It doesn't matter: the first handler in
onnot[] will be executed. In fact you only need one handler there, not
80; you should move atnotify to main.

The same should be happening with libthread. Am I really the only
one smelling a bug here?

adr.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M47714addb2d648e020737917

* Re: [9fans] syscall silently kill processes
  2022-06-18  9:22                 ` adr
@ 2022-06-18 12:53                   ` Jacob Moody
  2022-06-18 22:03                     ` andrey100100100
  2022-06-19  5:54                     ` adr
  2022-06-18 22:22                   ` andrey100100100
  1 sibling, 2 replies; 49+ messages in thread
From: Jacob Moody @ 2022-06-18 12:53 UTC (permalink / raw)
  To: 9fans

On 6/18/22 03:22, adr wrote:
> On Sat, 18 Jun 2022, adr wrote:
> 
>> On Sat, 18 Jun 2022, andrey100100100@gmail.com wrote:
>>
>>> ---------------------------------------------
>>>
>>> cpu% 6.out | grep end | wc -l
>>>     33
>>>
>>>
>>> Problem in unregistered handlers.
>>
>> But unregistered handlers shouldn't be a problem. The process is
>> been killed when alarm sends the note. That's why the code worked
>> removing the read statement, the alarm is set off and the note is
>> not sent before the process ends. I just don't see why the process
>> is been killed. The documentation describes another behavior. To
>> me it smells like bug barbecue (corrupted onnote?). Maybe I got
>> something wrong, bear with me.
>>
>>>> Note that you could register the handler in threadmain and avoid
>>>> completely this issue, but as I said before, something seems wrong
>>>> to me here.
>>>
>>> I'm don't understand how handler in threadmain would solve the problem.
>>> I need in 'alarm' on per process basis.
>>
>> You need alarm() in every process, but you don't need to register the
>> same handler 80 times!
>>
>> adr.
> 
> I think there is some confussion here, so I'll explain myself a
> little more.
> 
> Lets change your last example to not use libthread:
> 
> #include <u.h>
> #include <libc.h>
> 
> int
> handler_alarm(void *, char *msg)
> {
>          if(strstr(msg, "alarm")){
>                  return 1;
>          }
> 
>          return 0;
> }
> 
> int
> test(void)
> {
>          if(atnotify(handler_alarm, 1) == 0){
>                  fprint(1, "handler not registered\n");
>          }
> 
>          alarm(10);
>          fprint(1, "start\n");
>          sleep(40);
>          fprint(1, "end\n");
>          alarm(0);
> 
>          return 0;
> }
> 
> void
> main()
> {
>          for(int i = 0; i < 80; i++){
>                  test();
>          }
> 
>          exits(nil);
> }
> 
> You see, after the NFNth iteration of test(), onnot[NFN] in atnotify
> will be full, the handlers wont be registered but the code will
> work without any problem. It doesn't matter, the first handler in
> onnot[] will be executed. I fact you only need one handler there, not
> 80, you should move atnotify to main.
> 
> The same should be happening with libthread. I'm really the only
> one smelling a bug here?

No, you've got me convinced something much more wrong is going on.
Because you're right, our read children shouldn't just be gone,
we should return from read with an error and then print the "end" line.
I've attempted to reproduce it, trying to remove the libthread/notify
factors. I've come up with this:

#include <u.h>
#include <libc.h>

static void
proc_udp(void*)
{
        char resp[512];
        char req[] = "request";
        int fd;
        int n;
        int pid;

        fd = dial("udp!185.157.221.201!5678", nil, nil, nil);
        if(fd < 0)
                exits("can't dial");

        if(write(fd, req, strlen(req)) != strlen(req))
                exits("can't write");

        pid = getpid();
        fprint(1, "start %d\n", pid);
        n = read(fd, resp, sizeof(resp)-1);
        fprint(1, "end %d %d\n", pid, n);
        exits(nil);
}

void
main(int, char**)
{
        int i;
        Waitmsg *wm;

        for(i = 0; i < 10; i++){
                switch(fork()){
                case -1:
                        sysfatal("fork %r");
                case 0:
                        proc_udp(nil);
                        sysfatal("ret");
                default:
                        break;
                }
        }
        for(i = 0; i < 10; i++){
                wm = wait();
                print("proc %d died with message %s\n", wm->pid, wm->msg);
        }
        exits(nil);
}

This code makes it pretty obvious that we are losing some children;
on my machine this program never exits. I see some portion of the
readers correctly returning -1, and the parent is able to get their
Waitmsg but not all of them.


Thanks,
moody


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Ma4f311286087163ef1e2565e

* Re: [9fans] syscall silently kill processes
  2022-06-18  8:37               ` adr
  2022-06-18  9:22                 ` adr
@ 2022-06-18 16:57                 ` andrey100100100
  2022-06-19  2:40                   ` adr
  1 sibling, 1 reply; 49+ messages in thread
From: andrey100100100 @ 2022-06-18 16:57 UTC (permalink / raw)
  To: 9fans

On Sat, 18/06/2022 at 08:37 +0000, adr wrote:
> On Sat, 18 Jun 2022, andrey100100100@gmail.com wrote:
> 
> > ---------------------------------------------
> > 
> > cpu% 6.out | grep end | wc -l
> >     33
> > 
> > 
> > Problem in unregistered handlers.
> 
> But unregistered handlers shouldn't be a problem. The process is
> been killed when alarm sends the note. That's why the code worked
> removing the read statement, the alarm is set off and the note is
> not sent before the process ends. I just don't see why the process
> is been killed. 

The process dies, because the handler that suppresses the default
behavior on note 'alarm' is not registered. And the default behavior is
death.
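
A rough model of why a missing registration is fatal, assuming notes
are matched against handlers per process (which the onnotepid[] array
in the threadnotify() source quoted earlier in the thread suggests).
This is portable C for illustration only, not libthread code: Slot,
dispatch, CONT and DFLT are made-up names, the last two standing in
for NCONT and NDFLT.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { CONT = 0, DFLT = 1 };    /* stand-ins for NCONT and NDFLT */

typedef int (*Handler)(void*, char*);

typedef struct Slot Slot;
struct Slot {
        Handler fn;     /* nil when the slot is free */
        int pid;        /* proc that registered fn */
};

/* Decide what happens to a note delivered to pid. */
int
dispatch(Slot *tab, int n, int pid, char *note)
{
        int i;

        assert(tab != NULL);
        for(i = 0; i < n; i++){
                if(tab[i].fn == NULL || tab[i].pid != pid)
                        continue;       /* empty slot, or another proc's handler */
                if(tab[i].fn(NULL, note))
                        return CONT;    /* handled: the proc resumes */
        }
        return DFLT;    /* nobody claimed the note: default action */
}

/* Same shape as the handler in the test program. */
int
handler_alarm(void *v, char *msg)
{
        (void)v;
        return strstr(msg, "alarm") != NULL;
}
```

With a table that holds handler_alarm only for proc 7, an "alarm" note
for proc 7 resolves to CONT, while the same note for proc 8 falls
through to DFLT even though an identical handler sits in the table.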

> The documentation describes another behaivor. To
> me it smells like bug barbecue (corrupted onnote?). Maybe I got
> something wrong, bear with me.
> 
> > > Note that you could register the handler in threadmain and avoid
> > > completely this issue, but as I said before, something seems
> > > wrong
> > > to me here.
> > 
> > I'm don't understand how handler in threadmain would solve the
> > problem.
> > I need in 'alarm' on per process basis.
> 
> You need alarm() in every process, but you don't need to register the
> same handler 80 times!


Perhaps I need different handlers for different processes.
Or, for some processes, I need a default behavior, and for some others,
a handler.

> 
> adr.
> 

Regards,
Andrej

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mc5582399d6c559ec53eddd59

* Re: [9fans] syscall silently kill processes
  2022-06-18 12:53                   ` Jacob Moody
@ 2022-06-18 22:03                     ` andrey100100100
  2022-06-19  5:54                     ` adr
  1 sibling, 0 replies; 49+ messages in thread
From: andrey100100100 @ 2022-06-18 22:03 UTC (permalink / raw)
  To: 9fans

On Sat, 18/06/2022 at 06:53 -0600, Jacob Moody wrote:
> On 6/18/22 03:22, adr wrote:
> > On Sat, 18 Jun 2022, adr wrote:
> > 
> > > On Sat, 18 Jun 2022, andrey100100100@gmail.com wrote:
> > > 
> > > > ---------------------------------------------
> > > > 
> > > > cpu% 6.out | grep end | wc -l
> > > >     33
> > > > 
> > > > 
> > > > Problem in unregistered handlers.
> > > 
> > > But unregistered handlers shouldn't be a problem. The process is
> > > been killed when alarm sends the note. That's why the code worked
> > > removing the read statement, the alarm is set off and the note is
> > > not sent before the process ends. I just don't see why the
> > > process
> > > is been killed. The documentation describes another behavior. To
> > > me it smells like bug barbecue (corrupted onnote?). Maybe I got
> > > something wrong, bear with me.
> > > 
> > > > > Note that you could register the handler in threadmain and
> > > > > avoid
> > > > > completely this issue, but as I said before, something seems
> > > > > wrong
> > > > > to me here.
> > > > 
> > > > I'm don't understand how handler in threadmain would solve the
> > > > problem.
> > > > I need in 'alarm' on per process basis.
> > > 
> > > You need alarm() in every process, but you don't need to register
> > > the
> > > same handler 80 times!
> > > 
> > > adr.
> > 
> > I think there is some confussion here, so I'll explain myself a
> > little more.
> > 
> > Lets change your last example to not use libthread:
> > 
> > #include <u.h>
> > #include <libc.h>
> > 
> > int
> > handler_alarm(void *, char *msg)
> > {
> >          if(strstr(msg, "alarm")){
> >                  return 1;
> >          }
> > 
> >          return 0;
> > }
> > 
> > int
> > test(void)
> > {
> >          if(atnotify(handler_alarm, 1) == 0){
> >                  fprint(1, "handler not registered\n");
> >          }
> > 
> >          alarm(10);
> >          fprint(1, "start\n");
> >          sleep(40);
> >          fprint(1, "end\n");
> >          alarm(0);
> > 
> >          return 0;
> > }
> > 
> > void
> > main()
> > {
> >          for(int i = 0; i < 80; i++){
> >                  test();
> >          }
> > 
> >          exits(nil);
> > }
> > 
> > You see, after the NFNth iteration of test(), onnot[NFN] in
> > atnotify
> > will be full, the handlers wont be registered but the code will
> > work without any problem. It doesn't matter, the first handler in
> > onnot[] will be executed. I fact you only need one handler there,
> > not
> > 80, you should move atnotify to main.
> > 
> > The same should be happening with libthread. I'm really the only
> > one smelling a bug here?
> 
> No, you've got me convinced something much more wrong is going on.
> Because you're right, our read children shouldn't just be gone,
> we should return from read with an error and then print the "end"
> line.
> I've attempted to reproduce it, trying to remove the libthread/notify
> factors. I've come up with this:
> 
> #include <u.h>
> #include <libc.h>
> 
> static void
> proc_udp(void*)
> {
>         char resp[512];
>         char req[] = "request";
>         int fd;
>         int n;
>         int pid;
> 
>         fd = dial("udp!185.157.221.201!5678", nil, nil, nil);
>         if(fd < 0)
>                 exits("can't dial");
> 
>         if(write(fd, req, strlen(req)) != strlen(req))
>                 exits("can't write");
> 
>         pid = getpid();
>         fprint(1, "start %d\n", pid);
>         n = read(fd, resp, sizeof(resp)-1);
>         fprint(1, "end %d %d\n", pid, n);
>         exits(nil);
> }
> 
> void
> main(int, char**)
> {
>         int i;
>         Waitmsg *wm;
> 
>         for(i = 0; i < 10; i++){
>                 switch(fork()){
>                 case -1:
>                         sysfatal("fork %r");
>                 case 0:
>                         proc_udp(nil);
>                         sysfatal("ret");
>                 default:
>                         break;
>                 }
>         }
>         for(i = 0; i < 10; i++){
>                 wm = wait();
>                 print("proc %d died with message %s\n", wm->pid, wm->msg);
>         }
>         exits(nil);
> }
> 
> This code makes it pretty obvious that we are losing some children;
> on my machine this program never exits. I see some portion of the
> readers correctly returning -1, and the parent is able to get their
> Waitmsg but not all of them.
> 

cpu% 6.out
start 20383
start 20390
start 20385
start 20389
start 20387
start 20384
start 20388
start 20381
start 20382
start 20386
end 20390 -1
end 20386 -1
end 20382 -1
end 20381 -1
end 20387 -1
end 20384 -1
proc 20390 died with message 
proc 20384 died with message 
proc 20387 died with message 
proc 20381 died with message 
proc 20382 died with message 
proc 20386 died with message 

The 'lost' processes are stalled in the read syscall:

glenda        20380    0:00   0:00       52K Await    6.out
glenda        20383    0:00   0:00       48K Pread    6.out
glenda        20385    0:00   0:00       48K Pread    6.out
glenda        20388    0:00   0:00       48K Pread    6.out
glenda        20389    0:00   0:00       48K Pread    6.out


Regards,
Andrej

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M4109aa26c6245de508c32baa

* Re: [9fans] syscall silently kill processes
  2022-06-18  9:22                 ` adr
  2022-06-18 12:53                   ` Jacob Moody
@ 2022-06-18 22:22                   ` andrey100100100
  1 sibling, 0 replies; 49+ messages in thread
From: andrey100100100 @ 2022-06-18 22:22 UTC (permalink / raw)
  To: 9fans

On Sat, 18/06/2022 at 09:22 +0000, adr wrote:
> On Sat, 18 Jun 2022, adr wrote:
> 
> > On Sat, 18 Jun 2022, andrey100100100@gmail.com wrote:
> > 
> > > ---------------------------------------------
> > > 
> > > cpu% 6.out | grep end | wc -l
> > >     33
> > > 
> > > 
> > > Problem in unregistered handlers.
> > 
> > But unregistered handlers shouldn't be a problem. The process is
> > been killed when alarm sends the note. That's why the code worked
> > removing the read statement, the alarm is set off and the note is
> > not sent before the process ends. I just don't see why the process
> > is been killed. The documentation describes another behavior. To
> > me it smells like bug barbecue (corrupted onnote?). Maybe I got
> > something wrong, bear with me.
> > 
> > > > Note that you could register the handler in threadmain and
> > > > avoid
> > > > completely this issue, but as I said before, something seems
> > > > wrong
> > > > to me here.
> > > 
> > > I'm don't understand how handler in threadmain would solve the
> > > problem.
> > > I need in 'alarm' on per process basis.
> > 
> > You need alarm() in every process, but you don't need to register
> > the
> > same handler 80 times!
> > 
> > adr.
> 
> I think there is some confussion here, so I'll explain myself a
> little more.
> 
> Lets change your last example to not use libthread:
> 
> #include <u.h>
> #include <libc.h>
> 
> int
> handler_alarm(void *, char *msg)
> {
>          if(strstr(msg, "alarm")){
>                  return 1;
>          }
> 
>          return 0;
> }
> 
> int
> test(void)
> {
>          if(atnotify(handler_alarm, 1) == 0){
>                  fprint(1, "handler not registered\n");
>          }
> 
>          alarm(10);
>          fprint(1, "start\n");
>          sleep(40);
>          fprint(1, "end\n");
>          alarm(0);
> 
>          return 0;
> }
> 
> void
> main()
> {
>          for(int i = 0; i < 80; i++){
>                  test();
>          }
> 
>          exits(nil);
> }
> 
> You see, after the NFNth iteration of test(), onnot[NFN] in atnotify
> will be full, the handlers wont be registered but the code will
> work without any problem. It doesn't matter, the first handler in
> onnot[] will be executed. I fact you only need one handler there, not
> 80, you should move atnotify to main.
> 
> The same should be happening with libthread. I'm really the only
> one smelling a bug here?

Atnotify and threadnotify have different implementations.
It seems that threadnotify is for processes with shared memory, and
atnotify is for fork() (no shared memory).

But then it is not entirely clear: why is lock() used in atnotify?




Regards,
Andrej



------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mfea7f52ccd99974d957af810
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] syscall silently kill processes
  2022-06-18 16:57                 ` andrey100100100
@ 2022-06-19  2:40                   ` adr
  2022-06-19  5:01                     ` adr
  0 siblings, 1 reply; 49+ messages in thread
From: adr @ 2022-06-19  2:40 UTC (permalink / raw)
  To: 9fans

Oh man... how silly, I know what's going on. We are using processes
not threads, so although we are sharing the same array of handlers,
they are registered for different processes. When the array is full
the next processes fail to register handlers _for_them_ so as andrey
rightly said, the default action (death) is taken when the alarm
note is sent.
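
A minimal sketch of that failure mode, written in portable C rather
than Plan 9 C so it can be compiled anywhere. It assumes NFN is 33,
which is consistent with the counts reported in this thread, and the
names reg() and unprotected() are made up for the illustration; this
is a model of the shared table, not the libthread source.

```c
#include <assert.h>
#include <stddef.h>

enum { NFN = 33 };      /* assumed table size; matches the 33 seen in this thread */

typedef int (*Handler)(void*, char*);

static Handler onnote[NFN];     /* one table shared by every proc */

/* Illustrative stand-in for a note handler. */
static int
handler_alarm(void *v, char *msg)
{
        (void)v;
        return msg != NULL;
}

/* Claim the first free slot; return 1 on success, 0 when the table is full. */
static int
reg(Handler f)
{
        int i;

        assert(f != NULL);
        for(i = 0; i < NFN; i++)
                if(onnote[i] == NULL){
                        onnote[i] = f;
                        return 1;
                }
        return 0;
}

/* Register once per simulated proc; count the procs left without a handler. */
static int
unprotected(int nproc)
{
        int i, failed;

        failed = 0;
        for(i = 0; i < nproc; i++)
                if(!reg(handler_alarm))
                        failed++;
        return failed;
}
```

On an empty table, unprotected(80) returns 47: only the first 33
procs get a slot, and the rest would take the default action when
the alarm note arrives.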

The solution is obvious, cancel the process' handlers before it
exits so we don't run out of space.

Now, is there any reason to not do that in threadexits() when it
terminates the process?

Shouldn't threadnotify() cancel only the process' handlers? We are
sharing onnote[NFN] and the code as it is right now removes the
first handler that matches the pointer, which can belong to another
process.

Something like this?:

int
threadnotify(int (*f)(void*, char*), int in)
{
        int i, frompid, topid;
        int (*from)(void*, char*), (*to)(void*, char*);

        if(in){
                from = nil;
                frompid = 0;
                to = f;
                topid = _threadgetproc()->pid;
        }else{
                from = f;
                frompid = _threadgetproc()->pid;
                to = nil;
                topid = 0;
        }
        lock(&onnotelock);
        for(i=0; i<NFN; i++)
                if(onnote[i]==from && onnotepid[i]==frompid){
                        onnote[i] = to;
                        onnotepid[i] = topid;
                        break;
                }
        unlock(&onnotelock);
        return i<NFN;
}

Any thoughts?

adr.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M45b18058b8188bfcc376c11c

* Re: [9fans] syscall silently kill processes
  2022-06-19  2:40                   ` adr
@ 2022-06-19  5:01                     ` adr
  2022-06-19  8:52                       ` andrey100100100
  0 siblings, 1 reply; 49+ messages in thread
From: adr @ 2022-06-19  5:01 UTC (permalink / raw)
  To: 9fans

On Sun, 19 Jun 2022, adr wrote:
> The solution is obvious, cancel the process' handlers before it
> exits so we don't run out of space.

This was really silly...

> Now, is there any reason to not do that in threadexits() when it
> terminates the process?
>
> Shouldn't threadnotify() cancel only the process' handlers? We are
> sharing onnote[NFN] and the code as it is right now removes the
> first handler that match the pointer, it can belong to another
> process.

I ended up playing with this (do not register duplicated handlers,
cancel only the notes of the thread's process and cancel all notes
when the process exits):

/sys/src/libthread/sched.c:
[...]
                if(t == nil){
                        _threaddebug(DBGSCHED, "all threads gone; exiting");
                        cancelnotes(p->pid);
                        _schedexit(p);
                }
[...]
/sys/src/libthread/note.c
[...]
int
threadnotify(int (*f)(void*, char*), int in)
{
        int i, frompid, topid;
        int (*from)(void*, char*), (*to)(void*, char*);

        if(in){
                from = nil;
                frompid = 0;
                to = f;
                topid = _threadgetproc()->pid;
                lock(&onnotelock);
                for(i=0; i<NFN; i++)
                        if(onnote[i]==to && onnotepid[i]==topid){
                                unlock(&onnotelock);
                                return i<NFN;
                        }
                unlock(&onnotelock);
        }else{
                from = f;
                frompid = _threadgetproc()->pid;
                to = nil;
                topid = 0;
        }
        lock(&onnotelock);
        for(i=0; i<NFN; i++)
                if(onnote[i]==from && onnotepid[i]==frompid){
                        onnote[i] = to;
                        onnotepid[i] = topid;
                        break;
                }
        unlock(&onnotelock);
        return i<NFN;
}

void
cancelnotes(int pid)
{
        int i;

        lock(&onnotelock);
        for(i=0; i<NFN; i++)
                if(onnotepid[i] == pid){
                        onnote[i] = nil;
                        onnotepid[i] = 0;
                }
        unlock(&onnotelock);
        return;
}
/sys/include/thread.h
[...]
void cancelnotes(int pid);
[...]

Anyway, I would like to know a real example where it is useful to
spawn a hundred processes using libthread without really exploiting
threads at all. I mean, we have been stretching things a little
here!

adr.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M2ce0715ee9e683875392de68

* Re: [9fans] syscall silently kill processes
  2022-06-18 12:53                   ` Jacob Moody
  2022-06-18 22:03                     ` andrey100100100
@ 2022-06-19  5:54                     ` adr
  2022-06-19  6:13                       ` Jacob Moody
  1 sibling, 1 reply; 49+ messages in thread
From: adr @ 2022-06-19  5:54 UTC (permalink / raw)
  To: 9fans

On Sat, 18 Jun 2022, Jacob Moody wrote:
> I've attempted to reproduce it, trying to remove the libthread/notify
> factors. I've come up with this:
>
> #include <u.h>
> #include <libc.h>
>
> static void
> proc_udp(void*)
> {
>        char resp[512];
>        char req[] = "request";
>        int fd;
>        int n;
>        int pid;
>
>        fd = dial("udp!185.157.221.201!5678", nil, nil, nil);
>        if(fd < 0)
>                exits("can't dial");
>
>        if(write(fd, req, strlen(req)) != strlen(req))
>                exits("can't write");
>
>        pid = getpid();
>        fprint(1, "start %d\n", pid);
>        n = read(fd, resp, sizeof(resp)-1);
>        fprint(1, "end %d %d\n", pid, n);
>        exits(nil);
> }
>
> void
> main(int, char**)
> {
>        int i;
>        Waitmsg *wm;
>
>        for(i = 0; i < 10; i++){
>                switch(fork()){
>                case -1:
>                        sysfatal("fork %r");
>                case 0:
>                        proc_udp(nil);
>                        sysfatal("ret");
>                default:
>                        break;
>                }
>        }
>        for(i = 0; i < 10; i++){
>                wm = wait();
>                print("proc %d died with message %s\n", wm->pid, wm->msg);
>        }
>        exits(nil);
> }
>
> This code makes it pretty obvious that we are losing some children;
> on my machine this program never exits. I see some portion of the
> readers correctly returning -1, and the parent is able to get their
> Waitmsg but not all of them.

Moody, I think this old thread will interest you:

https://marc.info/?t=112730920400001&r=1&w=2

Russ Cox explained there:
  It appears that your program, at its core, it is doing this:

  void
  readproc(void *v)
  {
      int fd;
      char buf[100];
      fd = (int)v;
      read(fd, buf, sizeof buf);
  }

  void
  threadmain(int argc, char **argv)
  {
      int p[2];
      pipe(p);
      proccreate(readproc, (void*)p[0], 8192);
      proccreate(readproc, (void*)p[1], 8192);
      close(p[0]);
      /* and here you expect the first readproc to be done */
      close(p[1]);
      /* and here the second */
  }

  Each read call is holding up a reference to its channel
  inside the kernel, so that even though you've closed the fd
  and removed the ref from the fd table, there is still a reference
  to each side of the pipe in the form of the process blocked
  on the read.

  I've never been sure whether the implicit ref held during
  the system call is good behavior, but it's hard to change.

  In your case, writing 0 (or anything) makes the read
  finish, releasing the last ref to the underlying pipe when
  the system call finishes, and then everything cleans up
  as expected.  So you've found your workaround, and now
  we understand why it works.
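
The reference counting described above can be modelled in a few
lines of portable C. This is only a sketch: the Chan structure and
the incref(), decref() and replay() functions here are simplified
stand-ins invented for the illustration, not the kernel's own code.

```c
#include <assert.h>

/* Hypothetical stand-in for a kernel channel: just a reference count. */
typedef struct Chan Chan;
struct Chan {
        int ref;
        int open;       /* 1 until the last reference is dropped */
};

static void
incref(Chan *c)
{
        assert(c->open);
        c->ref++;
}

static void
decref(Chan *c)
{
        assert(c->ref > 0);
        if(--c->ref == 0)
                c->open = 0;    /* only now is the pipe end torn down */
}

/* Replay the sequence: pipe(), blocked read(), close(fd), read() returns. */
static void
replay(int *afterclose, int *afterread)
{
        Chan c = {0, 1};

        incref(&c);             /* fd table entry created by pipe() */
        incref(&c);             /* read() holds its own reference while blocked */
        decref(&c);             /* close(fd): the fd table reference is gone */
        *afterclose = c.open;   /* the blocked read still keeps the channel alive */
        decref(&c);             /* read() returns and drops its reference */
        *afterread = c.open;
}
```

After replay(&a, &b), a is 1 and b is 0: closing the fd alone does
not release the channel, the read finishing does.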

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M6e48031f9e8673387c0b47b8

* Re: [9fans] syscall silently kill processes
  2022-06-19  5:54                     ` adr
@ 2022-06-19  6:13                       ` Jacob Moody
  0 siblings, 0 replies; 49+ messages in thread
From: Jacob Moody @ 2022-06-19  6:13 UTC (permalink / raw)
  To: 9fans

On 6/18/22 23:54, adr wrote:
> On Sat, 18 Jun 2022, Jacob Moody wrote:
>> I've attempted to reproduce it, trying to remove the libthread/notify
>> factors. I've come up with this:
>>
>> #include <u.h>
>> #include <libc.h>
>>
>> static void
>> proc_udp(void*)
>> {
>>        char resp[512];
>>        char req[] = "request";
>>        int fd;
>>        int n;
>>        int pid;
>>
>>        fd = dial("udp!185.157.221.201!5678", nil, nil, nil);
>>        if(fd < 0)
>>                exits("can't dial");
>>
>>        if(write(fd, req, strlen(req)) != strlen(req))
>>                exits("can't write");
>>
>>        pid = getpid();
>>        fprint(1, "start %d\n", pid);
>>        n = read(fd, resp, sizeof(resp)-1);
>>        fprint(1, "end %d %d\n", pid, n);
>>        exits(nil);
>> }
>>
>> void
>> main(int, char**)
>> {
>>        int i;
>>        Waitmsg *wm;
>>
>>        for(i = 0; i < 10; i++){
>>                switch(fork()){
>>                case -1:
>>                        sysfatal("fork %r");
>>                case 0:
>>                        proc_udp(nil);
>>                        sysfatal("ret");
>>                default:
>>                        break;
>>                }
>>        }
>>        for(i = 0; i < 10; i++){
>>                wm = wait();
>>                print("proc %d died with message %s\n", wm->pid, wm->msg);
>>        }
>>        exits(nil);
>> }
>>
>> This code makes it pretty obvious that we are losing some children;
>> on my machine this program never exits. I see some portion of the
>> readers correctly returning -1, and the parent is able to get their
>> Waitmsg but not all of them.
> 
> Moody I think this old thread will interest you:
> 
> https://marc.info/?t=112730920400001&r=1&w=2
> 
> Russ Cox explained there:
>   It appears that your program, at its core, it is doing this:
> 
>   void
>   readproc(void *v)
>   {
>       int fd;
>       char buf[100];
>       fd = (int)v;
>       read(fd, buf, sizeof buf);
>   }
> 
>   void
>   threadmain(int argc, char **argv)
>   {
>       int p[2];
>       pipe(p);
>       proccreate(readproc, (void*)p[0], 8192);
>       proccreate(readproc, (void*)p[1], 8192);
>       close(p[0]);
>       /* and here you expect the first readproc to be done */
>       close(p[1]);
>       /* and here the second */
>   }
> 
>   Each read call is holding up a reference to its channel
>   inside the kernel, so that even though you've closed the fd
>   and removed the ref from the fd table, there is still a reference
>   to each side of the pipe in the form of the process blocked
>   on the read.
> 
>   I've never been sure whether the implicit ref held during
>   the system call is good behavior, but it's hard to change.
> 
>   In your case, writing 0 (or anything) makes the read
>   finish, releasing the last ref to the underlying pipe when
>   the system call finishes, and then everything cleans up
>   as expected.  So you've found your workaround, and now
>   we understand why it works.
> 

I was just making the wrong observation here.
I thought I had observed the child procs getting
murdered mid read, and the parent never getting
the Waitmsg. Testing again I see as Andrej had observed,
they are just blocking. I thought I was seeing a bug
related to just udp, nothing to do with notes/threads.

I apologize for the confusion, interesting thread
you linked regardless.


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Md81beb48e514ad6a776fa41d

* Re: [9fans] syscall silently kill processes
  2022-06-19  5:01                     ` adr
@ 2022-06-19  8:52                       ` andrey100100100
  2022-06-19 10:32                         ` adr
  0 siblings, 1 reply; 49+ messages in thread
From: andrey100100100 @ 2022-06-19  8:52 UTC (permalink / raw)
  To: 9fans

On Sun, 19/06/2022 at 05:01 +0000, adr wrote:
> On Sun, 19 Jun 2022, adr wrote:
> > The solution is obvious, cancel the process' handlers before it
> > exits so we don't run out of space.
> 
> This was really silly...
> 
> > Now, is there any reason to not do that in threadexits() when it
> > terminates the process?
> > 
> > Shouldn't threadnotify() cancel only the process' handlers? We are
> > sharing onnote[NFN] and the code as it is right now removes the
> > first handler that match the pointer, it can belong to another
> > process.
> 
> I ended up playing with this (do not register duplicated handlers,
> cancel only the notes of the thread's process and cancel all notes
> when the process exits):
> 
> /sys/src/libthread/sched.c:
> [...]
>                 if(t == nil){
>                         _threaddebug(DBGSCHED, "all threads gone; exiting");
>                         cancelnotes(p->pid);
>                         _schedexit(p);
>                 }
> [...]
> /sys/src/libthread/note.c
> [...]
> int
> threadnotify(int (*f)(void*, char*), int in)
> {
>         int i, frompid, topid;
>         int (*from)(void*, char*), (*to)(void*, char*);
> 
>         if(in){
>                 from = nil;
>                 frompid = 0;
>                 to = f;
>                 topid = _threadgetproc()->pid;
>                 lock(&onnotelock);
>                 for(i=0; i<NFN; i++)
>                         if(onnote[i]==to && onnotepid[i]==topid){
>                                 unlock(&onnotelock);
>                                 return i<NFN;
>                         }
>                 unlock(&onnotelock);
>         }else{
>                 from = f;
>                 frompid = _threadgetproc()->pid;
>                 to = nil;
>                 topid = 0;
>         }
>         lock(&onnotelock);
>         for(i=0; i<NFN; i++)
>                 if(onnote[i]==from && onnotepid[i]==frompid){
>                         onnote[i] = to;
>                         onnotepid[i] = topid;
>                         break;
>                 }
>         unlock(&onnotelock);
>         return i<NFN;
> }
> 
> void
> cancelnotes(int pid)
> {
>         int i;
> 
>         lock(&onnotelock);
>         for(i=0; i<NFN; i++)
>                 if(onnotepid[i] == pid){
>                         onnote[i] = nil;
>                         onnotepid[i] = 0;
>                 }
>         unlock(&onnotelock);
>         return;
> }
> /sys/include/thread.h
> [...]
> void cancelnotes(int pid);
> [...]


No way. All processes must run simultaneously.
NFN limit cannot be bypassed.

> 
> Anyway, I would like to know a real example where it is useful to
> spawn a hundred processes using libthread without really exploiting
> threads at all. I mean, we have been stretching things a little
> here!
> 

In general, the problem is not in processes, threads or notes. The
problem is in the unreliable nature of network communication, which
requires timeouts, packet loss handling, retransmission, etc.

I'm trying to solve it using Plan 9.

In this particular case, I am trying to reduce the total polling time
of, for example, a sensor network by increasing the number of sensors
polled at the same time.

I'm ready to hear about a better solution.

Regards,
Andrej

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mf178309eb46992e6940a5ea4

* Re: [9fans] syscall silently kill processes
  2022-06-19  8:52                       ` andrey100100100
@ 2022-06-19 10:32                         ` adr
  2022-06-19 11:40                           ` andrey100100100
  2022-06-19 15:10                           ` andrey100100100
  0 siblings, 2 replies; 49+ messages in thread
From: adr @ 2022-06-19 10:32 UTC (permalink / raw)
  To: 9fans

On Sun, 19 Jun 2022, andrey100100100@gmail.com wrote:
> No way. All processes must run simultaneously.
> NFN limit cannot be bypassed.

Yeah, that's why I said it was silly:
>>> The solution is obvious, cancel the process' handlers before it
>>> exits so we don't run out of space.
>>
>> This was really silly...

The changes I'm testing are not for evading the limit, but for
making the handler management more efficient and especially to avoid
the case where a process could remove another process' handler from
onnote[].
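
To illustrate the point, here is a small portable C model of a
handler table where every slot is tagged with its owner's pid. It is
a sketch under assumptions: NFN is taken to be 33, and swap(), reg(),
unreg() and owned() are hypothetical names for this example, not
libthread functions.

```c
#include <assert.h>
#include <stddef.h>

enum { NFN = 33 };      /* assumed table size */

typedef int (*Handler)(void*, char*);

static Handler onnote[NFN];
static int onnotepid[NFN];

/* Illustrative stand-in for a note handler. */
static int
handler_alarm(void *v, char *msg)
{
        (void)v;
        return msg != NULL;
}

/* Swap (from, frompid) for (to, topid) in the first matching slot. */
static int
swap(Handler from, int frompid, Handler to, int topid)
{
        int i;

        for(i = 0; i < NFN; i++)
                if(onnote[i] == from && onnotepid[i] == frompid){
                        onnote[i] = to;
                        onnotepid[i] = topid;
                        return 1;
                }
        return 0;
}

/* Register f for pid in the first free slot. */
static int
reg(Handler f, int pid)
{
        assert(f != NULL && pid > 0);
        return swap(NULL, 0, f, pid);
}

/* Remove f only if the slot belongs to pid. */
static int
unreg(Handler f, int pid)
{
        return swap(f, pid, NULL, 0);
}

/* Number of slots currently owned by pid. */
static int
owned(int pid)
{
        int i, n;

        n = 0;
        for(i = 0; i < NFN; i++)
                if(onnote[i] != NULL && onnotepid[i] == pid)
                        n++;
        return n;
}
```

With two procs registering the same function, unreg(handler_alarm, 200)
skips the slot owned by pid 100 even though the pointer matches, so
one process can no longer remove another's handler.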

adr.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M4d8837da259920e0095176e2

* Re: [9fans] syscall silently kill processes
  2022-06-19 10:32                         ` adr
@ 2022-06-19 11:40                           ` andrey100100100
  2022-06-19 12:01                             ` andrey100100100
  2022-06-19 15:10                           ` andrey100100100
  1 sibling, 1 reply; 49+ messages in thread
From: andrey100100100 @ 2022-06-19 11:40 UTC (permalink / raw)
  To: 9fans

On Sun, 19/06/2022 at 10:32 +0000, adr wrote:
> On Sun, 19 Jun 2022, andrey100100100@gmail.com wrote:
> > No way. All processes must run simultaneously.
> > NFN limit cannot be bypassed.
> 
> Yeah, that's why I said it was silly:
> > > > The solution is obvious, cancel the process' handlers before it
> > > > exits so we don't run out of space.
> > > 
> > > This was really silly...
> 
> The changes I'm testing are not for evading the limit, but for
> making the handler management more efficient and especially to avoid
> the case where a process could remove another process' handler from
> onnote[].

Ok.


More complete example with thread library:

-------------------------------------------------------------
#include <u.h>
#include <libc.h>
#include <thread.h>

static int
handler_alarm(void *, char *msg)
{
        if(strstr(msg, "alarm")){
                return 1;
        }

        return 0;
}

static void
proc_func(void *c)
{
        Channel *chan = c;

        int fd, resp_len;
        char req[] = "request";
        char resp[512], *r = nil;

        if(threadnotify(handler_alarm, 1) == 0){
                fprint(1, "handler not registred\n");
        }

        alarm(2000);
        if((fd = dial("udp!185.157.221.201!5678", nil, nil, nil)) >= 0){
                alarm(0);
                alarm(2000);
                if(write(fd, req, strlen(req)) == strlen(req)){
                        alarm(0);
                        alarm(2000);
                        if((resp_len = read(fd, resp, sizeof(resp))) > 0){
                                alarm(0);
                                if((r = malloc(sizeof(resp))) == nil){
                                        sysfatal("malloc error: %r");
                                }
                                memmove(r, resp, sizeof(resp));
                        }
                }
                close(fd);
        }

        send(chan, r);
        threadexits(nil);
}

int mainstacksize = 5242880;

void
threadmain(int argc, char *argv[])
{
        Channel *chan = nil;
        char *data = nil;
        int nproc = 0;

        ARGBEGIN{
        case 'n':
                nproc = atoi(EARGF(threadexitsall(nil)));
                break;
        default:
                threadexitsall(nil);
        }ARGEND;

        if((chan = chancreate(sizeof(char *), 0)) == nil){
                sysfatal("channel error: %r");
        }

        for(int i = 0; i < nproc; i++){
                proccreate(proc_func, chan, 10240);
        }

        for(int i = 0; i < nproc; i++){
                if(data = recvp(chan)){
                        free(data);
                }
        }

        if(nproc)
                fprint(1, "EXIT with nproc: %d\n", nproc);

        threadexitsall(nil);
}
-------------------------------------------------------------

cpu% 6.out -n 33
EXIT with nproc: 33

with 34:

cpu% 6.out -n 34
handler not registred

and stalled.


So it is important for me that all processes respond.
Such use, it seems to me, simplifies the program.


Regards,
Andrej



------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mefb1eb17df6e6f347f6c5bf9

* Re: [9fans] syscall silently kill processes
  2022-06-19 11:40                           ` andrey100100100
@ 2022-06-19 12:01                             ` andrey100100100
  0 siblings, 0 replies; 49+ messages in thread
From: andrey100100100 @ 2022-06-19 12:01 UTC (permalink / raw)
  To: 9fans

On Sun, 19/06/2022 at 14:40 +0300, andrey100100100@gmail.com wrote:
> On Sun, 19/06/2022 at 10:32 +0000, adr wrote:
> > On Sun, 19 Jun 2022, andrey100100100@gmail.com wrote:
> > > No way. All processes must run simultaneously.
> > > NFN limit cannot be bypassed.
> > 
> > Yeah, that's why I said it was silly:
> > > > > The solution is obvious, cancel the process' handlers before
> > > > > it exits so we don't run out of space.
> > > > 
> > > > This was really silly...
> > 
> > The changes I'm testing are not for evading the limit, but for
> > making the handler management more efficient and especially to avoid
> > the case where a process could remove another process' handler from
> > onnote[].
> 
> Ok.
> 
> 
> More complete example with thread library:
> 
> -------------------------------------------------------------
> #include <u.h>
> #include <libc.h>
> #include <thread.h>
> 
> static int
> handler_alarm(void *, char *msg)
> {
>         if(strstr(msg, "alarm")){
>                 return 1;
>         }
> 
>         return 0;
> }
> 
> static void
> proc_func(void *c)
> {
>         Channel *chan = c;
> 
>         int fd, resp_len;
>         char req[] = "request";
>         char resp[512], *r = nil;
> 
>         if(threadnotify(handler_alarm, 1) == 0){
>                 fprint(1, "handler not registred\n");
>         }
> 
>         alarm(2000);
>         if((fd = dial("udp!185.157.221.201!5678", nil, nil, nil)) >= 0){
>                 alarm(0);
>                 alarm(2000);
>                 if(write(fd, req, strlen(req)) == strlen(req)){
>                         alarm(0);
>                         alarm(2000);
>                         if((resp_len = read(fd, resp, sizeof(resp))) > 0){
>                                 alarm(0);
>                                 if((r = malloc(sizeof(resp))) == nil){
>                                         sysfatal("malloc error: %r");
>                                 }
>                                 memmove(r, resp, sizeof(resp));
>                         }
>                 }
>                 close(fd);
>         }
> 

+         alarm(0);

>         send(chan, r);
>         threadexits(nil);
> }
> 
> int mainstacksize = 5242880;
> 
> void
> threadmain(int argc, char *argv[])
> {
>         Channel *chan = nil;
>         char *data = nil;
>         int nproc = 0;
> 
>         ARGBEGIN{
>         case 'n':
>                 nproc = atoi(EARGF(threadexitsall(nil)));
>                 break;
>         default:
>                 threadexitsall(nil);
>         }ARGEND;
> 
>         if((chan = chancreate(sizeof(char *), 0)) == nil){
>                 sysfatal("channel error: %r");
>         }
> 
>         for(int i = 0; i < nproc; i++){
>                 proccreate(proc_func, chan, 10240);
>         }
> 
>         for(int i = 0; i < nproc; i++){
>                 if(data = recvp(chan)){
>                         free(data);
>                 }
>         }
> 
>         if(nproc)
>                 fprint(1, "EXIT with nproc: %d\n", nproc);
> 
>         threadexitsall(nil);
> }
> -------------------------------------------------------------
> 
> cpu% 6.out -n 33
> EXIT with nproc: 33
> 
> with 34:
> 
> cpu% 6.out -n 34
> handler not registred
> 
> and stalled.
> 
> 
> So it is important for me that all processes respond.
> Such use, it seems to me, simplifies the program.
> 
> 

Regards,
Andrej

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M1bf9fc766759c5c66a48278e

* Re: [9fans] syscall silently kill processes
  2022-06-19 10:32                         ` adr
  2022-06-19 11:40                           ` andrey100100100
@ 2022-06-19 15:10                           ` andrey100100100
  2022-06-19 16:41                             ` adr
  1 sibling, 1 reply; 49+ messages in thread
From: andrey100100100 @ 2022-06-19 15:10 UTC (permalink / raw)
  To: 9fans

On Sun, 19/06/2022 at 10:32 +0000, adr wrote:
> On Sun, 19 Jun 2022, andrey100100100@gmail.com wrote:
> > No way. All processes must run simultaneously.
> > NFN limit cannot be bypassed.
> 
> Yeah, that's why I said it was silly:
> > > > The solution is obvious, cancel the process' handlers before it
> > > > exits so we don't run out of space.
> > > 
> > > This was really silly...
> 
> The changes I'm testing are not for evading the limit, but for
> making the handler management more efficient and especially to avoid
> the case where a process could remove another process' handler from
> onnote[].
> 

Yes, you were absolutely right, the thread library needs some work.

It is impossible to use multiple processes with notes, due to the
exhaustion of the NFN limit.

As an example:


---------------------------------------------------------
#include <u.h>
#include <libc.h>
#include <thread.h>

static int
handler_alarm(void *, char *msg)
{
        if(strstr(msg, "alarm")){
                return 1;
        }

        return 0;
}

static void
proc_func(void *c)
{
        Channel *chan = c;

        int fd;
        char req[] = "request";
        char resp[512], *r = nil;

        if(threadnotify(handler_alarm, 1) == 0){
                fprint(1, "handler not registred\n");
        }

        alarm(2000);
        if((fd = dial("udp!185.157.221.201!5678", nil, nil, nil)) >= 0){
                alarm(0);
                alarm(2000);
                if(write(fd, req, strlen(req)) == strlen(req)){
                        alarm(0);
                        alarm(2000);
                        if(read(fd, resp, sizeof(resp)) > 0){
                                alarm(0);
                                if((r = malloc(sizeof(resp))) == nil){
                                        sysfatal("malloc error: %r");
                                }
                                memmove(r, resp, sizeof(resp));
                        }
                }
                close(fd);
        }

        alarm(0);
        send(chan, r);
        threadexits(nil);
}

int mainstacksize = 5242880;

void
threadmain(int argc, char *argv[])
{
        Channel *chan = nil;
        char *data = nil;
        int nproc = 0;

        ARGBEGIN{
        case 'n':
                nproc = atoi(EARGF(threadexitsall(nil)));
                break;
        default:
                threadexitsall(nil);
        }ARGEND;

        if((chan = chancreate(sizeof(char *), 0)) == nil){
                sysfatal("channel error: %r");
        }

        for(int j = 0; j < 10 ; j++){
                for(int i = 0; i < nproc; i++){
                        proccreate(proc_func, chan, 10240);
                }

                for(int i = 0; i < nproc; i++){
                        if(data = recvp(chan)){
                                free(data);
                        }
                }
                fprint(1, "j: %d\n", j);
        }

        if(nproc)
                fprint(1, "EXIT with nproc: %d\n", nproc);

        threadexitsall(nil);
}
---------------------------------------------------------

cpu% 6.out -n 10
j: 0
j: 1
j: 2
handler not registred
handler not registred
handler not registred
handler not registred
handler not registred
handler not registred
handler not registred

and stalled forever...
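
The count of failures above fits simple arithmetic if we assume that
NFN is 33 and that no slot is ever given back between rounds. A
small portable C sketch of that calculation (failures() is a name
invented here, not part of any library):

```c
#include <assert.h>

enum { NFN = 33 };      /* assumed handler table size */

/*
 * Count failed registrations in round j (0-based) when every round
 * starts nproc procs and no slot is ever released.
 */
static int
failures(int nproc, int j)
{
        int used, freeslots;

        assert(nproc >= 0 && j >= 0);
        used = nproc * j;       /* slots taken by earlier rounds */
        if(used > NFN)
                used = NFN;     /* the table cannot hold more than NFN */
        freeslots = NFN - used;
        return nproc > freeslots ? nproc - freeslots : 0;
}
```

With nproc = 10, rounds 0, 1 and 2 use 30 slots without failures,
and round 3 finds only 3 free slots, so failures(10, 3) is 7, the
same as the seven "handler not registred" lines printed.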



And yes, calling threadnotify(handler_alarm, 0) in proc_func
does not help.

I don't know how best to get out of this situation.
I will probably have to rewrite the program so that it does not matter
how many processes have died. But that increases complexity.

Regards,
Andrej


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M2fc7967213269fe01e89ac5c

* Re: [9fans] syscall silently kill processes
  2022-06-19 15:10                           ` andrey100100100
@ 2022-06-19 16:41                             ` adr
  2022-06-19 21:22                               ` andrey100100100
  0 siblings, 1 reply; 49+ messages in thread
From: adr @ 2022-06-19 16:41 UTC (permalink / raw)
  To: 9fans

On Sun, 19 Jun 2022, andrey100100100@gmail.com wrote:
> Yes, you were absolutely right, the thread library needs some work.
>
> It is impossible to use multiple processes with notes, due to the
> exhaustion of the NFN limit.

Andrej, what are you going to do with alarm in the real thing?

You could use threads (cooperative round-robin multitasking) using
some data log to register their state and use a "master" thread to
control them (kill them, change some data structure, etc).

You can use rfork, locks, pipes, etc and forget about libthread.

You could experiment with the topology of the nodes, for example
instead of a big star you can simplify things a lot using chains
of nodes where the last node sends the chain data to the analyzer
(you were talking about polling sensors):

aNode1 --> aNode2 --> aNode3 --> ... --> aNoden -->
                                                    > --- > Anz
bNode1 --> bNode2 --> bNode3 --> ... --> bNoden -->

Imagine that n=100. Each node only has to make a connection with
2 nodes (the first of the chain just one), add its data to what it
received and send the result to the next node. Anz only has to make a
connection with the last nodes of the chains, two in this case,
and process the data received. Of course you have to play with your
numbers, the acceptable delay, etc.
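
A rough sketch of the data flow along one chain, in portable C with
the network left out entirely. The Msg structure, the MAXDATA limit
and the hop() and runchain() functions are assumptions made up for
this example.

```c
#include <assert.h>

enum { MAXDATA = 256 }; /* illustrative upper bound on readings per chain */

/* Data travelling down a chain: the readings collected so far. */
typedef struct Msg Msg;
struct Msg {
        int n;
        int reading[MAXDATA];
};

/* One hop: a node appends its own reading and passes the message on. */
static void
hop(Msg *m, int reading)
{
        assert(m->n < MAXDATA);
        m->reading[m->n++] = reading;
}

/* Run a whole chain; the result is what the last node hands to Anz. */
static Msg
runchain(const int *readings, int nnodes)
{
        Msg m = {0};
        int i;

        for(i = 0; i < nnodes; i++)
                hop(&m, readings[i]);
        return m;
}
```

Anz then calls something like runchain() once per chain, so it holds
one connection per chain instead of one per node.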

I'm pretty sure people here could point you to some examples using
plan9, and note that you could use raspberry pi zeros as nodes.

Have fun.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Md1b869bf61deccadf9733908

* Re: [9fans] syscall silently kill processes
  2022-06-19 16:41                             ` adr
@ 2022-06-19 21:22                               ` andrey100100100
  2022-06-19 21:26                                 ` andrey100100100
  2022-06-20  4:41                                 ` adr
  0 siblings, 2 replies; 49+ messages in thread
From: andrey100100100 @ 2022-06-19 21:22 UTC (permalink / raw)
  To: 9fans

On Sun, 19/06/2022 at 16:41 +0000, adr wrote:
> On Sun, 19 Jun 2022, andrey100100100@gmail.com wrote:
> > Yes, you were absolutely right, the thread library needs some work.
> > 
> > It is impossible to use multiple processes with notes, due to the
> > exhaustion of the NFN limit.
> 
> Andrej, what are you going to do with alarm in the real thing?

The 'alarm' note is needed to interrupt a system call on timeout.
Since system calls in Plan 9 can be of a network nature, notes are
indispensable.
A great example of this is the read() system call on a UDP connection.
How else can this system call be interrupted?




------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M24ba5edacde926075a431016
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [9fans] syscall silently kill processes
  2022-06-19 21:22                               ` andrey100100100
@ 2022-06-19 21:26                                 ` andrey100100100
  2022-06-20  4:41                                 ` adr
  1 sibling, 0 replies; 49+ messages in thread
From: andrey100100100 @ 2022-06-19 21:26 UTC (permalink / raw)
  To: 9fans

On Mon, 20/06/2022 at 00:22 +0300, andrey100100100@gmail.com wrote:
> On Sun, 19/06/2022 at 16:41 +0000, adr wrote:
> > On Sun, 19 Jun 2022, andrey100100100@gmail.com wrote:
> > > Yes, you were absolutely right, the thread library needs some
> > > work.
> > > 
> > > It is impossible to use multiple processes with notes, due to the
> > > exhaustion of the NFN limit.
> > 
> > Andrej, what are you going to do with alarm in the real thing?
> 
> The 'alarm' note is needed to interrupt a system call on timeout.
> Since system calls in Plan 9 can be of a network nature, notes are
> indispensable.
> A great example of this is the read() system call on a UDP connection.
> How else can this system call be interrupted?
> 
> 
> 
> > You could use threads (cooperative round-robin multitasking), with
> > some data log to record their state, and a "master" thread to
> > control them (kill them, change some data structure, etc).
> > 
> > You can use rfork, locks, pipes, etc and forget about libthread.


Channels are a good abstraction for exchanging data between processes.
Even without libthread, I would (probably) have to write them myself.
But yeah, maybe it's easier to do without it.


Regards,
Andrej


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M06156a275c41ac100d18311b
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [9fans] syscall silently kill processes
  2022-06-19 21:22                               ` andrey100100100
  2022-06-19 21:26                                 ` andrey100100100
@ 2022-06-20  4:41                                 ` adr
  2022-06-20  5:39                                   ` andrey100100100
  2022-06-20  5:59                                   ` adr
  1 sibling, 2 replies; 49+ messages in thread
From: adr @ 2022-06-20  4:41 UTC (permalink / raw)
  To: 9fans

On Mon, 20 Jun 2022, andrey100100100@gmail.com wrote:
> The 'alarm' note is needed to interrupt a system call on timeout.
> Since system calls in Plan 9 can be of a network nature, notes are
> indispensable.
> A great example of this is the read() system call on a UDP connection.
> How else can this system call be interrupted?

Start two processes, one which just makes the call, another as a
timer to kill it. But I have something in mind for a case like
this, when all the processes are going to use the same handler
(that's why I was asking). Let me play with it a little before I
share it.
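
The two-process idea can be tried on a system without Plan 9 at hand.
What follows is only a minimal sketch, and it swaps in POSIX fork, kill
and waitpid for the Plan 9 primitives; it is not libthread code. The
name run_with_timer is hypothetical. One process makes the blocking
read, a second one sleeps and then kills the first, and the parent
reports which of the two things happened.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (POSIX stand-in, not Plan 9 code).
 * Returns 1 if the timer had to kill the worker, 0 if the
 * worker's read returned on its own, -1 if fork failed.
 */
int
run_with_timer(int fd, unsigned timeout_sec)
{
	pid_t worker, timer;
	int status;
	char buf[512];

	worker = fork();
	if(worker < 0)
		return -1;
	if(worker == 0){
		/* worker: just makes the blocking call */
		if(read(fd, buf, sizeof buf) < 0)
			_exit(1);
		_exit(0);
	}

	timer = fork();
	if(timer < 0){
		kill(worker, SIGKILL);
		waitpid(worker, NULL, 0);
		return -1;
	}
	if(timer == 0){
		/* timer: sleeps, then kills the worker */
		sleep(timeout_sec);
		kill(worker, SIGKILL);
		_exit(0);
	}

	/* parent: reap the worker, then stop the timer if it is still asleep */
	status = 0;
	waitpid(worker, &status, 0);
	kill(timer, SIGKILL);
	waitpid(timer, NULL, 0);
	return WIFSIGNALED(status) ? 1 : 0;
}
```

Calling run_with_timer(fd, 2) on a descriptor that never becomes
readable should return 1 after about two seconds. The parent stops the
timer as soon as the worker is reaped, which narrows but does not close
the window in which the timer could hit a reused process ID.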

adr.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mc017bb6a54415db715338ff2
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [9fans] syscall silently kill processes
  2022-06-20  4:41                                 ` adr
@ 2022-06-20  5:39                                   ` andrey100100100
  2022-06-20  5:59                                   ` adr
  1 sibling, 0 replies; 49+ messages in thread
From: andrey100100100 @ 2022-06-20  5:39 UTC (permalink / raw)
  To: 9fans

On Mon, 20/06/2022 at 04:41 +0000, adr wrote:
> On Mon, 20 Jun 2022, andrey100100100@gmail.com wrote:
> > The 'alarm' note is needed to interrupt a system call on timeout.
> > Since system calls in Plan 9 can be of a network nature, notes are
> > indispensable.
> > A great example of this is the read() system call on a UDP connection.
> > How else can this system call be interrupted?
> 
> Start two processes, one which just makes the call, another as a
> timer to kill it.

Yes, I had that idea, but in the thread library process IDs increase
monotonically, which is bad (the IDs can overflow) for long-lived
programs that spawn processes actively.

On the other hand, with process IDs as in the kernel (where IDs are
reused), it is also necessary to receive group notes in case a child
terminates unexpectedly.

I.e. the problem is in identifying the processes to kill.

>  But I have something in mind for a case like
> this, when all the processes are going to use the same handler
> (that's why I was asking).

It would be great if a handler could be undone (deleted from onnote, or
something similar).

>  Let me play with it a little before I
> share it.
> 
> adr.
> 


Regards,
Andrej

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M6bc161e383ef6a388d7c5b0a
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [9fans] syscall silently kill processes
  2022-06-20  4:41                                 ` adr
  2022-06-20  5:39                                   ` andrey100100100
@ 2022-06-20  5:59                                   ` adr
  2022-06-20 15:56                                     ` andrey100100100
  1 sibling, 1 reply; 49+ messages in thread
From: adr @ 2022-06-20  5:59 UTC (permalink / raw)
  To: 9fans

On Mon, 20 Jun 2022, adr wrote:
> But I have something in mind for a case like
> this, when all the processes are going to use the same handler
> (that's why I was asking). Let me play with it a little before I
> share it.

Ok, the idea is this: If in is bigger than zero in
threadnotify(int (*f)(void*, char*), int in), the handler is registered
for the calling process. If in is 0, then the handler is cleared
for the calling process. If in is -1, the handler is registered for
all processes and if in is less than -1, it is cleared for all
processes (except for those who have already registered it for themselves).
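
The four cases can be checked outside libthread with a small model.
This is only a sketch in portable C under stated assumptions: the names
model_notify, model_cancel and model_handles are hypothetical, the
caller's pid is passed in explicitly where the library would use
_threadgetproc()->pid, locking is left out, and NFN is assumed to be 33.

```c
#include <stddef.h>
#include <string.h>

enum { NFN = 33 };	/* assumed table size, for illustration only */

typedef int (*Handler)(void*, char*);

static Handler onnote[NFN];	/* a free slot has onnote == NULL ... */
static int onnotepid[NFN];	/* ... and onnotepid == 0 */

/* same test as handler_alarm in the example program */
int
alarm_handler(void *v, char *msg)
{
	(void)v;
	return strstr(msg, "alarm") != NULL;
}

/* hypothetical model of the modified threadnotify, without locks */
int
model_notify(int pid, Handler f, int in)
{
	int i, frompid, topid;
	Handler from, to;

	if(in && in > -2){
		/* register: turn a free slot into (f, pid) or (f, -1) */
		from = NULL;
		frompid = 0;
		to = f;
		topid = (in == -1) ? -1 : pid;
		for(i = 0; i < NFN; i++)
			if(onnote[i] == to && onnotepid[i] == topid)
				return 1;	/* already there, no duplicate */
	}else{
		/* clear: turn (f, pid) or (f, -1) back into a free slot */
		from = f;
		frompid = (in < -1) ? -1 : pid;
		to = NULL;
		topid = 0;
	}
	for(i = 0; i < NFN; i++)
		if(onnote[i] == from && onnotepid[i] == frompid){
			onnote[i] = to;
			onnotepid[i] = topid;
			break;
		}
	return i < NFN;
}

/* model of cancelnotes: free every slot owned by pid */
void
model_cancel(int pid)
{
	int i;

	for(i = 0; i < NFN; i++)
		if(onnotepid[i] == pid){
			onnote[i] = NULL;
			onnotepid[i] = 0;
		}
}

/* would a note for pid reach f? same slot test as in delayednotes */
int
model_handles(int pid, Handler f)
{
	int i;

	for(i = 0; i < NFN; i++)
		if(onnote[i] == f && (onnotepid[i] == pid || onnotepid[i] == -1))
			return 1;
	return 0;
}
```

In this model, model_notify(1, alarm_handler, -1) uses a single slot
and model_handles then answers yes for any pid, which is the property
the example program relies on.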

Now back to your example, as all the processes are going to use the same handler,
you just have to register it once in threadmain:

#include <u.h> 
#include <libc.h> 
#include <thread.h>

static int
handler_alarm(void *, char *msg)
{
        if(strstr(msg, "alarm"))
                return 1;
        return 0;
}

static void
proc_udp(void *)
{
        char resp[512];
        char req[] = "request";
        int fd;
        if((fd = dial("udp!185.157.221.201!5678", nil, nil, nil)) >= 0){
                if(write(fd, req, strlen(req)) == strlen(req)){
                        fprint(1, "start\n");
                        alarm(2000);
                        read(fd, resp, sizeof(resp));
                        alarm(0);
                        fprint(1, "end\n");
                }
                close(fd);
        }
        threadexits(nil);
}

int mainstacksize = 5242880;

void
threadmain(int argc, char *argv[])
{
        threadnotify(handler_alarm, -1);
        for(int i = 0; i < 80; i++)
                proccreate(proc_udp, nil, 10240);
        sleep(5000);
        threadexitsall(nil);
}

Now,
; ./5.out | grep end | wc -l
      80

Are you happy Andrej?

adr.

/sys/src/libthread/sched.c: 
[...]
                if(t == nil){
                        _threaddebug(DBGSCHED, "all threads gone; exiting");
                        cancelnotes(p->pid);
                        _schedexit(p);
                } 
[...] 
/sys/src/libthread/note.c 
[...] 
int
threadnotify(int (*f)(void*, char*), int in)
{
        int i, frompid, topid;
        int (*from)(void*, char*), (*to)(void*, char*);

        if(in && in>-2){
                /* register: a free slot (nil, pid 0) becomes (f, topid) */
                from = nil;
                frompid = 0;
                to = f;
                /* pid -1 marks a handler shared by all processes */
                topid = (in == -1)? -1 : _threadgetproc()->pid;
                lock(&onnotelock);
                /* already registered for this pid: do not insert a duplicate */
                for(i=0; i<NFN; i++)
                        if(onnote[i]==to && onnotepid[i]==topid){
                                unlock(&onnotelock);
                                return i<NFN;
                        }
                unlock(&onnotelock);
        }else{
                /* clear: the slot (f, frompid) becomes free again */
                from = f;
                frompid = (in < -1)? -1 : _threadgetproc()->pid;
                to = nil;
                topid = 0;
        }
        lock(&onnotelock);
        /* replace the first matching slot; fails if none is found */
        for(i=0; i<NFN; i++)
                if(onnote[i]==from && onnotepid[i]==frompid){
                        onnote[i] = to;
                        onnotepid[i] = topid;
                        break;
                }
        unlock(&onnotelock);
        return i<NFN;
}

void
cancelnotes(int pid)
{
        int i;

        lock(&onnotelock);
        for(i=0; i<NFN; i++)
                if(onnotepid[i] == pid){
                        onnote[i] = nil;
                        onnotepid[i] = 0;
                }
        unlock(&onnotelock);
        return;
}

static void
delayednotes(Proc *p, void *v)
{
        int i;
        Note *n;
        int (*fn)(void*, char*);

        if(!p->pending)
                return;

        p->pending = 0;
        for(n=notes; n<enotes; n++){
                if(n->proc == p){
                        for(i=0; i<NFN; i++){
                                if((onnotepid[i]!=p->pid && onnotepid[i]!=-1) || (fn = onnote[i])==nil)
                                        continue;
                                if((*fn)(v, n->s))
                                        break;
[...]
/sys/include/thread.h 
[...] 
void cancelnotes(int pid); 
[...] 

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Meda7cacd5cce9cc74c1ddaf8
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [9fans] syscall silently kill processes
  2022-06-20  5:59                                   ` adr
@ 2022-06-20 15:56                                     ` andrey100100100
  2022-06-20 22:29                                       ` Skip Tavakkolian
  0 siblings, 1 reply; 49+ messages in thread
From: andrey100100100 @ 2022-06-20 15:56 UTC (permalink / raw)
  To: 9fans

On Mon, 20/06/2022 at 05:59 +0000, adr wrote:
> On Mon, 20 Jun 2022, adr wrote:
> > But I have something in mind for a case like
> > this, when all the processes are going to use the same handler
> > (that's why I was asking). Let me play with it a little before I
> > share it.
> 
> Ok, the idea is this: If in is bigger than zero in
> threadnotify(int (*f)(void*, char*), int in), the handler is registered
> for the calling process. If in is 0, then the handler is cleared
> for the calling process. If in is -1, the handler is registered for
> all processes and if in is less than -1, it is cleared for all
> processes (except for those who have already registered it for
> themselves).
> 
> Now back to your example, as all the processes are going to use the
> same handler,
> you just have to register it once in threadmain:
> 
> #include <u.h> 
> #include <libc.h> 
> #include <thread.h>
> 
> static int
> handler_alarm(void *, char *msg)
> {
>         if(strstr(msg, "alarm"))
>                 return 1;
>         return 0;
> }
> 
> static void
> proc_udp(void *)
> {
>         char resp[512];
>         char req[] = "request";
>         int fd;
>         if((fd = dial("udp!185.157.221.201!5678", nil, nil, nil)) >= 0){
>                 if(write(fd, req, strlen(req)) == strlen(req)){
>                         fprint(1, "start\n");
>                         alarm(2000);
>                         read(fd, resp, sizeof(resp));
>                         alarm(0);
>                         fprint(1, "end\n");
>                 }
>                 close(fd);
>         }
>         threadexits(nil);
> }
> 
> int mainstacksize = 5242880;
> 
> void
> threadmain(int argc, char *argv[])
> {
>         threadnotify(handler_alarm, -1);
>         for(int i = 0; i < 80; i++)
>                 proccreate(proc_udp, nil, 10240);
>         sleep(5000);
>         threadexitsall(nil);
> }
> Now,
> ; ./5.out | grep end | wc -l
>       80
> 
> Are you happy Andrej?


Yes. Thank you very much! It's working!

How convenient it is to use remains to be seen; more experiments are needed.



Regards,
Andrej

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M3802ed594c69b660a4210973
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [9fans] syscall silently kill processes
  2022-06-20 15:56                                     ` andrey100100100
@ 2022-06-20 22:29                                       ` Skip Tavakkolian
  2022-06-21  7:07                                         ` andrey100100100
  2022-06-21  7:22                                         ` adr
  0 siblings, 2 replies; 49+ messages in thread
From: Skip Tavakkolian @ 2022-06-20 22:29 UTC (permalink / raw)
  To: 9fans

It's cleaner to use channels with separate io and timer threads that
do their syscalls via ioproc; this one doesn't require any changes to libthread:

https://gist.github.com/9nut/aaa9b9b6a22d69996b75ccdc6e615c61



------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M99eed0ec152c6bbad2332628
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [9fans] syscall silently kill processes
  2022-06-20 22:29                                       ` Skip Tavakkolian
@ 2022-06-21  7:07                                         ` andrey100100100
  2022-06-21 11:26                                           ` adr
  2022-06-21  7:22                                         ` adr
  1 sibling, 1 reply; 49+ messages in thread
From: andrey100100100 @ 2022-06-21  7:07 UTC (permalink / raw)
  To: 9fans

On Mon, 20/06/2022 at 15:29 -0700, Skip Tavakkolian wrote:
> It's cleaner to use channels with separate io and timer threads that
> do their syscalls via ioproc; this one doesn't require any changes to
> libthread:
> 
> https://gist.github.com/9nut/aaa9b9b6a22d69996b75ccdc6e615c61

Thanks for the work you've done!
Yes, I have considered this possibility,
but it was precisely this kind of code bloat that I wanted to avoid.

And yet libthread needs fixing. What adr suggested is, at first
glance, acceptable.
Otherwise we end up working around libthread's problems by using
libthread...
By not fixing libthread, we reduce the number of use cases for it,
which is bad.


PS. A philosophical question: should someone else monitor the process,
or should the process look after itself?



Regards,
Andrej

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M00318c8e21c28c4ba97042ba
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [9fans] syscall silently kill processes
  2022-06-20 22:29                                       ` Skip Tavakkolian
  2022-06-21  7:07                                         ` andrey100100100
@ 2022-06-21  7:22                                         ` adr
  1 sibling, 0 replies; 49+ messages in thread
From: adr @ 2022-06-21  7:22 UTC (permalink / raw)
  To: 9fans

On Mon, 20 Jun 2022, Skip Tavakkolian wrote:
> It's cleaner to use channels with separate io and timer threads that
> do their syscalls via ioproc; this one doesn't require any changes to libthread:
>
> https://gist.github.com/9nut/aaa9b9b6a22d69996b75ccdc6e615c61

Nice example, but I strongly recommend changing libthread or removing
notes support. Right now:

If a process tries to remove a handler that was registered earlier by
another process, it removes the handler of that process instead of
its own.

When a process exits, its handlers are not cleared, so unless the
process calls threadnotify to remove them, their slots are wasted for
the rest of the program. It would also make sense to clear them all
at exit instead of doing it explicitly for each handler.

threadnotify will insert duplicate handlers, wasting the limited
space in onnote[] and onnotepid[]. Note that those duplicates are
completely ignored; they just occupy space.
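
To make the three problems above concrete, here is a small portable C
model of the table logic that the patch later in this thread removes
from note.c. It is only a sketch: the model_* names are invented for
illustration, the pid argument stands in for _threadgetproc()->pid,
and the lock is left out. NFN is 33 as in note.c, which would also
explain why only about 33 of the 80 processes in the original report
printed "end".

```c
#include <stddef.h>

#define NFN 33	/* size of the shared table, as in note.c */

typedef int (*Handler)(void*, char*);

static Handler onnote[NFN];	/* one table shared by every process */
static int onnotepid[NFN];	/* pid that owns each slot */

/* Model of the current threadnotify(); pid replaces _threadgetproc()->pid. */
int
model_threadnotify(Handler f, int in, int pid)
{
	int i, topid;
	Handler from, to;

	if(in){
		from = NULL;	/* registering: take the first empty slot */
		to = f;
		topid = pid;
	}else{
		from = f;	/* removing: take the first slot holding f, */
		to = NULL;	/* whoever owns it */
		topid = 0;
	}
	for(i = 0; i < NFN; i++)
		if(onnote[i] == from){
			onnote[i] = to;
			onnotepid[i] = topid;
			break;
		}
	return i < NFN;
}

/* Owner of slot i, 0 if the slot is free. */
int
model_slotpid(int i)
{
	return onnotepid[i];
}

/* Stand-in handler; it only needs to have an address. */
int
model_handler(void *v, char *msg)
{
	(void)v;
	(void)msg;
	return 1;
}

/* Register the same handler from pids 1..n; return how many succeeded. */
int
model_register_many(int n)
{
	int pid, ok;

	ok = 0;
	for(pid = 1; pid <= n; pid++)
		ok += model_threadnotify(model_handler, 1, pid);
	return ok;
}
```

With 80 registrations only the first 33 find a slot. A removal issued
by a process that never got a slot still succeeds and clears slot 0,
which belongs to another process, and a second registration by the
same process simply takes another slot.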

If you are using notes and all your processes share the same array
of handlers, doesn't it make sense to be able to register handlers
that can be used by all of them?

Whether the notes mechanism really fits in libthread is another
discussion. As I said, maybe it is better to remove notes support
completely and force the programmer to exploit the API. Yours is a
good example.

Regards,
adr.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mc9df1d1014565db0ee016754
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] syscall silently kill processes
  2022-06-21  7:07                                         ` andrey100100100
@ 2022-06-21 11:26                                           ` adr
  2022-06-21 13:03                                             ` andrey100100100
  2022-06-21 13:47                                             ` andrey100100100
  0 siblings, 2 replies; 49+ messages in thread
From: adr @ 2022-06-21 11:26 UTC (permalink / raw)
  To: 9fans

On Tue, 21 Jun 2022, andrey100100100@gmail.com wrote:
> On Mon, 20/06/2022 at 15:29 -0700, Skip Tavakkolian wrote:
>> It's cleaner to use channels with separate io and timer threads that
>> do their syscalls via ioproc; this one doesn't require any changes to
>> libthread:
>>
>> https://gist.github.com/9nut/aaa9b9b6a22d69996b75ccdc6e615c61
>
> Thanks for the work you've done!
> Yes, I have considered this possibility.
> But it was precisely this kind of code bloat that I wanted to avoid.

It looks like code bloat, but it really isn't. It does the job
with the tools of the API, following the paradigm libthread was
designed around. That's why the word "cleaner" is completely correct.

I think note.c was added to solve some particular case and, judging
by its state, I don't think it has been used much.

For example, let's remove note.c. You could obtain the same result
in your example (all processes using the same handler) with atnotify,
because the note handlers are registered for the children as well
when proccreate uses rfork:

void
threadmain(int argc, char *argv[])
{
        atnotify(handler_alarm, 1);

./5.out | grep end | wc -l
        80

If you need a different handler for each process you can't use
atnotify because of RFMEM (the processes share the data segment, and
with it atnotify's handler table), but you can use the notify and
noted system calls:

#include <u.h> 
#include <libc.h> 
#include <thread.h>

static void
handler_alarm(void *, char *msg)
{
        if(strstr(msg, "alarm")){
                print("yes");
                noted(NCONT);
                return; /* just in case */
        }
        noted(NDFLT);
}

static void
proc_udp(void *)
{
        char resp[512];
        char req[] = "request";
        int fd;
        notify(handler_alarm);
        if((fd = dial("udp!185.157.221.201!5678", nil, nil, nil)) >= 0){
                if(write(fd, req, strlen(req)) == strlen(req)){
                        fprint(1, "start\n");
                        alarm(2000);
                        read(fd, resp, sizeof(resp));
                        alarm(0);
                        fprint(1, "end\n");
                }
                close(fd);
        }
        threadexits(nil);
}

int mainstacksize = 5242880;

void
threadmain(int argc, char *argv[])
{
        for(int i = 0; i < 80; i++)
                proccreate(proc_udp, nil, 10240);
        sleep(5000);
        threadexitsall(nil);
}

./5.out | grep end | wc -l
        80

Threadnotify tries to be an atnotify that works with RFMEM, but to
do that onnote should be allocated so that it can grow or shrink (or
be sized for the maximum number of processes the program could spawn,
not for the number of handlers a single process could register, as in
atnotify). Instead of an array of pointers to handlers, it should be
an array of pointers to arrays of handlers, each allocated by its own
process.

Now again, does the notes mechanism actually fit in libthread? If
it does, it should be fixed; if not, removed.

adr.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mf12e91abda54e653de320337
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] syscall silently kill processes
  2022-06-21 11:26                                           ` adr
@ 2022-06-21 13:03                                             ` andrey100100100
  2022-06-21 13:22                                               ` adr
  2022-06-21 13:47                                             ` andrey100100100
  1 sibling, 1 reply; 49+ messages in thread
From: andrey100100100 @ 2022-06-21 13:03 UTC (permalink / raw)
  To: 9fans

On Tue, 21/06/2022 at 11:26 +0000, adr wrote:
> On Tue, 21 Jun 2022, andrey100100100@gmail.com wrote:
> > On Mon, 20/06/2022 at 15:29 -0700, Skip Tavakkolian wrote:
> > > It's cleaner to use channels with separate io and timer threads
> > > that
> > > do their syscalls via ioproc; this one doesn't require any
> > > changes to
> > > libthread:
> > > 
> > > https://gist.github.com/9nut/aaa9b9b6a22d69996b75ccdc6e615c61
> > 
> > Thanks for the work you've done!
> > Yes, I have considered this possibility.
> > But it was precisely this kind of code bloat that I wanted to
> > avoid.
> 
> It looks like code bloat, but it really isn't. It is doing the job
> with the tools of the api according to the paradigm designed in
> libthread. That's why the word "cleaner" is completely correct.

Yes, I'm ready to agree: this solution is more independent of
low-level features and more obvious.

But I would like more compact code.

I wonder how threadnotify() in plan9port will behave...

> 
> I think note.c was added to resolve some particular case, and for
> the state of note.c, I don't think it has been used too much.
> 
> For example, let's remove note.c. You could obtain the same result
> in your example (all processes using the same handler) using atnotify
> because the notes are registered to the children when proccreate
> uses rfork:
> 
> void
> threadmain(int argc, char *argv[])
> {
>         atnotify(handler_alarm, 1);
> 
> ./5.out | grep end | wc -l
>         80
> 
> If you have to use a different handler for each processes you can't
> use atnotify because of RFMEM, but you can use the syscalls notify
> and noted:
> 
> #include <u.h> 
> #include <libc.h> 
> #include <thread.h>
> 
> static void
> handler_alarm(void *, char *msg)
> {
>         if(strstr(msg, "alarm")){
>                 print("yes");
>                 noted(NCONT);
>                 return; /* just in case */
>         }
>         noted(NDFLT);
> }
> 
> static void
> proc_udp(void *)
> {
>         char resp[512];
>         char req[] = "request";
>         int fd;
>         notify(handler_alarm);
>         if((fd = dial("udp!185.157.221.201!5678", nil, nil, nil)) >=
> 0){
>                 if(write(fd, req, strlen(req)) == strlen(req)){
>                         fprint(1, "start\n");
>                         alarm(2000);
>                         read(fd, resp, sizeof(resp));
>                         alarm(0);
>                         fprint(1, "end\n");
>                 }
>                 close(fd);
>         }
>         threadexits(nil);
> }
> 
> int mainstacksize = 5242880;
> 
> void
> threadmain(int argc, char *argv[])
> {
>         for(int i = 0; i < 80; i++)
>                 proccreate(proc_udp, nil, 10240);
>         sleep(5000);
>         threadexitsall(nil);
>   }
> 
> ./5.out | grep end | wc -l
>         80
> 
> Threadnotify is trying to do an atnotify that works with RFMEM,
> but to do that onnote should be allocated to grow or shrink (or
> have a size thinking in the maximum number of processes the program
> could spawn, not the number of handlers a process could register
> as in atnotify), instead of pointers to handlers, it should be an
> array of pointers to arrays of handlers allocated by each process.
> 
> Now again, does the notes mechanism actually fit in libthread? If
> it does it should be fixed, if not removed.


I vote for the fix.
Perhaps the notification is being used somewhere or by someone.


> 
> adr.
> 

Regards,
Andrej

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mf8c7cfd50b4091a520d792e5
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] syscall silently kill processes
  2022-06-21 13:03                                             ` andrey100100100
@ 2022-06-21 13:22                                               ` adr
  2022-06-28 15:28                                                 ` adr
  0 siblings, 1 reply; 49+ messages in thread
From: adr @ 2022-06-21 13:22 UTC (permalink / raw)
  To: 9fans

On Tue, 21 Jun 2022, andrey100100100@gmail.com wrote:

>> For example, let's remove note.c. You could obtain the same result

Just for clarity, you actually don't need to remove note.c to do
what I said below.

>> in your example (all processes using the same handler) using atnotify
>> because the notes are registered to the children when proccreate
>> uses rfork:
>>
>> void
>> threadmain(int argc, char *argv[])
>> {
>>         atnotify(handler_alarm, 1);
>>
>> ./5.out | grep end | wc -l
>>         80
>>
>> If you have to use a different handler for each processes you can't
>> use atnotify because of RFMEM, but you can use the syscalls notify
>> and noted:
>>
>> #include <u.h>
>> #include <libc.h>
>> #include <thread.h>
>>
>> static void
>> handler_alarm(void *, char *msg)
>> {
>>         if(strstr(msg, "alarm")){
>>                 print("yes");
>>                 noted(NCONT);
>>                 return; /* just in case */
>>         }
>>         noted(NDFLT);
>> }
>>
>> static void
>> proc_udp(void *)
>> {
>>         char resp[512];
>>         char req[] = "request";
>>         int fd;
>>         notify(handler_alarm);
>>         if((fd = dial("udp!185.157.221.201!5678", nil, nil, nil)) >=
>> 0){
>>                 if(write(fd, req, strlen(req)) == strlen(req)){
>>                         fprint(1, "start\n");
>>                         alarm(2000);
>>                         read(fd, resp, sizeof(resp));
>>                         alarm(0);
>>                         fprint(1, "end\n");
>>                 }
>>                 close(fd);
>>         }
>>         threadexits(nil);
>> }
>>
>> int mainstacksize = 5242880;
>>
>> void
>> threadmain(int argc, char *argv[])
>> {
>>         for(int i = 0; i < 80; i++)
>>                 proccreate(proc_udp, nil, 10240);
>>         sleep(5000);
>>         threadexitsall(nil);
>>   }
>>
>> ./5.out | grep end | wc -l
>>         80
>>
>> Threadnotify is trying to do an atnotify that works with RFMEM,
>> but to do that onnote should be allocated to grow or shrink (or
>> have a size thinking in the maximum number of processes the program
>> could spawn, not the number of handlers a process could register
>> as in atnotify), instead of pointers to handlers, it should be an
>> array of pointers to arrays of handlers allocated by each process.
>>
>> Now again, does the notes mechanism actually fit in libthread? If
>> it does it should be fixed, if not removed.
>
>
> I vote for the fix.
> Perhaps the notification is being used somewhere or by someone.
>
>
>>
>> adr.
>>
>
> Regards,
> Andrej

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M198ebde18601eb82a1a5b8d8
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] syscall silently kill processes
  2022-06-21 11:26                                           ` adr
  2022-06-21 13:03                                             ` andrey100100100
@ 2022-06-21 13:47                                             ` andrey100100100
  1 sibling, 0 replies; 49+ messages in thread
From: andrey100100100 @ 2022-06-21 13:47 UTC (permalink / raw)
  To: 9fans

On Tue, 21/06/2022 at 11:26 +0000, adr wrote:
> On Tue, 21 Jun 2022, andrey100100100@gmail.com wrote:
> > On Mon, 20/06/2022 at 15:29 -0700, Skip Tavakkolian wrote:
> > > It's cleaner to use channels with separate io and timer threads
> > > that
> > > do their syscalls via ioproc; this one doesn't require any
> > > changes to
> > > libthread:
> > > 
> > > https://gist.github.com/9nut/aaa9b9b6a22d69996b75ccdc6e615c61
> > 
> > Thanks for the work you've done!
> > Yes, I have considered this possibility.
> > But it was precisely this kind of code bloat that I wanted to
> > avoid.
> 
> It looks like code bloat, but it really isn't. It is doing the job
> with the tools of the api according to the paradigm designed in
> libthread. That's why the word "cleaner" is completely correct.

And one more problem:

cpu% 6.out -a 'udp!52.43.121.77!10011' -n 250 -t 7 | grep time | wc -l
6.out 50974: warning: process exceeds 100 file descriptors
6.out 51053: warning: process exceeds 200 file descriptors
6.out 51238: warning: process exceeds 300 file descriptors
6.out 51314: warning: process exceeds 400 file descriptors
6.out 51342: warning: process exceeds 500 file descriptors
6.out 51414: warning: process exceeds 600 file descriptors
6.out 51158: warning: process exceeds 700 file descriptors
    250


cpu% ps | grep 6.out | wc -l
    751


It consumes three times more resources than it should.

> 
> I think note.c was added to resolve some particular case, and for
> the state of note.c, I don't think it has been used too much.
> 
> For example, let's remove note.c. You could obtain the same result
> in your example (all processes using the same handler) using atnotify
> because the notes are registered to the children when proccreate
> uses rfork:
> 
> void
> threadmain(int argc, char *argv[])
> {
>         atnotify(handler_alarm, 1);
> 
> ./5.out | grep end | wc -l
>         80
> 
> If you have to use a different handler for each processes you can't
> use atnotify because of RFMEM, but you can use the syscalls notify
> and noted:
> 
> #include <u.h> 
> #include <libc.h> 
> #include <thread.h>
> 
> static void
> handler_alarm(void *, char *msg)
> {
>         if(strstr(msg, "alarm")){
>                 print("yes");
>                 noted(NCONT);
>                 return; /* just in case */
>         }
>         noted(NDFLT);
> }
> 
> static void
> proc_udp(void *)
> {
>         char resp[512];
>         char req[] = "request";
>         int fd;
>         notify(handler_alarm);
>         if((fd = dial("udp!185.157.221.201!5678", nil, nil, nil)) >=
> 0){
>                 if(write(fd, req, strlen(req)) == strlen(req)){
>                         fprint(1, "start\n");
>                         alarm(2000);
>                         read(fd, resp, sizeof(resp));
>                         alarm(0);
>                         fprint(1, "end\n");
>                 }
>                 close(fd);
>         }
>         threadexits(nil);
> }
> 
> int mainstacksize = 5242880;
> 
> void
> threadmain(int argc, char *argv[])
> {
>         for(int i = 0; i < 80; i++)
>                 proccreate(proc_udp, nil, 10240);
>         sleep(5000);
>         threadexitsall(nil);
>   }
> 
> ./5.out | grep end | wc -l
>         80
> 
> Threadnotify is trying to do an atnotify that works with RFMEM,
> but to do that onnote should be allocated to grow or shrink (or
> have a size thinking in the maximum number of processes the program
> could spawn, not the number of handlers a process could register
> as in atnotify), instead of pointers to handlers, it should be an
> array of pointers to arrays of handlers allocated by each process.
> 
> Now again, does the notes mechanism actually fit in libthread? If
> it does it should be fixed, if not removed.
> 


Regards,
Andrej


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M813e896da7e1be8ab8338198
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] syscall silently kill processes
  2022-06-21 13:22                                               ` adr
@ 2022-06-28 15:28                                                 ` adr
  2022-06-28 16:43                                                   ` ori
                                                                     ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: adr @ 2022-06-28 15:28 UTC (permalink / raw)
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 11131 bytes --]

Andrey, if you want to use different note handlers per process (with a big
number of processes) using libthread, this may be helpful.

The idea is this:

An array of handlers for all processes which can be changed by all processes.
When a note is received by a process, this array takes priority.

An array of pointers to structures of the type

struct Onnote
{
        int pid;
        int (*fn[NFN])(void*, char*);
};

initially of size PPCHUNK (I set it to 100; experiment with that),
which can grow if necessary (but not shrink; I think that would be
overkill).

These structures are allocated the first time a process registers a
handler and freed when the process exits (or by calling
threadcancelnotes(); note that this function can free other
processes' handlers, so it may be better to add some restrictions).
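
The allocation scheme just described can be sketched in portable C as
follows. This is an illustration under assumptions, not the patch
itself: slotfor, cancelslot, tablesize and fillslots are hypothetical
names, calloc stands in for mallocz, and locking is omitted. One
detail the sketch handles on purpose: memset takes a size in bytes,
so the newly added tail is zeroed with PPCHUNK * sizeof of a pointer,
whereas the memset call in the diff below passes an element count.

```c
#include <stdlib.h>
#include <string.h>

#define PPCHUNK 100	/* growth step, as in the patch */
#define NFN 33		/* handlers per process, as in the patch */

typedef struct Onnote Onnote;
struct Onnote
{
	int pid;
	int (*fn[NFN])(void*, char*);
};

static Onnote **onnote;	/* table of per-process entries, NULL when free */
static int onnotesize;	/* number of pointers in the table */

/* Find or create the entry for pid; returns NULL if out of memory. */
Onnote*
slotfor(int pid)
{
	int i, freei;
	Onnote **t;

	freei = -1;
	for(i = 0; i < onnotesize; i++){
		if(onnote[i] != NULL && onnote[i]->pid == pid)
			return onnote[i];
		if(onnote[i] == NULL && freei < 0)
			freei = i;
	}
	if(freei < 0){
		/* no free slot: grow by PPCHUNK pointers, zero the new tail */
		t = realloc(onnote, (onnotesize + PPCHUNK) * sizeof *t);
		if(t == NULL)
			return NULL;
		memset(t + onnotesize, 0, PPCHUNK * sizeof *t);
		onnote = t;
		freei = onnotesize;
		onnotesize += PPCHUNK;
	}
	onnote[freei] = calloc(1, sizeof(Onnote));
	if(onnote[freei] == NULL)
		return NULL;
	onnote[freei]->pid = pid;
	return onnote[freei];
}

/* Free the entry of pid, if any; the table itself never shrinks. */
void
cancelslot(int pid)
{
	int i;

	for(i = 0; i < onnotesize; i++)
		if(onnote[i] != NULL && onnote[i]->pid == pid){
			free(onnote[i]);
			onnote[i] = NULL;
			break;
		}
}

/* Current capacity of the table, in entries. */
int
tablesize(void)
{
	return onnotesize;
}

/* Create entries for pids first..first+n-1; return how many succeeded. */
int
fillslots(int first, int n)
{
	int i, ok;

	ok = 0;
	for(i = 0; i < n; i++)
		if(slotfor(first + i) != NULL)
			ok++;
	return ok;
}
```

Filling 101 entries grows the table twice, to 200 pointers; freeing
one entry and adding a new pid reuses the freed slot without growing
the table again.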

The meaning of "in" in threadnotify(int (*f)(void*, char*), int in) is:

in > 0 : set the handler for the calling process.
in == 0 : clear the handler for the calling process.
in == -1 : clear the handler for all processes (except those that have
            already registered it for themselves).
in < -1 : set the handler for all processes.
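
Read as code, the four cases above amount to the following
classification. This is only a sketch: the enum constants and the
notifyaction function are names made up here and are not part of
libthread or of the patch.

```c
/* Hypothetical names for the four cases of the "in" argument. */
enum
{
	NotifySetSelf,		/* in > 0 */
	NotifyClearSelf,	/* in == 0 */
	NotifyClearAll,		/* in == -1 */
	NotifySetAll		/* in < -1 */
};

/* Map the "in" argument of threadnotify to the action it selects. */
int
notifyaction(int in)
{
	if(in > 0)
		return NotifySetSelf;
	if(in == 0)
		return NotifyClearSelf;
	if(in == -1)
		return NotifyClearAll;
	return NotifySetAll;
}
```

So existing callers that pass 1 or 0 keep their meaning, and with the
patch applied a call such as threadnotify(f, -2) would install f for
every process.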

There is no use of threadnotify with "in < 0" in /sys/src, so nothing is broken.

As you are using 9front and they serve their sources over 9p, here
is a diff against their sources. I haven't compiled it on 9front,
though. Note that if you want to compile the system with these
changes, you have to eliminate the copy of note.c at
/sys/src/cmd/execnet (it seems that note.c was added afterwards, as
I thought).

I haven't tested it much; this has been more of a time-killing
pastime.

adr
--- /tmp/main.c
+++ /sys/src/libthread/main.c
@@ -28,6 +28,10 @@
        _qlockinit(_threadrendezvous);
        _sysfatal = _threadsysfatal;
        __assert = _threadassert;
+       onnote = mallocz(PPCHUNK*sizeof(uintptr), 1);
+       if(!onnote)
+               sysfatal("Malloc of size %d failed: %r", PPCHUNK*sizeof(uintptr));
+       onnotesize = PPCHUNK;
        notify(_threadnote);
        if(mainstacksize == 0)
                mainstacksize = 8*1024;
--- /tmp/note.c
+++ /sys/src/libthread/note.c
@@ -5,9 +5,9 @@

  int   _threadnopasser;

-#define        NFN             33
  #define       ERRLEN  48
  typedef struct Note Note;
+
  struct Note
  {
        Lock            inuse;
@@ -17,62 +17,155 @@

  static Note   notes[128];
  static Note   *enotes = notes+nelem(notes);
-static int             (*onnote[NFN])(void*, char*);
-static int             onnotepid[NFN];
+Onnote **onnote;
+int onnotesize;
+static int (*onnoteall[NFN])(void*, char*);
  static Lock   onnotelock;

  int
  threadnotify(int (*f)(void*, char*), int in)
  {
-       int i, topid;
-       int (*from)(void*, char*), (*to)(void*, char*);
+       int i, j;

-       if(in){
-               from = nil;
-               to = f;
-               topid = _threadgetproc()->pid;
-       }else{
-               from = f;
-               to = nil;
-               topid = 0;
-       }
        lock(&onnotelock);
-       for(i=0; i<NFN; i++)
-               if(onnote[i]==from){
-                       onnote[i] = to;
-                       onnotepid[i] = topid;
+
+       /* add note for all processes */
+       if(in < -1){
+               for(i=0; i<NFN; i++)
+                       if(onnoteall[i] == f){
+                               unlock(&onnotelock);
+                               return 1;
+                       }
+               for(i=0; i<NFN; i++)
+                       if(onnoteall[i] == nil){
+                               onnoteall[i] = f;
+                               break;
+                       }
+               unlock(&onnotelock);
+               return i<NFN;
+       }
+
+       /* remove note for all processes */
+       if(in == -1){
+               for(i=0; i<NFN; i++)
+                       if(onnoteall[i] == f){
+                               onnoteall[i] = nil;
+                               break;
+                       }
+               unlock(&onnotelock);
+               return i<NFN;
+       }
+
+       /* remove note for current process */
+       if(!in){
+               for(i=0; i<onnotesize; i++){
+                       if(onnote[i]!=nil && onnote[i]->pid==_threadgetproc()->pid){
+                               for(j=0; j<NFN; j++){
+                                       if(onnote[i]->fn[j] == f){
+                                               onnote[i]->fn[j] = 0;
+                                               break;
+                                       }
+                               }
+                               unlock(&onnotelock);
+                               return j<NFN;
+                       }
+               }
+               unlock(&onnotelock);
+               return i<onnotesize;
+       }
+
+       /* add note for current process */
+       for(i=0; i<onnotesize; i++)
+               if(onnote[i] && onnote[i]->pid==_threadgetproc()->pid)
+                       break;
+
+       /* process has already a slot */
+       if(i < onnotesize){
+               for(j=0; j<NFN; j++){
+                       if(onnote[i]->fn[j] == nil){
+                               onnote[i]->fn[j] = f;
+                               break;
+                       }
+               }
+               unlock(&onnotelock);
+               return j<NFN;
+ 
+       }
+
+       for(i=0; i<onnotesize; i++)
+               if(!onnote[i])
+                       break;
+
+       /* there is no free slot */
+       if(i == onnotesize){
+               onnotesize += PPCHUNK;
+               onnote = realloc(onnote, onnotesize*sizeof(uintptr));
+               if(!onnote){
+                       unlock(&onnotelock);
+                       sysfatal("Malloc of size %d failed: %r", onnotesize*sizeof(uintptr));
+               }
+               memset(onnote+i+1, 0, PPCHUNK-1);
+       }
+
+       onnote[i]=mallocz(sizeof(Onnote), 1);
+       if(!onnote[i]){
+               unlock(&onnotelock);
+               sysfatal("Malloc of size %d failed: %r", sizeof(Onnote));
+       }
+       onnote[i]->pid = _threadgetproc()->pid;
+       onnote[i]->fn[0] = f;
+       unlock(&onnotelock);
+       return 1;
+}
+
+void
+threadcancelnotes(int pid)
+{
+       int i;
+
+       lock(&onnotelock);
+       for(i=0; i<onnotesize; i++)
+               if(onnote[i] && onnote[i]->pid==pid){
+                       free(onnote[i]);
+                       onnote[i] = nil;
                        break;
                }
        unlock(&onnotelock);
-       return i<NFN;
+       return;
  }

  static void
  delayednotes(Proc *p, void *v)
  {
-       int i;
+       int i, j, all;
        Note *n;
-       char s[ERRMAX];
-       int (*fn)(void*, char*);
+       int (*f)(void*, char*);

        if(!p->pending)
                return;

        p->pending = 0;
+       all = j = 0;
        for(n=notes; n<enotes; n++){
                if(n->proc == p){
-                       strcpy(s, n->s);
-                       n->proc = nil;
-                       unlock(&n->inuse);
-
-                       for(i=0; i<NFN; i++){
-                               if(onnotepid[i]!=p->pid || (fn = onnote[i])==nil)
-                                       continue;
-                               if((*fn)(v, s))
-                                       break;
+                       for(i=0; i<NFN; i++)
+                               if(f=onnoteall[i])
+                                       if((*f)(v, n->s)){
+                                               all = 1;
+                                               break;
+                                       }
+                       if(!all){
+                               for(i=0; i<onnotesize; i++)
+                                       if(onnote[i] && onnote[i]->pid==p->pid){
+                                               for(j=0; j<NFN; j++)
+                                                       if(f=onnote[i]->fn[j])
+                                                               if((*f)(v, n->s))
+                                                                       break;
+                                               break;
+                                       }
                        }
-                       if(i==NFN){
-                               _threaddebug(DBGNOTE, "Unhandled note %s, proc %p", n->s, p);
+                       if(!all && (i==onnotesize || j==NFN)){
+                               _threaddebug(DBGNOTE, "Unhandled note %s, proc %p\n", n->s, p);
                                if(v != nil)
                                        noted(NDFLT);
                                else if(strncmp(n->s, "sys:", 4)==0)
@@ -79,6 +172,8 @@
                                        abort();
                                threadexitsall(n->s);
                        }
+                       n->proc = nil;
+                       unlock(&n->inuse);
                }
        }
  }
@@ -94,7 +189,7 @@
                noted(NDFLT);

        if(_threadexitsallstatus){
-               _threaddebug(DBGNOTE, "Threadexitsallstatus = '%s'", _threadexitsallstatus);
+               _threaddebug(DBGNOTE, "Threadexitsallstatus = '%s'\n", _threadexitsallstatus);
                _exits(_threadexitsallstatus);
        }

--- /tmp/sched.c
+++ /sys/src/libthread/sched.c
@@ -157,6 +157,7 @@
                t = runthread(p);
                if(t == nil){
                        _threaddebug(DBGSCHED, "all threads gone; exiting");
+                       threadcancelnotes(p->pid);
                        unlinkproc(p);
                        _schedexit(p);  /* frees proc */
                }
--- /tmp/thread.h
+++ /sys/include/thread.h
@@ -97,6 +97,7 @@
  void  threadkillgrp(int);     /* kill threads in group */
  void  threadmain(int argc, char *argv[]);
  int   threadnotify(int (*f)(void*, char*), int in);
+void threadcancelnotes(int pid);
  int   threadid(void);
  int   threadpid(int);
  int   threadsetgrp(int);              /* set thread group, return old */
--- /tmp/threadimpl.h
+++ /sys/src/libthread/threadimpl.h
@@ -192,3 +192,15 @@
  #define       _threaddebug(flag, ...) if((_threaddebuglevel&(flag))==0){}else _threadprint(__VA_ARGS__)

  #define ioproc_arg(io, type)  (va_arg((io)->arg, type))
+
+#define        PPCHUNK 100
+#define        NFN 33
+typedef struct Onnote Onnote;
+struct Onnote
+{
+       int pid;
+       int (*fn[NFN])(void*, char*);
+};
+extern Onnote **onnote;
+extern int onnotesize;
+void _threadnote(void*, char*);
------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M401cfa47db4a93cc273cac83
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-diff; name=thread.patch, Size: 6088 bytes --]

--- /tmp/main.c
+++ /sys/src/libthread/main.c
@@ -28,6 +28,10 @@
 	_qlockinit(_threadrendezvous);
 	_sysfatal = _threadsysfatal;
 	__assert = _threadassert;
+	onnote = mallocz(PPCHUNK*sizeof(uintptr), 1);
+	if(!onnote)
+		sysfatal("Malloc of size %d failed: %r", PPCHUNK*sizeof(uintptr));
+	onnotesize = PPCHUNK;
 	notify(_threadnote);
 	if(mainstacksize == 0)
 		mainstacksize = 8*1024;
--- /tmp/note.c
+++ /sys/src/libthread/note.c
@@ -5,9 +5,9 @@
 
 int	_threadnopasser;
 
-#define	NFN		33
 #define	ERRLEN	48
 typedef struct Note Note;
+
 struct Note
 {
 	Lock		inuse;
@@ -17,62 +17,155 @@
 
 static Note	notes[128];
 static Note	*enotes = notes+nelem(notes);
-static int		(*onnote[NFN])(void*, char*);
-static int		onnotepid[NFN];
+Onnote **onnote;
+int onnotesize;
+static int (*onnoteall[NFN])(void*, char*);
 static Lock	onnotelock;
 
 int
 threadnotify(int (*f)(void*, char*), int in)
 {
-	int i, topid;
-	int (*from)(void*, char*), (*to)(void*, char*);
+	int i, j;
 
-	if(in){
-		from = nil;
-		to = f;
-		topid = _threadgetproc()->pid;
-	}else{
-		from = f;
-		to = nil;
-		topid = 0;
-	}
 	lock(&onnotelock);
-	for(i=0; i<NFN; i++)
-		if(onnote[i]==from){
-			onnote[i] = to;
-			onnotepid[i] = topid;
+
+	/* add note for all processes */
+	if(in < -1){
+		for(i=0; i<NFN; i++)
+			if(onnoteall[i] == f){
+				unlock(&onnotelock);
+				return 1;
+			}
+		for(i=0; i<NFN; i++)
+			if(onnoteall[i] == nil){
+				onnoteall[i] = f;
+				break;
+			}
+		unlock(&onnotelock);
+		return i<NFN;
+	}
+
+	/* remove note for all processes */
+	if(in == -1){
+		for(i=0; i<NFN; i++)
+			if(onnoteall[i] == f){
+				onnoteall[i] = nil;
+				break;
+			}
+		unlock(&onnotelock);
+		return i<NFN;
+	}
+
+	/* remove note for current process */
+	if(!in){
+		for(i=0; i<onnotesize; i++){
+			if(onnote[i]!=nil && onnote[i]->pid==_threadgetproc()->pid){
+				for(j=0; j<NFN; j++){
+					if(onnote[i]->fn[j] == f){
+						onnote[i]->fn[j] = 0;
+						break;
+					}
+				}
+				unlock(&onnotelock);
+				return j<NFN;
+			}
+		}
+		unlock(&onnotelock);
+		return i<onnotesize;
+	}
+
+	/* add note for current process */
+	for(i=0; i<onnotesize; i++)
+		if(onnote[i] && onnote[i]->pid==_threadgetproc()->pid)
+			break;
+
+	/* process has already a slot */
+	if(i < onnotesize){
+		for(j=0; j<NFN; j++){
+			if(onnote[i]->fn[j] == nil){
+				onnote[i]->fn[j] = f;
+				break;
+			}
+		}
+		unlock(&onnotelock);
+		return j<NFN;
+		
+	}
+
+	for(i=0; i<onnotesize; i++)
+		if(!onnote[i])
+			break;
+
+	/* there is no free slot */
+	if(i == onnotesize){
+		onnotesize += PPCHUNK;
+		onnote = realloc(onnote, onnotesize*sizeof(uintptr));
+		if(!onnote){
+			unlock(&onnotelock);
+			sysfatal("Malloc of size %d failed: %r", onnotesize*sizeof(uintptr));
+		}
+		memset(onnote+i+1, 0, PPCHUNK-1);
+	}
+
+	onnote[i]=mallocz(sizeof(Onnote), 1);
+	if(!onnote[i]){
+		unlock(&onnotelock);
+		sysfatal("Malloc of size %d failed: %r", sizeof(Onnote));
+	}
+	onnote[i]->pid = _threadgetproc()->pid;
+	onnote[i]->fn[0] = f;
+	unlock(&onnotelock);
+	return 1;
+}
+
+void
+threadcancelnotes(int pid)
+{
+	int i;
+
+	lock(&onnotelock);
+	for(i=0; i<onnotesize; i++)
+		if(onnote[i] && onnote[i]->pid==pid){
+			free(onnote[i]);
+			onnote[i] = nil;
 			break;
 		}
 	unlock(&onnotelock);
-	return i<NFN;
+	return;
 }
 
 static void
 delayednotes(Proc *p, void *v)
 {
-	int i;
+	int i, j, all;
 	Note *n;
-	char s[ERRMAX];
-	int (*fn)(void*, char*);
+	int (*f)(void*, char*);
 
 	if(!p->pending)
 		return;
 
 	p->pending = 0;
+	all = j = 0;
 	for(n=notes; n<enotes; n++){
 		if(n->proc == p){
-			strcpy(s, n->s);
-			n->proc = nil;
-			unlock(&n->inuse);
-
-			for(i=0; i<NFN; i++){
-				if(onnotepid[i]!=p->pid || (fn = onnote[i])==nil)
-					continue;
-				if((*fn)(v, s))
-					break;
+			for(i=0; i<NFN; i++)
+				if(f=onnoteall[i])
+					if((*f)(v, n->s)){
+						all = 1;
+						break;
+					}
+			if(!all){
+				for(i=0; i<onnotesize; i++)
+					if(onnote[i] && onnote[i]->pid==p->pid){
+						for(j=0; j<NFN; j++)
+							if(f=onnote[i]->fn[j])
+								if((*f)(v, n->s))
+									break;
+						break;
+					}
 			}
-			if(i==NFN){
-				_threaddebug(DBGNOTE, "Unhandled note %s, proc %p", n->s, p);
+			if(!all && (i==onnotesize || j==NFN)){
+				_threaddebug(DBGNOTE, "Unhandled note %s, proc %p\n", n->s, p);
 				if(v != nil)
 					noted(NDFLT);
 				else if(strncmp(n->s, "sys:", 4)==0)
@@ -79,6 +172,8 @@
 					abort();
 				threadexitsall(n->s);
 			}
+			n->proc = nil;
+			unlock(&n->inuse);
 		}
 	}
 }
@@ -94,7 +189,7 @@
 		noted(NDFLT);
 
 	if(_threadexitsallstatus){
-		_threaddebug(DBGNOTE, "Threadexitsallstatus = '%s'", _threadexitsallstatus);
+		_threaddebug(DBGNOTE, "Threadexitsallstatus = '%s'\n", _threadexitsallstatus);
 		_exits(_threadexitsallstatus);
 	}
 
--- /tmp/sched.c
+++ /sys/src/libthread/sched.c
@@ -157,6 +157,7 @@
 		t = runthread(p);
 		if(t == nil){
 			_threaddebug(DBGSCHED, "all threads gone; exiting");
+			threadcancelnotes(p->pid);
 			unlinkproc(p);
 			_schedexit(p);	/* frees proc */
 		}
--- /tmp/thread.h
+++ /sys/include/thread.h
@@ -97,6 +97,7 @@
 void	threadkillgrp(int);	/* kill threads in group */
 void	threadmain(int argc, char *argv[]);
 int	threadnotify(int (*f)(void*, char*), int in);
+void threadcancelnotes(int pid);
 int	threadid(void);
 int	threadpid(int);
 int	threadsetgrp(int);		/* set thread group, return old */
--- /tmp/threadimpl.h
+++ /sys/src/libthread/threadimpl.h
@@ -192,3 +192,15 @@
 #define	_threaddebug(flag, ...)	if((_threaddebuglevel&(flag))==0){}else _threadprint(__VA_ARGS__)
 
 #define ioproc_arg(io, type)	(va_arg((io)->arg, type))
+
+#define	PPCHUNK 100
+#define	NFN 33
+typedef struct Onnote Onnote;
+struct Onnote
+{
+	int pid;
+	int (*fn[NFN])(void*, char*);
+};
+extern Onnote **onnote;
+extern int onnotesize;
+void _threadnote(void*, char*);


* Re: [9fans] syscall silently kill processes
  2022-06-28 15:28                                                 ` adr
@ 2022-06-28 16:43                                                   ` ori
  2022-06-28 18:19                                                   ` adr
  2022-06-28 19:09                                                   ` andrey100100100
  2 siblings, 0 replies; 49+ messages in thread
From: ori @ 2022-06-28 16:43 UTC (permalink / raw)
  To: 9fans

Quoth adr <adr@SDF.ORG>:
> Andrey, if you want to use different note handlers per process (with a big
> number of processes) using libthread, this may be helpful.
> 
> The idea is this:
> 
> An array of handlers for all processes which can be changed by all processes.
> When a note is received by a process, this array takes priority.
> 
> An array of pointers to structures of the type

Take a look at privalloc; I suspect a number of libthread
data structures could benefit from being thread-local.



------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mbf51d6063d50b8c495966c35
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] syscall silently kill processes
  2022-06-28 15:28                                                 ` adr
  2022-06-28 16:43                                                   ` ori
@ 2022-06-28 18:19                                                   ` adr
  2022-06-28 18:28                                                     ` adr
  2022-06-28 19:09                                                   ` andrey100100100
  2 siblings, 1 reply; 49+ messages in thread
From: adr @ 2022-06-28 18:19 UTC (permalink / raw)
  To: 9fans

On Tue, 28 Jun 2022, adr wrote:
> 
> Andrey, if you want to use different note handlers per process (with a big
> number of processes) using libthread, this may be helpful.
>
> The idea is this:
>
> An array of handlers for all processes which can be changed by all processes.
> When a note is received by a process, this array takes priority.
>
> An array of pointers to structures of the type
>
> struct Onnote
> {
>       int pid;
>       int (*fn[NFN])(void*, char*);
> };
>
> initially of size PPCHUNK (I set it to 100; experiment with that).
> It can grow if necessary, but it does not shrink (I think that would
> be overkill).
>
> These structures are allocated the first time a process records a
> handler and freed when the process exits, or by calling
> threadcancelnotes(). Note that this function can free other processes'
> handlers; it may be better to add some restrictions.
>
> The use of "in" in threadnotify(int (*f)(void*, char*), int in) is:
>
> in > 0 : set the handler for the calling process.
> in == 0 : clear the handler for the calling process.
> in == -1 : clear the handler for all processes (except those that have
>           already registered it for themselves).
> in < -1 : set the handler for all processes.
>
> Nothing in /sys/src calls threadnotify with "in < 0", so nothing is
> broken.
>
> As you are using 9front and they serve their sources over
> 9p, here is a diff against their sources. I haven't compiled it on
> 9front, though. Note that if you want to compile the system with
> these changes, you have to remove the copy of note.c at
> /sys/src/cmd/execnet (it seems that note.c was added afterwards, as
> I thought).
>
> I haven't tested it much; this has been more of a time-killing
> pastime.
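
The four meanings of "in" quoted above can be modelled in portable C
roughly as follows. This is only a sketch under stated assumptions: the
pid is passed explicitly instead of coming from _threadgetproc(), the
tables are small and fixed in size, there is no locking, and the names
notifymodel, put, del, findslot, forall, perproc and onalarm are
hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { NFN = 4, NPROC = 8 };	/* small hypothetical sizes for the sketch */

typedef int (*Handler)(void*, char*);

typedef struct Slot Slot;
struct Slot
{
	int pid;		/* 0 means the slot is free */
	Handler fn[NFN];
};

static Handler forall[NFN];	/* handlers shared by every process */
static Slot perproc[NPROC];	/* handlers private to one process */

/* Store f in tab unless it is already there; 0 if tab is full. */
static int
put(Handler *tab, Handler f)
{
	int i;

	for(i = 0; i < NFN; i++)
		if(tab[i] == f)
			return 1;
	for(i = 0; i < NFN; i++)
		if(tab[i] == NULL){
			tab[i] = f;
			return 1;
		}
	return 0;
}

/* Clear f from tab; 0 if it was not registered. */
static int
del(Handler *tab, Handler f)
{
	int i;

	for(i = 0; i < NFN; i++)
		if(tab[i] == f){
			tab[i] = NULL;
			return 1;
		}
	return 0;
}

/* Find the slot owned by pid (pid 0 finds a free slot). */
static Slot*
findslot(int pid)
{
	int i;

	for(i = 0; i < NPROC; i++)
		if(perproc[i].pid == pid)
			return &perproc[i];
	return NULL;
}

/* Model of threadnotify(f, in) as called by process pid. */
int
notifymodel(int pid, Handler f, int in)
{
	Slot *s;

	assert(pid > 0 && f != NULL);
	if(in < -1)
		return put(forall, f);	/* set for all processes */
	if(in == -1)
		return del(forall, f);	/* clear for all processes */
	s = findslot(pid);
	if(in == 0)			/* clear for the calling process */
		return s != NULL && del(s->fn, f);
	if(s == NULL){			/* first handler of this process */
		s = findslot(0);
		if(s == NULL)
			return 0;
		s->pid = pid;
	}
	return put(s->fn, f);		/* set for the calling process */
}

/* Hypothetical handler, in the style of the program that started the thread. */
int
onalarm(void *v, char *msg)
{
	(void)v;
	return strstr(msg, "alarm") != NULL;
}
```

With this model, notifymodel(10, onalarm, 1) registers the handler for
process 10 only, and notifymodel(10, onalarm, -2) makes it visible to
every process until some process calls it with in == -1.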

This just avoids going through the arrays twice. For the current
value of NFN it doesn't make much of a difference; note that these
structures are locked. It was just hurting my eyes.
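
As a rough illustration of the single-pass idea, here is a sketch in
portable C: one loop both detects a duplicate and remembers the first
free slot. The names addhandler, Handler, onalarm and onhangup are
hypothetical, and the locking done in libthread is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { NFN = 33 };	/* same table size as in the patch */

typedef int (*Handler)(void*, char*);

/*
 * Register f in tab[0..NFN-1] in one pass.
 * Returns 1 if f is already present or was stored in the first
 * free slot, 0 if the table is full.
 */
int
addhandler(Handler *tab, Handler f)
{
	int i, n;

	assert(tab != NULL && f != NULL);
	n = -1;
	for(i = 0; i < NFN; i++){
		if(tab[i] == f)
			return 1;	/* already registered */
		if(tab[i] == NULL && n == -1)
			n = i;		/* remember only the first free slot */
	}
	if(n > -1)
		tab[n] = f;
	return n > -1;
}

/* Two hypothetical handlers to register. */
int
onalarm(void *v, char *msg)
{
	(void)v;
	return strstr(msg, "alarm") != NULL;
}

int
onhangup(void *v, char *msg)
{
	(void)v;
	return strstr(msg, "hangup") != NULL;
}
```

Without the n == -1 test the loop would keep overwriting n and end up
using the last free slot instead of the first one.
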
--- /tmp/main.c
+++ /sys/src/libthread/main.c
@@ -28,6 +28,10 @@
        _qlockinit(_threadrendezvous);
        _sysfatal = _threadsysfatal;
        __assert = _threadassert;
+       onnote = mallocz(PPCHUNK*sizeof(uintptr), 1);
+       if(!onnote)
+               sysfatal("Malloc of size %d failed: %r", PPCHUNK*sizeof(uintptr));
+       onnotesize = PPCHUNK;
        notify(_threadnote);
        if(mainstacksize == 0)
                mainstacksize = 8*1024;
--- /tmp/note.c
+++ /sys/src/libthread/note.c
@@ -5,7 +5,6 @@

  int   _threadnopasser;

-#define        NFN             33
  #define       ERRLEN  48
  typedef struct Note Note;
  struct Note
@@ -17,62 +16,161 @@

  static Note   notes[128];
  static Note   *enotes = notes+nelem(notes);
-static int             (*onnote[NFN])(void*, char*);
-static int             onnotepid[NFN];
+Onnote **onnote;
+int onnotesize;
+static int (*onnoteall[NFN])(void*, char*);
  static Lock   onnotelock;

  int
  threadnotify(int (*f)(void*, char*), int in)
  {
-       int i, topid;
-       int (*from)(void*, char*), (*to)(void*, char*);
+       int i, j, n;

-       if(in){
-               from = nil;
-               to = f;
-               topid = _threadgetproc()->pid;
-       }else{
-               from = f;
-               to = nil;
-               topid = 0;
-       }
        lock(&onnotelock);
-       for(i=0; i<NFN; i++)
-               if(onnote[i]==from){
-                       onnote[i] = to;
-                       onnotepid[i] = topid;
+
+       /* add note for all processes */
+       if(in < -1){
+               n = -1;
+               for(i=0; i<NFN; i++){
+                       if(onnoteall[i] == f){
+                               unlock(&onnotelock);
+                               return 1;
+                       }
+                       if(onnoteall[i]==nil && n==-1)
+                               n = i;
+               }
+               if(n > -1)
+                       onnoteall[n] = f;
+               unlock(&onnotelock);
+               return n>-1;
+       }
+
+       /* remove note for all processes */
+       if(in == -1){
+               for(i=0; i<NFN; i++)
+                       if(onnoteall[i] == f){
+                               onnoteall[i] = nil;
+                               break;
+                       }
+               unlock(&onnotelock);
+               return i<NFN;
+       }
+
+       /* remove note for current process */
+       if(!in){
+               for(i=0; i<onnotesize; i++){
+                       if(onnote[i]!=nil && onnote[i]->pid==_threadgetproc()->pid){
+                               for(j=0; j<NFN; j++){
+                                       if(onnote[i]->fn[j] == f){
+                                               onnote[i]->fn[j] = 0;
+                                               break;
+                                       }
+                               }
+                               unlock(&onnotelock);
+                               return j<NFN;
+                       }
+               }
+               unlock(&onnotelock);
+               return 0;
+       }
+
+       /* add note for current process */
+       for(i=0; i<onnotesize; i++)
+               if(onnote[i] && onnote[i]->pid==_threadgetproc()->pid)
+                       break;
+
+       /* process has already a slot */
+       if(i < onnotesize){
+               n = -1;
+               for(j=0; j<NFN; j++){
+                       if(onnote[i]->fn[j] == f){
+                               unlock(&onnotelock);
+                               return 1;
+                       }
+                       if(onnote[i]->fn[j] == nil)
+                               n = j;
+               }
+               if(n > -1)
+                       onnote[i]->fn[n] = f;
+               unlock(&onnotelock);
+               return n>-1;
+ 
+       }
+
+       for(i=0; i<onnotesize; i++)
+               if(!onnote[i])
+                       break;
+
+       /* there is no free slot */
+       if(i == onnotesize){
+               onnotesize += PPCHUNK;
+               onnote = realloc(onnote, onnotesize*sizeof(uintptr));
+               if(!onnote){
+                       unlock(&onnotelock);
+                       sysfatal("Malloc of size %d failed: %r", onnotesize*sizeof(uintptr));
+               }
+               memset(onnote+i+1, 0, (PPCHUNK-1)*sizeof(uintptr));
+       }
+
+       onnote[i]=mallocz(sizeof(Onnote), 1);
+       if(!onnote[i]){
+               unlock(&onnotelock);
+               sysfatal("Malloc of size %d failed: %r", sizeof(Onnote));
+       }
+       onnote[i]->pid = _threadgetproc()->pid;
+       onnote[i]->fn[0] = f;
+       unlock(&onnotelock);
+       return 1;
+}
+
+void
+threadcancelnotes(int pid)
+{
+       int i;
+
+       lock(&onnotelock);
+       for(i=0; i<onnotesize; i++)
+               if(onnote[i] && onnote[i]->pid==pid){
+                       free(onnote[i]);
+                       onnote[i] = nil;
                        break;
                }
        unlock(&onnotelock);
-       return i<NFN;
+       return;
  }

  static void
  delayednotes(Proc *p, void *v)
  {
-       int i;
+       int i, j, all;
        Note *n;
-       char s[ERRMAX];
-       int (*fn)(void*, char*);
+       int (*f)(void*, char*);

        if(!p->pending)
                return;

        p->pending = 0;
+       all = j = 0;
        for(n=notes; n<enotes; n++){
                if(n->proc == p){
-                       strcpy(s, n->s);
-                       n->proc = nil;
-                       unlock(&n->inuse);
-
-                       for(i=0; i<NFN; i++){
-                               if(onnotepid[i]!=p->pid || (fn = onnote[i])==nil)
-                                       continue;
-                               if((*fn)(v, s))
-                                       break;
+                       for(i=0; i<NFN; i++)
+                               if(f=onnoteall[i])
+                                       if((*f)(v, n->s)){
+                                               all = 1;
+                                               break;
+                                       }
+                       if(!all){
+                               for(i=0; i<onnotesize; i++)
+                                       if(onnote[i] && onnote[i]->pid==p->pid){
+                                               for(j=0; j<NFN; j++)
+                                                       if(f=onnote[i]->fn[j])
+                                                               if((*f)(v, n->s))
+                                                                       break;
+                                               break;
+                                       }
                        }
-                       if(i==NFN){
-                               _threaddebug(DBGNOTE, "Unhandled note %s, proc %p", n->s, p);
+                       if(!all && (i==onnotesize || j==NFN)){
+                               _threaddebug(DBGNOTE, "Unhandled note %s, proc %p\n", n->s, p);
                                if(v != nil)
                                        noted(NDFLT);
                                else if(strncmp(n->s, "sys:", 4)==0)
@@ -79,6 +177,8 @@
                                        abort();
                                threadexitsall(n->s);
                        }
+                       n->proc = nil;
+                       unlock(&n->inuse);
                }
        }
  }
@@ -94,7 +194,7 @@
                noted(NDFLT);

        if(_threadexitsallstatus){
-               _threaddebug(DBGNOTE, "Threadexitsallstatus = '%s'", _threadexitsallstatus);
+               _threaddebug(DBGNOTE, "Threadexitsallstatus = '%s'\n", _threadexitsallstatus);
                _exits(_threadexitsallstatus);
        }

--- /tmp/sched.c
+++ /sys/src/libthread/sched.c
@@ -157,6 +157,7 @@
                t = runthread(p);
                if(t == nil){
                        _threaddebug(DBGSCHED, "all threads gone; exiting");
+                       threadcancelnotes(p->pid);
                        unlinkproc(p);
                        _schedexit(p);  /* frees proc */
                }
--- /tmp/thread.h
+++ /sys/include/thread.h
@@ -97,6 +97,7 @@
  void  threadkillgrp(int);     /* kill threads in group */
  void  threadmain(int argc, char *argv[]);
  int   threadnotify(int (*f)(void*, char*), int in);
+void threadcancelnotes(int pid);
  int   threadid(void);
  int   threadpid(int);
  int   threadsetgrp(int);              /* set thread group, return old */
--- /tmp/threadimpl.h
+++ /sys/src/libthread/threadimpl.h
@@ -192,3 +192,15 @@
  #define       _threaddebug(flag, ...) if((_threaddebuglevel&(flag))==0){}else _threadprint(__VA_ARGS__)

  #define ioproc_arg(io, type)  (va_arg((io)->arg, type))
+
+#define        PPCHUNK 100
+#define        NFN 33
+typedef struct Onnote Onnote;
+struct Onnote
+{
+       int pid;
+       int (*fn[NFN])(void*, char*);
+};
+extern Onnote **onnote;
+extern int onnotesize;
+void _threadnote(void*, char*);
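
The table growth step in the patch (extend the pointer array by PPCHUNK
entries and clear the new part) can be sketched in portable C like this.
It is a minimal sketch, not the libthread code: growtable is a
hypothetical helper, Onnote is left opaque, and the sketch assumes, as
mallocz-style zeroing does, that an all-zero pointer is a nil pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { PPCHUNK = 100 };	/* growth step, as in the patch */

typedef struct Onnote Onnote;	/* opaque here; only pointers are stored */

/*
 * Grow *tab from *size to *size+PPCHUNK pointers and clear the new tail.
 * Returns 0 on success, -1 if realloc fails (the old table is kept).
 */
int
growtable(Onnote ***tab, int *size)
{
	Onnote **t;
	int old;

	assert(tab != NULL && size != NULL && *size >= 0);
	old = *size;
	t = realloc(*tab, (old + PPCHUNK) * sizeof *t);
	if(t == NULL)
		return -1;
	/* memset counts bytes, so scale the element count by the pointer size */
	memset(t + old, 0, PPCHUNK * sizeof *t);
	*tab = t;
	*size = old + PPCHUNK;
	return 0;
}
```

The point to watch is the third argument of memset: realloc does not
clear the added memory, and a count given in elements rather than bytes
would leave most of the new entries uninitialized.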

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mfa798aacff90ff14ee10e8cf
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] syscall silently kill processes
  2022-06-28 18:19                                                   ` adr
@ 2022-06-28 18:28                                                     ` adr
  0 siblings, 0 replies; 49+ messages in thread
From: adr @ 2022-06-28 18:28 UTC (permalink / raw)
  To: 9fans

On Tue, 28 Jun 2022, adr wrote:
> This just avoids going through the arrays twice. For the current
> value of NFN it doesn't make much of a difference; note that these
> structures are locked. It was just hurting my eyes.

Sorry for the noise, bad patch.
--- /tmp/main.c
+++ /sys/src/libthread/main.c
@@ -28,6 +28,10 @@
        _qlockinit(_threadrendezvous);
        _sysfatal = _threadsysfatal;
        __assert = _threadassert;
+       onnote = mallocz(PPCHUNK*sizeof(uintptr), 1);
+       if(!onnote)
+               sysfatal("Malloc of size %d failed: %r", PPCHUNK*sizeof(uintptr));
+       onnotesize = PPCHUNK;
        notify(_threadnote);
        if(mainstacksize == 0)
                mainstacksize = 8*1024;
--- /tmp/note.c
+++ /sys/src/libthread/note.c
@@ -5,7 +5,6 @@

  int   _threadnopasser;

-#define        NFN             33
  #define       ERRLEN  48
  typedef struct Note Note;
  struct Note
@@ -17,62 +16,161 @@

  static Note   notes[128];
  static Note   *enotes = notes+nelem(notes);
-static int             (*onnote[NFN])(void*, char*);
-static int             onnotepid[NFN];
+Onnote **onnote;
+int onnotesize;
+static int (*onnoteall[NFN])(void*, char*);
  static Lock   onnotelock;

  int
  threadnotify(int (*f)(void*, char*), int in)
  {
-       int i, topid;
-       int (*from)(void*, char*), (*to)(void*, char*);
+       int i, j, n;

-       if(in){
-               from = nil;
-               to = f;
-               topid = _threadgetproc()->pid;
-       }else{
-               from = f;
-               to = nil;
-               topid = 0;
-       }
        lock(&onnotelock);
-       for(i=0; i<NFN; i++)
-               if(onnote[i]==from){
-                       onnote[i] = to;
-                       onnotepid[i] = topid;
+
+       /* add note for all processes */
+       if(in < -1){
+               n = -1;
+               for(i=0; i<NFN; i++){
+                       if(onnoteall[i] == f){
+                               unlock(&onnotelock);
+                               return 1;
+                       }
+                       if(onnoteall[i]==nil && n==-1)
+                               n = i;
+               }
+               if(n > -1)
+                       onnoteall[n] = f;
+               unlock(&onnotelock);
+               return n>-1;
+       }
+
+       /* remove note for all processes */
+       if(in == -1){
+               for(i=0; i<NFN; i++)
+                       if(onnoteall[i] == f){
+                               onnoteall[i] = nil;
+                               break;
+                       }
+               unlock(&onnotelock);
+               return i<NFN;
+       }
+
+       /* remove note for current process */
+       if(!in){
+               for(i=0; i<onnotesize; i++){
+                       if(onnote[i]!=nil && onnote[i]->pid==_threadgetproc()->pid){
+                               for(j=0; j<NFN; j++){
+                                       if(onnote[i]->fn[j] == f){
+                                               onnote[i]->fn[j] = 0;
+                                               break;
+                                       }
+                               }
+                               unlock(&onnotelock);
+                               return j<NFN;
+                       }
+               }
+               unlock(&onnotelock);
+               return 0;
+       }
+
+       /* add note for current process */
+       for(i=0; i<onnotesize; i++)
+               if(onnote[i] && onnote[i]->pid==_threadgetproc()->pid)
+                       break;
+
+       /* process has already a slot */
+       if(i < onnotesize){
+               n = -1;
+               for(j=0; j<NFN; j++){
+                       if(onnote[i]->fn[j] == f){
+                               unlock(&onnotelock);
+                               return 1;
+                       }
+                       if(onnote[i]->fn[j]==nil && n==-1)
+                               n = j;
+               }
+               if(n > -1)
+                       onnote[i]->fn[n] = f;
+               unlock(&onnotelock);
+               return n>-1;
+ 
+       }
+
+       for(i=0; i<onnotesize; i++)
+               if(!onnote[i])
+                       break;
+
+       /* there is no free slot */
+       if(i == onnotesize){
+               onnotesize += PPCHUNK;
+               onnote = realloc(onnote, onnotesize*sizeof(uintptr));
+               if(!onnote){
+                       unlock(&onnotelock);
+                       sysfatal("Malloc of size %d failed: %r", onnotesize*sizeof(uintptr));
+               }
+               memset(onnote+i+1, 0, (PPCHUNK-1)*sizeof(uintptr));
+       }
+
+       onnote[i]=mallocz(sizeof(Onnote), 1);
+       if(!onnote[i]){
+               unlock(&onnotelock);
+               sysfatal("Malloc of size %d failed: %r", sizeof(Onnote));
+       }
+       onnote[i]->pid = _threadgetproc()->pid;
+       onnote[i]->fn[0] = f;
+       unlock(&onnotelock);
+       return 1;
+}
+
+void
+threadcancelnotes(int pid)
+{
+       int i;
+
+       lock(&onnotelock);
+       for(i=0; i<onnotesize; i++)
+               if(onnote[i] && onnote[i]->pid==pid){
+                       free(onnote[i]);
+                       onnote[i] = nil;
                        break;
                }
        unlock(&onnotelock);
-       return i<NFN;
+       return;
  }

  static void
  delayednotes(Proc *p, void *v)
  {
-       int i;
+       int i, j, all;
        Note *n;
-       char s[ERRMAX];
-       int (*fn)(void*, char*);
+       int (*f)(void*, char*);

        if(!p->pending)
                return;

        p->pending = 0;
+       all = j = 0;
        for(n=notes; n<enotes; n++){
                if(n->proc == p){
-                       strcpy(s, n->s);
-                       n->proc = nil;
-                       unlock(&n->inuse);
-
-                       for(i=0; i<NFN; i++){
-                               if(onnotepid[i]!=p->pid || (fn = onnote[i])==nil)
-                                       continue;
-                               if((*fn)(v, s))
-                                       break;
+                       for(i=0; i<NFN; i++)
+                               if(f=onnoteall[i])
+                                       if((*f)(v, n->s)){
+                                               all = 1;
+                                               break;
+                                       }
+                       if(!all){
+                               for(i=0; i<onnotesize; i++)
+                                       if(onnote[i] && onnote[i]->pid==p->pid){
+                                               for(j=0; j<NFN; j++)
+                                                       if(f=onnote[i]->fn[j])
+                                                               if((*f)(v, n->s))
+                                                                       break;
+                                               break;
+                                       }
                        }
-                       if(i==NFN){
-                               _threaddebug(DBGNOTE, "Unhandled note %s, proc %p", n->s, p);
+                       if(!all && (i==onnotesize || j==NFN)){
+                               _threaddebug(DBGNOTE, "Unhandled note %s, proc %p\n", n->s, p);
                                if(v != nil)
                                        noted(NDFLT);
                                else if(strncmp(n->s, "sys:", 4)==0)
@@ -79,6 +177,8 @@
                                        abort();
                                threadexitsall(n->s);
                        }
+                       n->proc = nil;
+                       unlock(&n->inuse);
                }
        }
  }
@@ -94,7 +194,7 @@
                noted(NDFLT);

        if(_threadexitsallstatus){
-               _threaddebug(DBGNOTE, "Threadexitsallstatus = '%s'", _threadexitsallstatus);
+               _threaddebug(DBGNOTE, "Threadexitsallstatus = '%s'\n", _threadexitsallstatus);
                _exits(_threadexitsallstatus);
        }

--- /tmp/sched.c
+++ /sys/src/libthread/sched.c
@@ -157,6 +157,7 @@
                t = runthread(p);
                if(t == nil){
                        _threaddebug(DBGSCHED, "all threads gone; exiting");
+                       threadcancelnotes(p->pid);
                        unlinkproc(p);
                        _schedexit(p);  /* frees proc */
                }
--- /tmp/thread.h
+++ /sys/include/thread.h
@@ -97,6 +97,7 @@
  void  threadkillgrp(int);     /* kill threads in group */
  void  threadmain(int argc, char *argv[]);
  int   threadnotify(int (*f)(void*, char*), int in);
+void threadcancelnotes(int pid);
  int   threadid(void);
  int   threadpid(int);
  int   threadsetgrp(int);              /* set thread group, return old */
--- /tmp/threadimpl.h
+++ /sys/src/libthread/threadimpl.h
@@ -192,3 +192,15 @@
  #define       _threaddebug(flag, ...) if((_threaddebuglevel&(flag))==0){}else _threadprint(__VA_ARGS__)

  #define ioproc_arg(io, type)  (va_arg((io)->arg, type))
+
+#define        PPCHUNK 100
+#define        NFN 33
+typedef struct Onnote Onnote;
+struct Onnote
+{
+       int pid;
+       int (*fn[NFN])(void*, char*);
+};
+extern Onnote **onnote;
+extern int onnotesize;
+void _threadnote(void*, char*);
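
The order in which delayednotes offers a note to the handlers (the
shared array first, then the handlers of the receiving process) can be
sketched in portable C as below. The names dispatch, tryhandlers,
onalarm and onhangup are hypothetical, locking is omitted, and the
sketch computes a fresh result for every note rather than keeping a
flag across notes, which is a choice of the sketch and not a statement
about the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { NFN = 33 };	/* same table size as in the patch */

typedef int (*Handler)(void*, char*);

/* Return 1 as soon as one handler in tab accepts the note, else 0. */
static int
tryhandlers(Handler *tab, void *v, char *note)
{
	int i;

	for(i = 0; i < NFN; i++)
		if(tab[i] != NULL && tab[i](v, note))
			return 1;
	return 0;
}

/*
 * Offer the note to the shared handlers first, then to the handlers
 * of the receiving process (mine may be NULL if it registered none).
 * Returns 1 if some handler accepted it, 0 if it is unhandled.
 */
int
dispatch(Handler *forall, Handler *mine, void *v, char *note)
{
	assert(forall != NULL && note != NULL);
	if(tryhandlers(forall, v, note))
		return 1;
	return mine != NULL && tryhandlers(mine, v, note);
}

/* Two hypothetical handlers for trying it out. */
int
onalarm(void *v, char *msg)
{
	(void)v;
	return strstr(msg, "alarm") != NULL;
}

int
onhangup(void *v, char *msg)
{
	(void)v;
	return strstr(msg, "hangup") != NULL;
}
```

A return value of 0 corresponds to the "Unhandled note" branch of the
patch, where the library falls back to noted(NDFLT), abort() or
threadexitsall().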

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mc1bc6e87aa53de84e2f20729
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription


* Re: [9fans] syscall silently kill processes
  2022-06-28 15:28                                                 ` adr
  2022-06-28 16:43                                                   ` ori
  2022-06-28 18:19                                                   ` adr
@ 2022-06-28 19:09                                                   ` andrey100100100
  2022-06-28 19:42                                                     ` adr
  2 siblings, 1 reply; 49+ messages in thread
From: andrey100100100 @ 2022-06-28 19:09 UTC (permalink / raw)
  To: 9fans

On Tue, 28/06/2022 at 15:28 +0000, adr wrote:
> Andrey, if you want to use different note handlers per process (with a big
> number of processes) using libthread, this may be helpful.
> 
> The idea is this:
> 
> An array of handlers for all processes which can be changed by all processes.
> When a note is received by a process, this array takes priority.
> 
> An array of pointers to structures of the type
> 
> struct Onnote
> {
>         int pid;
>         int (*fn[NFN])(void*, char*);
> };
> 
> initially of size PPCHUNK (I set it to 100; experiment with that).
> It can grow if necessary, but it does not shrink (I think that would
> be overkill).
> 
> These structures are allocated the first time a process records a
> handler and freed when the process exits, or by calling
> threadcancelnotes(). Note that this function can free other processes'
> handlers; it may be better to add some restrictions.
> 
> The use of "in" in threadnotify(int (*f)(void*, char*), int in) is:
> 
> in > 0 : set the handler for the calling process.
> in == 0 : clear the handler for the calling process.
> in == -1 : clear the handler for all processes (except those that have
>             already registered it for themselves).
> in < -1 : set the handler for all processes.
> 
> Nothing in /sys/src calls threadnotify with "in < 0", so nothing
> is broken.
> 
> As you are using 9front and they serve their sources over
> 9p, here is a diff against their sources. I haven't compiled it on
> 9front, though. Note that if you want to compile the system with
> these changes, you have to remove the copy of note.c at
> /sys/src/cmd/execnet (it seems that note.c was added afterwards, as
> I thought).
> 
> I haven't tested it much; this has been more of a time-killing
> pastime.
> 
> adr
> --- /tmp/main.c
> +++ /sys/src/libthread/main.c
> @@ -28,6 +28,10 @@
>         _qlockinit(_threadrendezvous);
>         _sysfatal = _threadsysfatal;
>         __assert = _threadassert;
> +       onnote = mallocz(PPCHUNK*sizeof(uintptr), 1);
> +       if(!onnote)
> +               sysfatal("Malloc of size %d failed: %r", PPCHUNK*sizeof(uintptr));
> +       onnotesize = PPCHUNK;
>         notify(_threadnote);
>         if(mainstacksize == 0)
>                 mainstacksize = 8*1024;
> --- /tmp/note.c
> +++ /sys/src/libthread/note.c
> @@ -5,9 +5,9 @@
> 
>   int   _threadnopasser;
> 
> -#define        NFN             33
>   #define       ERRLEN  48
>   typedef struct Note Note;
> +
>   struct Note
>   {
>         Lock            inuse;
> @@ -17,62 +17,155 @@
> 
>   static Note   notes[128];
>   static Note   *enotes = notes+nelem(notes);
> -static int             (*onnote[NFN])(void*, char*);
> -static int             onnotepid[NFN];
> +Onnote **onnote;
> +int onnotesize;
> +static int (*onnoteall[NFN])(void*, char*);
>   static Lock   onnotelock;
> 
>   int
>   threadnotify(int (*f)(void*, char*), int in)
>   {
> -       int i, topid;
> -       int (*from)(void*, char*), (*to)(void*, char*);
> +       int i, j;
> 
> -       if(in){
> -               from = nil;
> -               to = f;
> -               topid = _threadgetproc()->pid;
> -       }else{
> -               from = f;
> -               to = nil;
> -               topid = 0;
> -       }
>         lock(&onnotelock);
> -       for(i=0; i<NFN; i++)
> -               if(onnote[i]==from){
> -                       onnote[i] = to;
> -                       onnotepid[i] = topid;
> +
> +       /* add note for all processes */
> +       if(in < -1){
> +               for(i=0; i<NFN; i++)
> +                       if(onnoteall[i] == f){
> +                               unlock(&onnotelock);
> +                               return 1;
> +                       }
> +               for(i=0; i<NFN; i++)
> +                       if(onnoteall[i] == nil){
> +                               onnoteall[i] = f;
> +                               break;
> +                       }
> +               unlock(&onnotelock);
> +               return i<NFN;
> +       }
> +
> +       /* remove note for all processes */
> +       if(in == -1){
> +               for(i=0; i<NFN; i++)
> +                       if(onnoteall[i] == f){
> +                               onnoteall[i] = nil;
> +                               break;
> +                       }
> +               unlock(&onnotelock);
> +               return i<NFN;
> +       }
> +
> +       /* remove note for current process */
> +       if(!in){
> +               for(i=0; i<onnotesize; i++){
> +                       if(onnote[i]!=nil && onnote[i]->pid==_threadgetproc()->pid){
> +                               for(j=0; j<NFN; j++){
> +                                       if(onnote[i]->fn[j] == f){
> +                                               onnote[i]->fn[j] = 0;
> +                                               break;
> +                                       }
> +                               }
> +                               unlock(&onnotelock);
> +                               return j<NFN;
> +                       }
> +               }
> +               unlock(&onnotelock);
> +               return i<onnotesize;
> +       }
> +
> +       /* add note for current process */
> +       for(i=0; i<onnotesize; i++)
> +               if(onnote[i] && onnote[i]->pid==_threadgetproc()->pid)
> +                       break;
> +
> +       /* process has already a slot */
> +       if(i < onnotesize){
> +               for(j=0; j<NFN; j++){
> +                       if(onnote[i]->fn[j] == nil){
> +                               onnote[i]->fn[j] = f;
> +                               break;
> +                       }
> +               }
> +               unlock(&onnotelock);
> +               return j<NFN;
> + 
> +       }
> +
> +       for(i=0; i<onnotesize; i++)
> +               if(!onnote[i])
> +                       break;
> +
> +       /* there is no free slot */
> +       if(i == onnotesize){
> +               onnotesize += PPCHUNK;
> +               onnote = realloc(onnote, onnotesize*sizeof(uintptr));
> +               if(!onnote){
> +                       unlock(&onnotelock);
> +                       sysfatal("Malloc of size %d failed: %r", onnotesize*sizeof(uintptr));
> +               }
> +               memset(onnote+i+1, 0, (PPCHUNK-1)*sizeof(uintptr));
> +       }
> +
> +       onnote[i]=mallocz(sizeof(Onnote), 1);
> +       if(!onnote[i]){
> +               unlock(&onnotelock);
> +               sysfatal("Malloc of size %d failed: %r", sizeof(Onnote));
> +       }
> +       onnote[i]->pid = _threadgetproc()->pid;
> +       onnote[i]->fn[0] = f;
> +       unlock(&onnotelock);
> +       return 1;
> +}
> +
> +void
> +threadcancelnotes(int pid)
> +{
> +       int i;
> +
> +       lock(&onnotelock);
> +       for(i=0; i<onnotesize; i++)
> +               if(onnote[i] && onnote[i]->pid==pid){
> +                       free(onnote[i]);
> +                       onnote[i] = nil;
>                         break;
>                 }
>         unlock(&onnotelock);
> -       return i<NFN;
> +       return;
>   }
> 
>   static void
>   delayednotes(Proc *p, void *v)
>   {
> -       int i;
> +       int i, j, all;
>         Note *n;
> -       char s[ERRMAX];
> -       int (*fn)(void*, char*);
> +       int (*f)(void*, char*);
> 
>         if(!p->pending)
>                 return;
> 
>         p->pending = 0;
> +       all = j = 0;
>         for(n=notes; n<enotes; n++){
>                 if(n->proc == p){
> -                       strcpy(s, n->s);
> -                       n->proc = nil;
> -                       unlock(&n->inuse);
> -
> -                       for(i=0; i<NFN; i++){
> -                               if(onnotepid[i]!=p->pid || (fn = onnote[i])==nil)
> -                                       continue;
> -                               if((*fn)(v, s))
> -                                       break;
> +                       for(i=0; i<NFN; i++)
> +                               if(f=onnoteall[i])
> +                                       if((*f)(v, n->s)){
> +                                               all = 1;
> +                                               break;
> +                                       }
> +                       if(!all){
> +                               for(i=0; i<onnotesize; i++)
> +                                       if(onnote[i] && onnote[i]->pid==p->pid){
> +                                               for(j=0; j<NFN; j++)
> +                                                       if(f=onnote[i]->fn[j])
> +                                                               if((*f)(v, n->s))
> +                                                                       break;
> +                                               break;
> +                                       }
>                         }
> -                       if(i==NFN){
> -                               _threaddebug(DBGNOTE, "Unhandled note %s, proc %p", n->s, p);
> +                       if(!all && (i==onnotesize || j==NFN)){
> +                               _threaddebug(DBGNOTE, "Unhandled note %s, proc %p\n", n->s, p);
>                                 if(v != nil)
>                                         noted(NDFLT);
>                                 else if(strncmp(n->s, "sys:", 4)==0)
> @@ -79,6 +172,8 @@
>                                         abort();
>                                 threadexitsall(n->s);
>                         }
> +                       n->proc = nil;
> +                       unlock(&n->inuse);
>                 }
>         }
>   }
> @@ -94,7 +189,7 @@
>                 noted(NDFLT);
> 
>         if(_threadexitsallstatus){
> -               _threaddebug(DBGNOTE, "Threadexitsallstatus = '%s'", _threadexitsallstatus);
> +               _threaddebug(DBGNOTE, "Threadexitsallstatus = '%s'\n", _threadexitsallstatus);
>                 _exits(_threadexitsallstatus);
>         }
> 
> --- /tmp/sched.c
> +++ /sys/src/libthread/sched.c
> @@ -157,6 +157,7 @@
>                 t = runthread(p);
>                 if(t == nil){
>                         _threaddebug(DBGSCHED, "all threads gone; exiting");
> +                       threadcancelnotes(p->pid);
>                         unlinkproc(p);
>                         _schedexit(p);  /* frees proc */
>                 }
> --- /tmp/thread.h
> +++ /sys/include/thread.h
> @@ -97,6 +97,7 @@
>   void  threadkillgrp(int);     /* kill threads in group */
>   void  threadmain(int argc, char *argv[]);
>   int   threadnotify(int (*f)(void*, char*), int in);
> +void threadcancelnotes(int pid);
>   int   threadid(void);
>   int   threadpid(int);
>   int   threadsetgrp(int);              /* set thread group, return old */
> --- /tmp/threadimpl.h
> +++ /sys/src/libthread/threadimpl.h
> @@ -192,3 +192,15 @@
>   #define       _threaddebug(flag, ...) if((_threaddebuglevel&(flag))==0){}else _threadprint(__VA_ARGS__)
> 
>   #define ioproc_arg(io, type)  (va_arg((io)->arg, type))
> +
> +#define        PPCHUNK 100
> +#define        NFN 33
> +typedef struct Onnote Onnote;
> +struct Onnote
> +{
> +       int pid;
> +       int (*fn[NFN])(void*, char*);
> +};
> +extern Onnote **onnote;
> +extern int onnotesize;
> +void _threadnote(void*, char*);
> ------------------------------------------

Thanks for the patch.


Regards,
Andrej

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-Mb89a47f334b083f180f89f9a

* Re: [9fans] syscall silently kill processes
  2022-06-28 19:09                                                   ` andrey100100100
@ 2022-06-28 19:42                                                     ` adr
  0 siblings, 0 replies; 49+ messages in thread
From: adr @ 2022-06-28 19:42 UTC (permalink / raw)
  To: 9fans

On Tue, 28 Jun 2022, andrey100100100@gmail.com wrote:
> Thanks for the patch.

It's just something to play with; note that onnote should be passed
just once.  I'll post another patch if things work OK.

adr

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M348ddb20db1a1bf17ce2b189

Thread overview: 49+ messages
2022-06-17  9:37 [9fans] syscall silently kill processes andrey100100100
2022-06-17 13:46 ` Thaddeus Woskowiak
2022-06-17 14:11   ` Jacob Moody
2022-06-17 14:39     ` Thaddeus Woskowiak
2022-06-17 15:06     ` andrey100100100
2022-06-17 16:08       ` Skip Tavakkolian
2022-06-17 16:11         ` Skip Tavakkolian
2022-06-17 16:16           ` Skip Tavakkolian
2022-06-17 17:42             ` adr
2022-06-17 16:11       ` Jacob Moody
2022-06-17 18:48         ` andrey100100100
2022-06-17 19:28           ` Jacob Moody
2022-06-17 21:15           ` adr
2022-06-18  6:40             ` andrey100100100
2022-06-18  8:37               ` adr
2022-06-18  9:22                 ` adr
2022-06-18 12:53                   ` Jacob Moody
2022-06-18 22:03                     ` andrey100100100
2022-06-19  5:54                     ` adr
2022-06-19  6:13                       ` Jacob Moody
2022-06-18 22:22                   ` andrey100100100
2022-06-18 16:57                 ` andrey100100100
2022-06-19  2:40                   ` adr
2022-06-19  5:01                     ` adr
2022-06-19  8:52                       ` andrey100100100
2022-06-19 10:32                         ` adr
2022-06-19 11:40                           ` andrey100100100
2022-06-19 12:01                             ` andrey100100100
2022-06-19 15:10                           ` andrey100100100
2022-06-19 16:41                             ` adr
2022-06-19 21:22                               ` andrey100100100
2022-06-19 21:26                                 ` andrey100100100
2022-06-20  4:41                                 ` adr
2022-06-20  5:39                                   ` andrey100100100
2022-06-20  5:59                                   ` adr
2022-06-20 15:56                                     ` andrey100100100
2022-06-20 22:29                                       ` Skip Tavakkolian
2022-06-21  7:07                                         ` andrey100100100
2022-06-21 11:26                                           ` adr
2022-06-21 13:03                                             ` andrey100100100
2022-06-21 13:22                                               ` adr
2022-06-28 15:28                                                 ` adr
2022-06-28 16:43                                                   ` ori
2022-06-28 18:19                                                   ` adr
2022-06-28 18:28                                                     ` adr
2022-06-28 19:09                                                   ` andrey100100100
2022-06-28 19:42                                                     ` adr
2022-06-21 13:47                                             ` andrey100100100
2022-06-21  7:22                                         ` adr