9fans - fans of the OS Plan 9 from Bell Labs
From: andrey100100100@gmail.com
To: 9fans@9fans.net
Subject: Re: [9fans] syscall silently kill processes
Date: Sun, 19 Jun 2022 01:03:47 +0300
Message-ID: <a0bca5be62fcc8a129ed4fcf7528c94cf75146e3.camel@gmail.com>
In-Reply-To: <55d376e1-fddb-135c-c7e3-ffca9ed621d7@posixcafe.org>

On Sat, 18/06/2022 at 06:53 -0600, Jacob Moody wrote:
> On 6/18/22 03:22, adr wrote:
> > On Sat, 18 Jun 2022, adr wrote:
> > 
> > > On Sat, 18 Jun 2022, andrey100100100@gmail.com wrote:
> > > 
> > > > ---------------------------------------------
> > > > 
> > > > cpu% 6.out | grep end | wc -l
> > > >     33
> > > > 
> > > > 
> > > > Problem in unregistered handlers.
> > > 
> > > But unregistered handlers shouldn't be a problem. The process is
> > > being killed when alarm sends the note. That's why the code worked
> > > after removing the read statement: the alarm is cancelled and the
> > > note is not sent before the process ends. I just don't see why the
> > > process is being killed. The documentation describes different
> > > behavior. To me it smells like bug barbecue (corrupted onnote?).
> > > Maybe I got something wrong, bear with me.
> > > 
> > > > > Note that you could register the handler in threadmain and
> > > > > avoid this issue completely, but as I said before, something
> > > > > seems wrong to me here.
> > > > 
> > > > I don't understand how a handler in threadmain would solve the
> > > > problem.
> > > > I need 'alarm' on a per-process basis.
> > > 
> > > You need alarm() in every process, but you don't need to register
> > > the same handler 80 times!
> > > 
> > > adr.
> > 
> > I think there is some confusion here, so I'll explain myself a
> > little more.
> > 
> > Let's change your last example to not use libthread:
> > 
> > #include <u.h>
> > #include <libc.h>
> > 
> > int
> > handler_alarm(void *, char *msg)
> > {
> >          if(strstr(msg, "alarm")){
> >                  return 1;
> >          }
> > 
> >          return 0;
> > }
> > 
> > int
> > test(void)
> > {
> >          if(atnotify(handler_alarm, 1) == 0){
> >                  fprint(1, "handler not registered\n");
> >          }
> > 
> >          alarm(10);
> >          fprint(1, "start\n");
> >          sleep(40);
> >          fprint(1, "end\n");
> >          alarm(0);
> > 
> >          return 0;
> > }
> > 
> > void
> > main()
> > {
> >          for(int i = 0; i < 80; i++){
> >                  test();
> >          }
> > 
> >          exits(nil);
> > }
> > 
> > You see, after the NFNth iteration of test(), onnot[NFN] in atnotify
> > will be full; the handlers won't be registered, but the code will
> > work without any problem. It doesn't matter: the first handler in
> > onnot[] will be executed. In fact you only need one handler there,
> > not 80; you should move atnotify to main.
> > 
> > The same should be happening with libthread. Am I really the only
> > one smelling a bug here?
> 
> No, you've got me convinced something much more wrong is going on.
> Because you're right, our read children shouldn't just be gone,
> we should return from read with an error and then print the "end"
> line.
> I've attempted to reproduce it, trying to remove the libthread/notify
> factors. I've come up with this:
> 
> #include <u.h>
> #include <libc.h>
> 
> static void
> proc_udp(void*)
> {
>         char resp[512];
>         char req[] = "request";
>         int fd;
>         int n;
>         int pid;
> 
>         fd = dial("udp!185.157.221.201!5678", nil, nil, nil);
>         if(fd < 0)
>                 exits("can't dial");
> 
>         if(write(fd, req, strlen(req)) != strlen(req))
>                 exits("can't write");
> 
>         pid = getpid();
>         fprint(1, "start %d\n", pid);
>         n = read(fd, resp, sizeof(resp)-1);
>         fprint(1, "end %d %d\n", pid, n);
>         exits(nil);
> }
> 
> void
> main(int, char**)
> {
>         int i;
>         Waitmsg *wm;
> 
>         for(i = 0; i < 10; i++){
>                 switch(fork()){
>                 case -1:
>                         sysfatal("fork %r");
>                 case 0:
>                         proc_udp(nil);
>                         sysfatal("ret");
>                 default:
>                         break;
>                 }
>         }
>         for(i = 0; i < 10; i++){
>                 wm = wait();
>                 print("proc %d died with message %s\n", wm->pid, wm->msg);
>         }
>         exits(nil);
> }
> 
> This code makes it pretty obvious that we are losing some children;
> on my machine this program never exits. I see some portion of the
> readers correctly returning -1, and the parent is able to get their
> Waitmsg but not all of them.
> 

cpu% 6.out
start 20383
start 20390
start 20385
start 20389
start 20387
start 20384
start 20388
start 20381
start 20382
start 20386
end 20390 -1
end 20386 -1
end 20382 -1
end 20381 -1
end 20387 -1
end 20384 -1
proc 20390 died with message 
proc 20384 died with message 
proc 20387 died with message 
proc 20381 died with message 
proc 20382 died with message 
proc 20386 died with message 

The 'lost' processes are stalled in the read syscall:

glenda        20380    0:00   0:00       52K Await    6.out
glenda        20383    0:00   0:00       48K Pread    6.out
glenda        20385    0:00   0:00       48K Pread    6.out
glenda        20388    0:00   0:00       48K Pread    6.out
glenda        20389    0:00   0:00       48K Pread    6.out


Regards,
Andrej

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M4109aa26c6245de508c32baa
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
