The Unix Heritage Society mailing list
From: bakul@bitblocks.com (Bakul Shah)
Subject: [TUHS] signals and blocked in I/O
Date: Fri, 1 Dec 2017 13:33:46 -0800
Message-ID: <BADECBD9-D2B4-4788-8308-87EBF93D555A@bitblocks.com>
In-Reply-To: <20171201172603.GO3924@mcvoy.com>

On Dec 1, 2017, at 9:26 AM, Larry McVoy <lm at mcvoy.com> wrote:
> 
> On Fri, Dec 01, 2017 at 09:33:49AM -0700, Warner Losh wrote:
>>> Or what you do is kill the process, tear down all of the pages except those
>>> that are locked for I/O, leave those in the process and wait for the I/O to
>>> get done.  That might be simpler.
>> 
>> Perhaps. Even that may have issues with cleanup because you may also need
>> other pages to complete the I/O processing since I think that things like
>> aio allocate a tiny bit of memory associated with the requesting process
>> and need that memory to finish the I/O. It's certainly not a requirement
>> that all I/O initiated by userland have no extra state stored in the
>> process' address space associated with it. Since the unwinding happens at a
>> layer higher than the disk driver, who knows what those guys do, eh?
> 
> Yeah, it's not an easy fix but the problem we are having right now is that
> the system is thrashing.  Why the OOM code isn't fixing it I don't know.
> It just feels busted.

So the OOM code kills a (random) process in hopes of freeing up
some pages, but if this process is stuck in disk I/O, nothing
can be freed and everything grinds to a halt. Is this right?

If so, one workaround is to kill a process that is *not*
sleeping at an uninterruptible priority :-)/2 This is
separate from any policy of how to choose a victim.
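
To make the filter concrete, here is a rough user-space sketch,
assuming a Linux-style /proc where the field after the parenthesized
command name in /proc/<pid>/stat is the one-letter process state and
"D" means uninterruptible sleep. The function names are made up for
illustration; this is not the kernel's OOM code, only the selection
step that would run before any victim policy.

```python
import os


def proc_state(stat_line):
    """Return the one-letter state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')' characters, so take the field after the LAST ')'.
    """
    rest = stat_line[stat_line.rindex(")") + 1:]
    return rest.split()[0]


def killable(stat_lines):
    """Given {pid: stat_line}, return sorted pids NOT in state 'D'."""
    return sorted(
        pid for pid, line in stat_lines.items() if proc_state(line) != "D"
    )


def read_stat_lines(proc_root="/proc"):
    """Collect stat lines for every numeric entry under proc_root.

    Assumption: a Linux procfs is mounted at proc_root.
    """
    lines = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                lines[int(name)] = f.read()
        except OSError:
            # The process exited between listdir() and open(); skip it.
            continue
    return lines
```

On a Linux box, killable(read_stat_lines()) gives the candidate
list. The check is inherently racy: a process can enter disk wait
right after it was sampled, so this only narrows the odds.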

If the queue of dirty pages is growing longer and longer
because the page washer can't keep up, this is analogous to
the bufferbloat problem in networking. You have to test
if this is what is going on. If so, maybe you can figure
out how to keep the queues short.
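
One cheap way to test the growing-queue hypothesis from user space,
assuming a Linux /proc/meminfo with "Dirty:" and "Writeback:" lines
reported in kB, is to sample the backlog periodically and see whether
it only ever goes up. This is a sketch with hypothetical helper names
and an arbitrary "strictly increasing" criterion, not a proper
diagnosis.

```python
def meminfo_kb(text, key):
    """Return the kB value for `key` from /proc/meminfo-style text."""
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        # Exact match on the name, so "Writeback" does not also
        # match "WritebackTmp".
        if name.strip() == key:
            return int(rest.split()[0])
    raise KeyError(key)


def backlog_kb(text):
    """Pages waiting to be written plus pages being written, in kB."""
    return meminfo_kb(text, "Dirty") + meminfo_kb(text, "Writeback")


def is_growing(samples, min_samples=3):
    """True if there are enough samples and each exceeds the previous one."""
    if len(samples) < min_samples:
        return False
    return all(b > a for a, b in zip(samples, samples[1:]))
```

Feed backlog_kb() the contents of /proc/meminfo once a second or so,
keep the last few values, and pass them to is_growing(). A backlog
that climbs for many consecutive samples while the system thrashes
would support the bufferbloat analogy; a flat one would point
elsewhere.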

But before any fixes, I would strongly suggest instrumenting
the code to understand what is going on, and then instrumenting
it further to test out various hypotheses. Once a clear
mental model is in place, the fix will be obvious!



