The Unix Heritage Society mailing list
* [TUHS] signals and blocked in I/O
@ 2017-12-01 15:44 Larry McVoy
  2017-12-01 15:53 ` Dan Cross
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Larry McVoy @ 2017-12-01 15:44 UTC (permalink / raw)


Does anyone remember the reason that processes blocked in I/O don't catch
signals?  When did that become a thing, was that part of the original
design or did that happen in BSD?

I'm asking because I'm banging on FreeBSD and I can wedge it hard, to
the point that it won't recover, by just beating on tons of memory.
-- 
---
Larry McVoy            	     lm at mcvoy.com             http://www.mcvoy.com/lm 



* [TUHS] signals and blocked in I/O
  2017-12-01 15:44 [TUHS] signals and blocked in I/O Larry McVoy
@ 2017-12-01 15:53 ` Dan Cross
  2017-12-01 16:11   ` Clem Cole
  2017-12-01 16:01 ` Dave Horsfall
  2017-12-01 16:24 ` Warner Losh
  2 siblings, 1 reply; 30+ messages in thread
From: Dan Cross @ 2017-12-01 15:53 UTC (permalink / raw)


On Fri, Dec 1, 2017 at 10:44 AM, Larry McVoy <lm at mcvoy.com> wrote:

> Does anyone remember the reason that processes blocked in I/O don't catch
> signals?  When did that become a thing, was that part of the original
> design or did that happen in BSD?
>
> I'm asking because I'm banging on FreeBSD and I can wedge it hard, to
> the point that it won't recover, by just beating on tons of memory.


My understanding was that signal delivery only happens when the process is
*running* in the kernel. If the process is sleeping on I/O then it's not
running, so the signal isn't delivered.

        - Dan C.

* [TUHS] signals and blocked in I/O
  2017-12-01 15:44 [TUHS] signals and blocked in I/O Larry McVoy
  2017-12-01 15:53 ` Dan Cross
@ 2017-12-01 16:01 ` Dave Horsfall
  2017-12-01 16:24 ` Warner Losh
  2 siblings, 0 replies; 30+ messages in thread
From: Dave Horsfall @ 2017-12-01 16:01 UTC (permalink / raw)


On Fri, 1 Dec 2017, Larry McVoy wrote:

> Does anyone remember the reason that processes blocked in I/O don't 
> catch signals?  When did that become a thing, was that part of the 
> original design or did that happen in BSD?

Something to do with pending DMA transfers?

> I'm asking because I'm banging on FreeBSD and I can wedge it hard, to 
> the point that it won't recover, by just beating on tons of memory.

That happens on my MacBook a lot :-)  It "only" has 4GB memory, the most 
it will ever take.  In the meantime, my FreeBSD server, pretty much 
running only Sendmail and Apache (and BIND), gets along just fine with 
just 512MB (yes, really; it does not run X or anything).

Sigh...  Sometimes I miss my old 11/40 with 124kw of memory and a handful 
of users, the biggest memory hog probably being NROFF (we didn't have VI, 
but used EM)...

Hey, anyone remember Editor for Mortals?  The author ought to be shot for 
naming something just one keystroke away from "rm file"...

-- 
Dave Horsfall DTM (VK2KFU)  "Those who don't understand security will suffer."



* [TUHS] signals and blocked in I/O
  2017-12-01 15:53 ` Dan Cross
@ 2017-12-01 16:11   ` Clem Cole
  2017-12-01 16:18     ` Larry McVoy
  0 siblings, 1 reply; 30+ messages in thread
From: Clem Cole @ 2017-12-01 16:11 UTC (permalink / raw)


Right... that's one of the reasons we created ASTs and tried to get them
into the POSIX spec as an alternative to signals.  Instead of trying to
'fix' signals, we decided to inject a new scheme with better semantics
(queued, prioritized, guaranteed to be delivered).  They were in *.4 for
a while and got replaced with sigqueue, which sort of helps but does not
solve the issues.

I'm always torn: do you add a new interface and risk bloat, or try to
extend the old?  We never did it in RTU, and talked about it for Stellix
(but never did it there either), but we toyed with making ASTs the
kernel primitive and then building signals from them in a user-space
library -- which, at the time, we thought could be done.  Although there
are some strange side effects of signals that you would really have to
think through.  They have been extended again with POSIX and SVR4, so
I'm not so sure now.

It's hard because if I/O (like DMA) is in progress, what are the proper
semantics to exit/roll back?  When do you stop the transfer, and what
happens then?  When doing I/O direct to/from disk (like in a real-time
system), this can get tricky.

Traditional UNIX semantics were that once a DMA was started, the process
was blocked at high priority until the I/O completed.  That is fine in
the standard case, but raises the question of what happens when things
go south.  Signals match an easy implementation in the V6 kernel.  BSD
did try to fix them a little, but we all know that caused as many issues
as it solved.
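
The whole convention fits in a few lines.  A minimal sketch (the
values follow the 4BSD numbering; V6 used negative priorities for
the uninterruptible sleeps, but the idea is identical): a sleep at
a priority below PZERO cannot be broken by a signal, a sleep above
it can.

    #define PZERO   25
    #define PRIBIO  20    /* disk I/O wait: < PZERO, uninterruptible */
    #define TTIPRI  28    /* tty input wait: > PZERO, signals break it */

    sleep((caddr_t)bp, PRIBIO);          /* held until the DMA completes */
    sleep((caddr_t)&tp->t_rawq, TTIPRI); /* a signal can wake us here */

A signal posted to the process in the PRIBIO sleep just sits
pending until the I/O wakes it up.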

Clem

On Fri, Dec 1, 2017 at 10:53 AM, Dan Cross <crossd at gmail.com> wrote:

> On Fri, Dec 1, 2017 at 10:44 AM, Larry McVoy <lm at mcvoy.com> wrote:
>
>> Does anyone remember the reason that processes blocked in I/O don't catch
>> signals?  When did that become a thing, was that part of the original
>> design or did that happen in BSD?
>>
>> I'm asking because I'm banging on FreeBSD and I can wedge it hard, to
>> the point that it won't recover, by just beating on tons of memory.
>
>
> My understanding was that signal delivery only happens when the process is
> *running* in the kernel. If the process is sleeping on IO then it's not
> running, so the signal isn't delivered.
>
>         - Dan C.
>
>

* [TUHS] signals and blocked in I/O
  2017-12-01 16:11   ` Clem Cole
@ 2017-12-01 16:18     ` Larry McVoy
  2017-12-01 16:33       ` Warner Losh
  0 siblings, 1 reply; 30+ messages in thread
From: Larry McVoy @ 2017-12-01 16:18 UTC (permalink / raw)


On Fri, Dec 01, 2017 at 11:11:56AM -0500, Clem Cole wrote:
> It's hard because if I/O (like DMA) is in progress, what are the proper
> semantics to exit/roll back?   When do you stop the transfer and what
> happens.   When doing direct to/from disk (like a in a real-time system),
> this can get tricky.

So at first blush, it seems like you need a barrier to starting more
DMAs.  You take the signal, and the signal changes state to tell the I/O
system "finish what you are doing, but then no more".

Or what you do is kill the process, tear down all of the pages except
those that are locked for I/O, leave those in the process, and wait for
the I/O to get done.  That might be simpler.



* [TUHS] signals and blocked in I/O
  2017-12-01 15:44 [TUHS] signals and blocked in I/O Larry McVoy
  2017-12-01 15:53 ` Dan Cross
  2017-12-01 16:01 ` Dave Horsfall
@ 2017-12-01 16:24 ` Warner Losh
  2 siblings, 0 replies; 30+ messages in thread
From: Warner Losh @ 2017-12-01 16:24 UTC (permalink / raw)


On Fri, Dec 1, 2017 at 8:44 AM, Larry McVoy <lm at mcvoy.com> wrote:

> Does anyone remember the reason that processes blocked in I/O don't catch
> signals?  When did that become a thing, was that part of the original
> design or did that happen in BSD?
>
> I'm asking because I'm banging on FreeBSD and I can wedge it hard, to
> the point that it won't recover, by just beating on tons of memory.
>

Whether the signal will work really depends on the I/O.  If we're
waiting for I/O to arrive at a character device, for example, you can
signal all day long (the TTY driver depends on this).

In old-school BSD, processes in disk wait state were blocked in the
filesystem layer (typically) waiting for an I/O to complete.  I don't
know which came first, but I know the code is quite dependent on the I/O
not returning half-baked.  There's no way to cancel the I/O once it's
started.  And the I/O can also be decoupled from the original process if
it's being done by one of the system threads, so you could be waiting on
an I/O, started by someone else, to complete so that a page becomes
valid.  Tracking back which process to signal in such circumstances is
tricky.  The filesystem code assumes the buffer cache actually caches
the page, so the pages are invalid while the I/O is in progress.
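
That disk wait, stripped down to a sketch (4.3BSD-ish names, with
details like splbio() elided), is just the buffer-cache sleep:

    /* biowait(): sleep until the driver marks the buffer done.
     * Note there is no PCATCH -- a signal cannot break us out. */
    while ((bp->b_flags & B_DONE) == 0)
            sleep((caddr_t)bp, PRIBIO);

    /* getblk(): if someone else owns the buffer, wait for it. */
    while (bp->b_flags & B_BUSY) {
            bp->b_flags |= B_WANTED;
            sleep((caddr_t)bp, PRIBIO + 1);
    }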

Plus, pages are wired for the I/O and generally marked as invalid, so
any access on them faults.  A process receiving a signal in that state
may need to exit, but it can't until those pages are unwired, so even
SIGKILL would be useless until the I/O completed.

But I think your issues aren't so much I/O as free pages. You need free
pages in order to make progress in running your process. W/o them, you bog
down badly. The root cause is poor page laundering behavior: the system
isn't able to clean enough pages to keep up with the demand. I'm not so
sure it's signals, per se...

Warner

* [TUHS] signals and blocked in I/O
  2017-12-01 16:18     ` Larry McVoy
@ 2017-12-01 16:33       ` Warner Losh
  2017-12-01 17:26         ` Larry McVoy
  0 siblings, 1 reply; 30+ messages in thread
From: Warner Losh @ 2017-12-01 16:33 UTC (permalink / raw)


On Fri, Dec 1, 2017 at 9:18 AM, Larry McVoy <lm at mcvoy.com> wrote:

> On Fri, Dec 01, 2017 at 11:11:56AM -0500, Clem Cole wrote:
> > It's hard because if I/O (like DMA) is in progress, what are the proper
> > semantics to exit/roll back?   When do you stop the transfer and what
> > happens.   When doing direct to/from disk (like a in a real-time system),
> > this can get tricky.
>
> So at first blush, what it seems like is you need a barrier to starting
> more
> DMA's.  You take the signal, the signal changes state to say to the I/O
> system "finish what you are doing but then no more".
>

The I/O subsystem is great at knowing disks, pages and buffers, and
lousy at knowing processes.  It has no hooks into, well, anything apart
from FOOstrategy (and, if you are lucky, first open and last close).
And FOOstrategy is far removed from anything that the process knows
about.  It knows about vm pages and open file descriptors.  You'd need
to plumb something into the vnode code that could take a request for a
signal.  Then that signal would need to go down to the driver somehow.
And you'd need to know what I/O was pending for pages that are in that
process, and somehow have an ID to cancel just those requests.  With the
layering involved, it would be extremely tricky.  Many of the I/O
completion routines cause more I/O to happen, especially when swapping.

Plus, modern disks have a hardware queue depth.  There is no way to say
'finish this and do no more' because there's stuff in the hardware
queue.  SCSI is sensible and you may be able to selectively cancel I/O
transactions, but ATA is not: you cancel the whole queue and cope with
the fallout (usually by rescheduling the I/Os).


> Or what you do is kill the process, tear down all of the pages except those
> that are locked for I/O, leave those in the process and wait for the I/O to
> get done.  That might be simpler.
>

Perhaps.  Even that may have issues with cleanup, because you may also
need other pages to complete the I/O processing: I think that things
like aio allocate a tiny bit of memory associated with the requesting
process and need that memory to finish the I/O.  There's certainly no
requirement that I/O initiated by userland keep no extra state in the
process's address space.  Since the unwinding happens at a layer higher
than the disk driver, who knows what those guys do, eh?

Warner

* [TUHS] signals and blocked in I/O
  2017-12-01 16:33       ` Warner Losh
@ 2017-12-01 17:26         ` Larry McVoy
  2017-12-01 19:10           ` Chris Torek
  2017-12-01 21:33           ` Bakul Shah
  0 siblings, 2 replies; 30+ messages in thread
From: Larry McVoy @ 2017-12-01 17:26 UTC (permalink / raw)


On Fri, Dec 01, 2017 at 09:33:49AM -0700, Warner Losh wrote:
> > Or what you do is kill the process, tear down all of the pages except those
> > that are locked for I/O, leave those in the process and wait for the I/O to
> > get done.  That might be simpler.
> 
> Perhaps. Even that may have issues with cleanup because you may also need
> other pages to complete the I/O processing since I think that things like
> aio allocate a tiny bit of memory associated with the requesting process
> and need that memory to finish the I/O. It's certainly not a requirement
> that all I/O initiated by userland have no extra state stored in the
> process' address space associated with it. Since the unwinding happens at a
> layer higher than the disk driver, who knows what those guys do, eh?

Yeah, it's not an easy fix but the problem we are having right now is that
the system is thrashing.  Why the OOM code isn't fixing it I don't know.
It just feels busted.



* [TUHS] signals and blocked in I/O
  2017-12-01 17:26         ` Larry McVoy
@ 2017-12-01 19:10           ` Chris Torek
  2017-12-01 23:21             ` Dave Horsfall
  2017-12-01 21:33           ` Bakul Shah
  1 sibling, 1 reply; 30+ messages in thread
From: Chris Torek @ 2017-12-01 19:10 UTC (permalink / raw)


FYI, the way signals actually work is that they are delivered
only on crossing the user/kernel boundary.  (That mostly
even includes those that kill a process entirely, except that
there are some optimizations for some cases.  If the CPU
is running in user space for some *other* process, we have to
arrange for the signalled process to be scheduled to run, or
if it's on another CPU, take an interrupt so as to get into the
kernel and hence cross that boundary.)

This part is natural, since signal delivery means "run a function
in user space, with user privileges". If you're currently *in*
the kernel on behalf of some user process, that means you have to
get *out* of it in that same process.

This is why signals interrupt system calls, making them return
with EINTR.  To keep system calls that haven't really started --
have not done anything yet -- from returning EINTR, the BSD and
POSIX SA_RESTART options work by just making the "resume" program
counter address point back to the "make system call" instruction.
Essentially:

    frame->pc -= instruction_size;

This can't be done if the system call has actually done something,
so read() or write() on a slow device like a tty will return a
short count (not -1 and EINTR).

The BSD implementation these days is to call the various *sleep
kernel functions (msleep, tsleep, etc.) with PCATCH "or"-ed into
the priority to indicate that it is OK to have a signal interrupt
the sleep.  If one does, the sleep call returns ERESTART (a special
error code that the syscall handler notices and uses to do the
"subtract from frame->pc" trick).  In the old days, PCATCH was
implied by the sleep priority; all we did was make it an explicit
flag, and do manual unwinding (the V6 kernel used the equivalent of
longjmp to get to the EINTR-returner so that the main line code
path did not have to check).  The I/O subsystem generally calls
*sleep without PCATCH, though.
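
Glued together, the pieces look roughly like this (a sketch, not
the literal sources: tsleep() and PCATCH are real, but frame->pc,
instruction_size and set_syscall_error() stand in for the
machine-dependent plumbing):

    /* Deep inside a syscall: an interruptible sleep. */
    error = tsleep(chan, PRIBIO | PCATCH, "biowt", 0);

    /* Much later, in the machine-dependent syscall return path: */
    if (error == ERESTART)
            frame->pc -= instruction_size;   /* redo the syscall insn */
    else if (error != 0)
            set_syscall_error(frame, error); /* e.g. -1/EINTR to user */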

Chris



* [TUHS] signals and blocked in I/O
  2017-12-01 17:26         ` Larry McVoy
  2017-12-01 19:10           ` Chris Torek
@ 2017-12-01 21:33           ` Bakul Shah
  2017-12-01 22:38             ` Larry McVoy
  1 sibling, 1 reply; 30+ messages in thread
From: Bakul Shah @ 2017-12-01 21:33 UTC (permalink / raw)


On Dec 1, 2017, at 9:26 AM, Larry McVoy <lm at mcvoy.com> wrote:
> 
> On Fri, Dec 01, 2017 at 09:33:49AM -0700, Warner Losh wrote:
>>> Or what you do is kill the process, tear down all of the pages except those
>>> that are locked for I/O, leave those in the process and wait for the I/O to
>>> get done.  That might be simpler.
>> 
>> Perhaps. Even that may have issues with cleanup because you may also need
>> other pages to complete the I/O processing since I think that things like
>> aio allocate a tiny bit of memory associated with the requesting process
>> and need that memory to finish the I/O. It's certainly not a requirement
>> that all I/O initiated by userland have no extra state stored in the
>> process' address space associated with it. Since the unwinding happens at a
>> layer higher than the disk driver, who knows what those guys do, eh?
> 
> Yeah, it's not an easy fix but the problem we are having right now is that
> the system is thrashing.  Why the OOM code isn't fixing it I don't know.
> It just feels busted.

So the OOM code kills a (random) process in hopes of freeing up
some pages, but if this process is stuck in disk I/O, nothing
can be freed and everything grinds to a halt. Is this right?

If so, one workaround is to kill a process that is *not*
sleeping at an uninterruptible priority :-)/2 This is
separate from any policy of how to choose a victim.

If the queue of dirty pages is growing longer and longer
because the page washer can't keep up, this is analogous to
the bufferbloat problem in networking. You have to test
if this is what is going on. If so, maybe you can figure
out how to keep the queues short.

But before any fixes, I would strongly suggest instrumenting
the code to understand what is going on and then instrument
the code further to test out various hypotheses. Once a clear
mental model is in place, the fix will be obvious!




* [TUHS] signals and blocked in I/O
  2017-12-01 21:33           ` Bakul Shah
@ 2017-12-01 22:38             ` Larry McVoy
  2017-12-01 23:03               ` Ralph Corderoy
  0 siblings, 1 reply; 30+ messages in thread
From: Larry McVoy @ 2017-12-01 22:38 UTC (permalink / raw)


On Fri, Dec 01, 2017 at 01:33:46PM -0800, Bakul Shah wrote:
> > Yeah, it's not an easy fix but the problem we are having right now is that
> > the system is thrashing.  Why the OOM code isn't fixing it I don't know.
> > It just feels busted.
> 
> So OOM code kills a (random) process in hopes of freeing up
> some pages but if this process is stuck in diskIO, nothing
> can be freed and everything grinds to a halt. Is this right?

Yep, exactly.




* [TUHS] signals and blocked in I/O
  2017-12-01 22:38             ` Larry McVoy
@ 2017-12-01 23:03               ` Ralph Corderoy
  2017-12-01 23:09                 ` Larry McVoy
  0 siblings, 1 reply; 30+ messages in thread
From: Ralph Corderoy @ 2017-12-01 23:03 UTC (permalink / raw)


Hi Larry,

> > So OOM code kills a (random) process in hopes of freeing up some
> > pages but if this process is stuck in diskIO, nothing can be freed
> > and everything grinds to a halt.
>
> Yep, exactly.

Is that because the pages have been dirty for so long they've reached
the VM-writeback timeout even though there's no pressure to use them for
something else?  Or has that been lengthened because you don't fear
power loss wiping volatile RAM?

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



* [TUHS] signals and blocked in I/O
  2017-12-01 23:03               ` Ralph Corderoy
@ 2017-12-01 23:09                 ` Larry McVoy
  2017-12-01 23:42                   ` Bakul Shah
  2017-12-02 14:59                   ` Theodore Ts'o
  0 siblings, 2 replies; 30+ messages in thread
From: Larry McVoy @ 2017-12-01 23:09 UTC (permalink / raw)


On Fri, Dec 01, 2017 at 11:03:02PM +0000, Ralph Corderoy wrote:
> Hi Larry,
> 
> > > So OOM code kills a (random) process in hopes of freeing up some
> > > pages but if this process is stuck in diskIO, nothing can be freed
> > > and everything grinds to a halt.
> >
> > Yep, exactly.
> 
> Is that because the pages have been dirty for so long they've reached
> the VM-writeback timeout even though there's no pressure to use them for
> something else?  Or has that been lengthened because you don't fear
> power loss wiping volatile RAM?

I'm tinkering with the pageout daemon, so I'm trying to apply memory
pressure.  I have 10 25GB processes (25GB malloced each) and the
processes just walk the memory over and over.  This is on a 256GB main
memory machine (2-socket Haswell, 28 CPUs, 28 1TB SSDs, on loan from
Netflix).

It's the old "10 pounds of shit in a 5 pound bag" problem, same old stuff,
just a bigger bag.

The problem is that OOM can't kill the processes that are the problem:
they are stuck in disk wait.  That's why I started asking why you can't
kill a process that's in the middle of I/O.



* [TUHS] signals and blocked in I/O
  2017-12-01 19:10           ` Chris Torek
@ 2017-12-01 23:21             ` Dave Horsfall
  0 siblings, 0 replies; 30+ messages in thread
From: Dave Horsfall @ 2017-12-01 23:21 UTC (permalink / raw)


On Fri, 1 Dec 2017, Chris Torek wrote:

[...]

> This part is natural, since signal delivery means "run a function in 
> user space, with user privileges". If you're currently *in* the kernel 
> on behalf of some user process, that means you have to get *out* of it 
> in that same process.

Which leads to the classic comment (it's right up there with the famous 
"line 2238") along the lines of "we ask a process to do something to 
itself".

-- 
Dave Horsfall DTM (VK2KFU)  "Those who don't understand security will suffer."



* [TUHS] signals and blocked in I/O
  2017-12-01 23:09                 ` Larry McVoy
@ 2017-12-01 23:42                   ` Bakul Shah
  2017-12-02  0:48                     ` Larry McVoy
  2017-12-02 14:59                   ` Theodore Ts'o
  1 sibling, 1 reply; 30+ messages in thread
From: Bakul Shah @ 2017-12-01 23:42 UTC (permalink / raw)


On Fri, 01 Dec 2017 15:09:34 -0800 Larry McVoy <lm at mcvoy.com> wrote:
Larry McVoy writes:
> On Fri, Dec 01, 2017 at 11:03:02PM +0000, Ralph Corderoy wrote:
> > Hi Larry,
> > 
> > > > So OOM code kills a (random) process in hopes of freeing up some
> > > > pages but if this process is stuck in diskIO, nothing can be freed
> > > > and everything grinds to a halt.
> > >
> > > Yep, exactly.
> > 
> > Is that because the pages have been dirty for so long they've reached
> > the VM-writeback timeout even though there's no pressure to use them for
> > something else?  Or has that been lengthened because you don't fear
> > power loss wiping volatile RAM?
> 
> I'm tinkering with the pageout daemon so I'm trying to apply memory
> pressure.  I have 10 25GB processes (25GB malloced) and the processes just
> walk the memory over and over.  This is on a 256GB main memory machine
> (2 socket haswell, 28 cpus, 28 1TB SSDs, on loan from Netflix).

How many times do processes walk their memory before this condition
occurs? 

So what may be happening is that a process references a page,
it page faults, the kernel finds its phys page has been paged
out, so it looks for a free page, and once one is found, the
process blocks on the page-in.  Or, if there is no free page,
it has to wait until some other dirty page is paged out (but
this would be a different wait queue).  As more and more
processes do this, the system runs out of free pages.

Can you find out how many processes are waiting under what
conditions, how long they wait, and how these queue lengths are
changing over time?  You can use a ring buffer to capture the
last 2^N measurements and dump them in the debugger when
everything grinds to a halt.
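
Something as small as this is usually enough (a sketch; the
fields and the timestamp are placeholders for whatever you care
about):

    /* 2^N entries so the mask wraps the index for free; dump the
     * array from the debugger once the machine wedges. */
    #define NSAMP (1 << 12)
    struct sample { int when; int freepages; int nwaiters; };
    static struct sample ring[NSAMP];
    static unsigned int ringi;

    static void
    record_sample(int freepages, int nwaiters)
    {
            struct sample *s = &ring[ringi++ & (NSAMP - 1)];
            s->when = ticks;        /* or any cheap timestamp */
            s->freepages = freepages;
            s->nwaiters = nwaiters;
    }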

> It's the old "10 pounds of shit in a 5 pound bag" problem, same old stuff,
> just a bigger bag.
> 
> The problem is that OOM can't kill the processes that are the problem,
> they are stuck in disk wait.  That's why I started asking why can't you
> kill a process that's in the middle of I/O.

The OS equivalent of RED (random early drop) would be for a
process to kill itself, e.g. when some critical metric crosses
a high-water mark.

Another option would be to return with an EFAULT, and the
process can either kill itself or free up a page or
something. [I have used EFAULT to dynamically allocate *more*
pages, but there's no reason the same trick can't be used to
free up memory!]
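
(For reference, the user-level shape of that trick is a SIGSEGV
handler that fixes up the faulting address and lets the access
retry.  A sketch, with a hardwired page size standing in for
getpagesize():

    #include <signal.h>
    #include <stdint.h>
    #include <sys/mman.h>

    #define PGSZ 4096       /* assumption: 4K pages */

    /* Map a fresh page at the faulting address; the faulting
     * instruction then retries and succeeds.  Freeing memory
     * instead would be an munmap() here. */
    static void
    on_fault(int sig, siginfo_t *si, void *ctx)
    {
            void *pg = (void *)((uintptr_t)si->si_addr &
                ~(uintptr_t)(PGSZ - 1));
            mmap(pg, PGSZ, PROT_READ | PROT_WRITE,
                MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0);
    }

Install it with sigaction() and SA_SIGINFO.)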



* [TUHS] signals and blocked in I/O
  2017-12-01 23:42                   ` Bakul Shah
@ 2017-12-02  0:48                     ` Larry McVoy
  2017-12-02  1:40                       ` Bakul Shah
                                         ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Larry McVoy @ 2017-12-02  0:48 UTC (permalink / raw)


On Fri, Dec 01, 2017 at 03:42:15PM -0800, Bakul Shah wrote:
> On Fri, 01 Dec 2017 15:09:34 -0800 Larry McVoy <lm at mcvoy.com> wrote:
> Larry McVoy writes:
> > On Fri, Dec 01, 2017 at 11:03:02PM +0000, Ralph Corderoy wrote:
> > > Hi Larry,
> > > 
> > > > > So OOM code kills a (random) process in hopes of freeing up some
> > > > > pages but if this process is stuck in diskIO, nothing can be freed
> > > > > and everything grinds to a halt.
> > > >
> > > > Yep, exactly.
> > > 
> > > Is that because the pages have been dirty for so long they've reached
> > > the VM-writeback timeout even though there's no pressure to use them for
> > > something else?  Or has that been lengthened because you don't fear
> > > power loss wiping volatile RAM?
> > 
> > I'm tinkering with the pageout daemon so I'm trying to apply memory
> > pressure.  I have 10 25GB processes (25GB malloced) and the processes just
> > walk the memory over and over.  This is on a 256GB main memory machine
> > (2 socket haswell, 28 cpus, 28 1TB SSDs, on loan from Netflix).
> 
> How many times do processes walk their memory before this condition
> occurs? 

Until free memory goes to ~0.  That's the point, I'm trying to 
improve things when there is too much pressure on memory.

> So what may be happening is that a process references a page,
> it page faults, the kernel finds its phys page has been paged
> out, so it looks for a free page and once a free page is
> found, the process will block on page in. Or if there is no
> free page, it has to wait until some other dirty page is paged
> out (but this would be a different wait queue).  As more and
> more processes do this, the system runs out of all free pages.

Yeah.

> Can you find out how many processes are waiting under what
> conditions, how long they wait and how these queue lengths are
> changing over time?  

So I have 10 processes, they all run until the system starts to
thrash, then they are all in wait mode for memory but there isn't
any (and there is no swap configured).

The fundamental problem is that they are sleeping waiting for memory to
be freed.  They are NOT in I/O mode, there is no DMA happening, this is
main memory, it is not backed by swap, there is no swap.  So they are
sleeping waiting for the pageout daemon to free some memory.  It's not
going to free their memory because there is no place to stash (no swap).
So it's trying to free other memory.

The real question is where did they go to sleep, and why did they sleep
without PCATCH on?  If I can find the place where they try to alloc a
page, fail, and go to sleep, I could either

a) commit seppuku because we are out of memory and I'm part of the problem
b) go into a sleep / wakeup / check signals loop

I am reminded by you all that we ask the process to do it to itself,
but there does seem to be a way to sleep and still respect signals; the
tty stuff does that.  So if I can find this place, determine that I'm
just asking for memory, not I/O, and sleep with PCATCH on, then I might
be golden.

Where "golden" means I can kill the process, and the OOM thread could
do it for me.
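
Roughly this shape (a sketch; vm_page_count_free() is real, but
the wait channel and priority here are stand-ins for whatever the
pageout code actually uses):

    /* Option (b): wait for free pages, but honor signals so the
     * OOM thread -- or the user -- can kill us. */
    while (vm_page_count_free() < needed) {
            error = tsleep(&free_page_chan, PVM | PCATCH, "pfault", hz);
            if (error == EINTR || error == ERESTART)
                    return (error);  /* unwind; now we can be killed */
    }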

Thoughts?



* [TUHS] signals and blocked in I/O
  2017-12-02  0:48                     ` Larry McVoy
@ 2017-12-02  1:40                       ` Bakul Shah
  2017-12-03 13:50                       ` Ralph Corderoy
  2017-12-04 16:36                       ` arnold
  2 siblings, 0 replies; 30+ messages in thread
From: Bakul Shah @ 2017-12-02  1:40 UTC (permalink / raw)


On Fri, 01 Dec 2017 16:48:50 -0800 Larry McVoy <lm at mcvoy.com> wrote:
Larry McVoy writes:
> On Fri, Dec 01, 2017 at 03:42:15PM -0800, Bakul Shah wrote:
> > On Fri, 01 Dec 2017 15:09:34 -0800 Larry McVoy <lm at mcvoy.com> wrote:
> > Larry McVoy writes:
> > > On Fri, Dec 01, 2017 at 11:03:02PM +0000, Ralph Corderoy wrote:
> > > > Hi Larry,
> > > > 
> > > > > > So OOM code kills a (random) process in hopes of freeing up some
> > > > > > pages but if this process is stuck in diskIO, nothing can be freed
> > > > > > and everything grinds to a halt.
> > > > >
> > > > > Yep, exactly.
> > > > 
> > > > Is that because the pages have been dirty for so long they've reached
> > > > the VM-writeback timeout even though there's no pressure to use them for
> > > > something else?  Or has that been lengthened because you don't fear
> > > > power loss wiping volatile RAM?
> > > 
> > > I'm tinkering with the pageout daemon so I'm trying to apply memory
> > > pressure.  I have 10 25GB processes (25GB malloced) and the processes just
> > > walk the memory over and over.  This is on a 256GB main memory machine
> > > (2 socket haswell, 28 cpus, 28 1TB SSDs, on loan from Netflix).
> > 
> > How many times do processes walk their memory before this condition
> > occurs? 
> 
> Until free memory goes to ~0.  That's the point, I'm trying to 
> improve things when there is too much pressure on memory.

You said 10x25GB but you have 256GB. So there is still
6GB left...

> 
> > So what may be happening is that a process references a page,
> > it page faults, the kernel finds its phys page has been paged
> > out, so it looks for a free page and once a free page is
> > found, the process will block on page in. Or if there is no
> > free page, it has to wait until some other dirty page is paged
> > out (but this would be a different wait queue).  As more and
> > more processes do this, the system runs out of all free pages.
> 
> Yeah.
> 
> > Can you find out how many processes are waiting under what
> > conditions, how long they wait and how these queue lengths are
> > changing over time?  
> 
> So I have 10 processes, they all run until the system starts to
> thrash, then they are all in wait mode for memory but there isn't
> any (and there is no swap configured).
> 
> The fundamental problem is that they are sleeping waiting for memory to
> be freed.  They are NOT in I/O mode, there is no DMA happening, this is
> main memory, it is not backed by swap, there is no swap.  So they are
> sleeping waiting for the pageout daemon to free some memory.  It's not
> going to free their memory because there is no place to stash (no swap).
> So it's trying to free other memory.

This confuses me. Before I make more false assumptions,
can you show the code?

> The real question is where did they go to sleep and why did they sleep
> without PCATCH on?  If I can find that place where they are trying to
> alloc a page and failed and they go to sleep there, I could either

Can you use kgdb to find out where they sleep?

> a) commit seppuku because we are out of memory and I'm part of the problem
> b) go into a sleep / wakeup / check signals loop
> 
> I am reminded by you all that we ask the process to do it to itself but
> there does seem to be a way to sleep and respect signals, the tty stuff
> does that.  So if I can find this place, determine that I'm just asking
> for memory, not I/O, and sleep with PCATCH on then I might be golden.

Won't kgdb tell you? Or you can insert printfs. You should also print
something in your program as it walks a few pages and see if this happens
at almost the same time (when all pages are used up).

tty code probably assumes memory shortfall is a short term issue
(which is likely). Not the case with your test (unless I misguessed).

> Where "golden" means I can kill the process and the OOM thread could do
> it for me.
> 
> Thoughts?

Now I am starting to think this happens as soon as all the
phys pages are used up. But looking at your program would
help.

[removed cc: TUHS]



* [TUHS] signals and blocked in I/O
  2017-12-01 23:09                 ` Larry McVoy
  2017-12-01 23:42                   ` Bakul Shah
@ 2017-12-02 14:59                   ` Theodore Ts'o
  1 sibling, 0 replies; 30+ messages in thread
From: Theodore Ts'o @ 2017-12-02 14:59 UTC (permalink / raw)


On Fri, Dec 01, 2017 at 03:09:34PM -0800, Larry McVoy wrote:
> 
> It's the old "10 pounds of shit in a 5 pound bag" problem, same old stuff,
> just a bigger bag.
> 
> The problem is that OOM can't kill the processes that are the problem,
> they are stuck in disk wait.  That's why I started asking why can't you
> kill a process that's in the middle of I/O.

You may need to solve the problem much earlier, by write throttling
the process which is generating so many dirty pages in the first
place.  At one point Linux would press-gang the badly behaved process
which was generating lots of dirty pages into helping to deactivate
and clean pages; it doesn't do this any more, but stopping processes
which are being badly behaved until the writeback daemons can catch up
is certainly kinder than OOM-killing the bad process.
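
A sketch of that style of throttling (the names here are
stand-ins; Linux's real version lives in balance_dirty_pages()):

    /* Make the process doing the dirtying pay for it: above the
     * high-water mark it sleeps until writeback catches up. */
    while (ndirty_pages() > dirty_hiwat) {
            kick_writeback();       /* wake the flusher threads */
            tsleep(&writeback_done, PVM, "dirtyw", hz / 10);
    }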

Are you using ZFS?  It does have a write throttling knob, apparently.

    	      	       	    	   	 	    - Ted



* [TUHS] signals and blocked in I/O
  2017-12-02  0:48                     ` Larry McVoy
  2017-12-02  1:40                       ` Bakul Shah
@ 2017-12-03 13:50                       ` Ralph Corderoy
  2017-12-04 16:36                       ` arnold
  2 siblings, 0 replies; 30+ messages in thread
From: Ralph Corderoy @ 2017-12-03 13:50 UTC (permalink / raw)


Hi Larry,

> The fundamental problem is that they are sleeping waiting for memory
> to be freed.  They are NOT in I/O mode, there is no DMA happening,
> this is main memory, it is not backed by swap, there is no swap.  So
> they are sleeping waiting for the pageout daemon to free some memory.
> It's not going to free their memory because there is no place to stash
> (no swap).  So it's trying to free other memory.

Right.

> The real question is where did they go to sleep

Reading through from vm_fault(), is it vm_waitpfault()?
http://fxr.watson.org/fxr/source/vm/vm_page.c?im=3?v=FREEBSD55#L2749

Or, chase down where some of sysctl(3)'s CTL_VM constants are used to
see how their behaviour is affected?

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



* [TUHS] signals and blocked in I/O
  2017-12-02  0:48                     ` Larry McVoy
  2017-12-02  1:40                       ` Bakul Shah
  2017-12-03 13:50                       ` Ralph Corderoy
@ 2017-12-04 16:36                       ` arnold
  2017-12-04 16:58                         ` Arthur Krewat
                                           ` (2 more replies)
  2 siblings, 3 replies; 30+ messages in thread
From: arnold @ 2017-12-04 16:36 UTC (permalink / raw)


Larry McVoy <lm at mcvoy.com> wrote:

> So I have 10 processes, they all run until the system starts to
> thrash, then they are all in wait mode for memory but there isn't
> any (and there is no swap configured).

Um, pardon me for asking the obvious question, but why not just configure
a few gigs of swap to give the OS some breathing room?

Most modern systems let you use a regular old file in the filesystem
for swap space, instead of having to repartition your disk and use a
dedicated partition.  I'd be surprised if your *BSD box didn't let you
do that too.  It's a little slower, but a gazillion times more
convenient.

Just a thought,

Arnold



* [TUHS] signals and blocked in I/O
  2017-12-04 16:36                       ` arnold
@ 2017-12-04 16:58                         ` Arthur Krewat
  2017-12-04 17:19                         ` Warner Losh
  2017-12-04 22:07                         ` Dave Horsfall
  2 siblings, 0 replies; 30+ messages in thread
From: Arthur Krewat @ 2017-12-04 16:58 UTC (permalink / raw)


I've had some strange experiences with Solaris 11, where if I didn't
have enough swap, I'd get OOM issues with Oracle databases (SGA) on
startup and various other things.

While the Oracle docs say it's OK to have a small swap area if you have
plenty of RAM, I've experienced the exact opposite.  If the ZFS ARC is
large (and it always is) and something needs memory, then if there isn't
enough swap to "guarantee" the allocation, the allocation will fail
(disregarding the fact that the ZFS ARC can be tuned, and Solaris 11.3
now has an even better setting, user_reserve_hint_pct).  It won't wait
to flush out the ZFS ARC to make room; it'll just fail outright.

With a huge swap area that's at least half the size of RAM (in boxes 
that range from 96GB to 256GB), even though it'll never touch the swap, 
it works just fine.

Now, I understand your issue is with BSD, but if it's ZFS, perhaps 
there's something not-so-different about the two environments.

$.02

On 12/4/2017 11:36 AM, arnold at skeeve.com wrote:
> Larry McVoy <lm at mcvoy.com> wrote:
>
>> So I have 10 processes, they all run until the system starts to
>> thrash, then they are all in wait mode for memory but there isn't
>> any (and there is no swap configured).
> Um, pardon me for asking the obvious question, but why not just configure
> a few gigs of swap to give the OS some breathing room?
>
> Most modern systems let you use a regular old file in the filesystem
> for swap space, instead of having to repartition your disk and using a
> dedicated partition. I'd be suprised if your *BSD box didn't let you do
> that too.  It's a little slower, but a gazillion times more convenient.
>
> Just a thought,
>
> Arnold
>




* [TUHS] signals and blocked in I/O
  2017-12-04 16:36                       ` arnold
  2017-12-04 16:58                         ` Arthur Krewat
@ 2017-12-04 17:19                         ` Warner Losh
  2017-12-05  2:12                           ` Bakul Shah
  2017-12-04 22:07                         ` Dave Horsfall
  2 siblings, 1 reply; 30+ messages in thread
From: Warner Losh @ 2017-12-04 17:19 UTC (permalink / raw)


On Mon, Dec 4, 2017 at 9:36 AM, <arnold at skeeve.com> wrote:

> Larry McVoy <lm at mcvoy.com> wrote:
>
> > So I have 10 processes, they all run until the system starts to
> > thrash, then they are all in wait mode for memory but there isn't
> > any (and there is no swap configured).
>
> Um, pardon me for asking the obvious question, but why not just configure
> a few gigs of swap to give the OS some breathing room?
>
> Most modern systems let you use a regular old file in the filesystem
> for swap space, instead of having to repartition your disk and using a
> dedicated partition. I'd be suprised if your *BSD box didn't let you do
> that too.  It's a little slower, but a gazillion times more convenient.
>

The deployed systems in the field have swap space configured, but if
we're not careful we can still run out.  Larry's tests are at the
extreme limits, to be sure, and not having swap exacerbates the bad
behavior at the limit.  If he's trying to ameliorate bad effects, doing
it on the worst-case scenario certainly helps.  The BSD box allows for
it, and it's not that much slower, but it kinda misses the point of the
exercise.

Also, the systems in question often operate at the limits of the disk
subsystem, so additional write traffic is undesirable.  They are also
SSDs, so doubly undesirable where avoidable.

Warner

* [TUHS] signals and blocked in I/O
  2017-12-04 16:36                       ` arnold
  2017-12-04 16:58                         ` Arthur Krewat
  2017-12-04 17:19                         ` Warner Losh
@ 2017-12-04 22:07                         ` Dave Horsfall
  2017-12-04 22:54                           ` Ron Natalie
  2 siblings, 1 reply; 30+ messages in thread
From: Dave Horsfall @ 2017-12-04 22:07 UTC (permalink / raw)


On Mon, 4 Dec 2017, arnold at skeeve.com wrote:

> Most modern systems let you use a regular old file in the filesystem for 
> swap space, instead of having to repartition your disk and using a 
> dedicated partition. I'd be suprised if your *BSD box didn't let you do 
> that too.  It's a little slower, but a gazillion times more convenient.

Doesn't it have to be a contiguous file (for DMA), or is scatter/gather 
now supported?

-- 
Dave Horsfall DTM (VK2KFU)  "Those who don't understand security will suffer."



* [TUHS] signals and blocked in I/O
  2017-12-04 22:07                         ` Dave Horsfall
@ 2017-12-04 22:54                           ` Ron Natalie
  2017-12-04 22:56                             ` Warner Losh
  0 siblings, 1 reply; 30+ messages in thread
From: Ron Natalie @ 2017-12-04 22:54 UTC (permalink / raw)


Nothing says a page has to be loaded in one DMA.  The swap file isn't
allocated any differently than any other file on Linux systems.  About
the only thing you have to do is make sure that all the blocks are
populated.  UNIX normally allocates files as sparse, and the swap code
doesn't want to have to worry about allocating blocks when it comes to
paging out.



-----Original Message-----
From: TUHS [mailto:tuhs-bounces@minnie.tuhs.org] On Behalf Of Dave Horsfall
Sent: Monday, December 4, 2017 5:07 PM
To: The Eunuchs Hysterical Society
Subject: Re: [TUHS] signals and blocked in I/O

On Mon, 4 Dec 2017, arnold at skeeve.com wrote:

> Most modern systems let you use a regular old file in the filesystem 
> for swap space, instead of having to repartition your disk and using a 
> dedicated partition. I'd be suprised if your *BSD box didn't let you 
> do that too.  It's a little slower, but a gazillion times more convenient.

Doesn't it have to be a contiguous file (for DMA), or is scatter/gather now
supported?

--
Dave Horsfall DTM (VK2KFU)  "Those who don't understand security will
suffer."




* [TUHS] signals and blocked in I/O
  2017-12-04 22:54                           ` Ron Natalie
@ 2017-12-04 22:56                             ` Warner Losh
  2017-12-05  0:49                               ` Dave Horsfall
  0 siblings, 1 reply; 30+ messages in thread
From: Warner Losh @ 2017-12-04 22:56 UTC (permalink / raw)


Pages are pages. The filesystem handles the details for the offset of
each one. There's no contiguous-on-disk requirement.

Warner

On Mon, Dec 4, 2017 at 3:54 PM, Ron Natalie <ron at ronnatalie.com> wrote:

> Nothing says a page has to be loaded in one DMA.    The swap file isn't
> allocated any different than any other file on the LINUX systems.    About
> the only thing you have to do is make
> sure that all the blocks are populated.    UNIX normally allocates the
> files
> as sparse and the swap code doesn't want to have to worry about allocating
> blocks when it comes to paging out.
>
>
>
> -----Original Message-----
> From: TUHS [mailto:tuhs-bounces at minnie.tuhs.org] On Behalf Of Dave
> Horsfall
> Sent: Monday, December 4, 2017 5:07 PM
> To: The Eunuchs Hysterical Society
> Subject: Re: [TUHS] signals and blocked in I/O
>
> On Mon, 4 Dec 2017, arnold at skeeve.com wrote:
>
> > Most modern systems let you use a regular old file in the filesystem
> > for swap space, instead of having to repartition your disk and using a
> > dedicated partition. I'd be suprised if your *BSD box didn't let you
> > do that too.  It's a little slower, but a gazillion times more
> convenient.
>
> Doesn't it have to be a contiguous file (for DMA), or is scatter/gather now
> supported?
>
> --
> Dave Horsfall DTM (VK2KFU)  "Those who don't understand security will
> suffer."
>
>

* [TUHS] signals and blocked in I/O
  2017-12-04 22:56                             ` Warner Losh
@ 2017-12-05  0:49                               ` Dave Horsfall
  2017-12-05  0:58                                 ` Arthur Krewat
  2017-12-05  2:15                                 ` Dave Horsfall
  0 siblings, 2 replies; 30+ messages in thread
From: Dave Horsfall @ 2017-12-05  0:49 UTC (permalink / raw)


On Mon, 4 Dec 2017, Warner Losh wrote:

[ Swap files ]

> Pages are pages. The filesystem handles the details for the offset of 
> each one. There's no contiguous on disk requirement.

I've used at least one *ix OS that required it to be both contiguous
and pre-allocated; Slowaris?

-- 
Dave Horsfall DTM (VK2KFU)  "Those who don't understand security will suffer."



* [TUHS] signals and blocked in I/O
  2017-12-05  0:49                               ` Dave Horsfall
@ 2017-12-05  0:58                                 ` Arthur Krewat
  2017-12-05  2:15                                 ` Dave Horsfall
  1 sibling, 0 replies; 30+ messages in thread
From: Arthur Krewat @ 2017-12-05  0:58 UTC (permalink / raw)


On 12/4/2017 7:49 PM, Dave Horsfall wrote:
> On Mon, 4 Dec 2017, Warner Losh wrote:
>
> [ Swap files ]
>
>> Pages are pages. The filesystem handles the details for the offset of 
>> each one. There's no contiguous on disk requirement.
>
> I've used at least one *ix OS that required it be both contiguous and 
> pre-allocated; Slowaris?
>
Nope, unless mkfile is guaranteed to give you contiguous space:

# uname -a
SunOS vmsol8 5.8 Generic_108529-05 i86pc i386 i86pc
# mkfile 100m a.a
# swap -a /a.a
# swap -l
swapfile             dev  swaplo blocks   free
/dev/dsk/c0d0s1     102,1       8 1048936 1048896
/a.a                  -        8 204792 204792






* [TUHS] signals and blocked in I/O
  2017-12-04 17:19                         ` Warner Losh
@ 2017-12-05  2:12                           ` Bakul Shah
  0 siblings, 0 replies; 30+ messages in thread
From: Bakul Shah @ 2017-12-05  2:12 UTC (permalink / raw)


On Mon, 04 Dec 2017 10:19:19 -0700 Warner Losh <imp at bsdimp.com> wrote:
Warner Losh writes:
> 
> On Mon, Dec 4, 2017 at 9:36 AM, <arnold at skeeve.com> wrote:
> 
> > Larry McVoy <lm at mcvoy.com> wrote:
> >
> > > So I have 10 processes, they all run until the system starts to
> > > thrash, then they are all in wait mode for memory but there isn't
> > > any (and there is no swap configured).
> >
> > Um, pardon me for asking the obvious question, but why not just configure
> > a few gigs of swap to give the OS some breathing room?
> >
> > Most modern systems let you use a regular old file in the filesystem
> > for swap space, instead of having to repartition your disk and using a
> > dedicated partition. I'd be suprised if your *BSD box didn't let you do
> > that too.  It's a little slower, but a gazillion times more convenient.
> 
> The deployed systems in the field have swap space configured. But if we're
> not careful we can still run out. Larry's tests are at the extreme limits,
> to be sure, and not having swap exacerbates the bad behavior at the limit.

Conceptually they are somewhat different cases.

1) In the no-swap case *no* pages will be available once all
   the phys pages are used up. To recover, the kernel must
   free up space by killing some process.

2) In the swap case pages will eventually be available. The
   paging rate becomes the bottleneck but no need to kill
   any process.

In case 1) there may be other disk traffic to confuse things.
But eventually that should die down. However I am not sure
FreeBSD has a clean way to indicate there are no free pages.
It calls OOM logic as a last resort. In case 2) OOM should
never be called.

I think the issue is that the kernel is unable to kill a
process that is waiting for a free page. Since all of them are
doing the same thing, the system hangs. I think this wait has
to be made interruptible to be able to kill a process. Of
course that may break something else....

It may be better to avoid the whole issue in the first place on
a latency-sensitive production machine, by providing a way for a
process to ask that there be enough backing store for any mmapped
pages (and thus avoid swapping or running out of memory). Not
sure if BSD originally provided such a guarantee...
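
mlock(2) is the closest existing knob: a sketch of asking for
the guarantee up front, so the failure is a clean error return
rather than thrashing or an OOM kill later:

    #include <err.h>
    #include <stddef.h>
    #include <sys/mman.h>

    /* Pin len bytes of anonymous memory; fails cleanly (e.g. on
     * RLIMIT_MEMLOCK) if residency can't be promised. */
    static void *
    pin_region(size_t len)
    {
            void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_ANON | MAP_PRIVATE, -1, 0);
            if (p == MAP_FAILED || mlock(p, len) != 0)
                    err(1, "cannot pin %zu bytes", len);
            return (p);
    }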

> If he's trying to ameliorate bad effects, doing it on worst case scenario
> certainly helps. The BSD box allows for it, and it's not that much slower,
> but it kinda misses the point of the exercise.

> Also, the systems in questions often operate at the limits of the disk
> subsystem, so additional write traffic is undesirable. They are also SSDs,
> so doubly undesirable where avoidable.

This can be tested by writing some custom logic: instead of
writing out a page, compress it and stash it in memory (or a
ramdisk). The test can write highly compressible patterns.



* [TUHS] signals and blocked in I/O
  2017-12-05  0:49                               ` Dave Horsfall
  2017-12-05  0:58                                 ` Arthur Krewat
@ 2017-12-05  2:15                                 ` Dave Horsfall
  2017-12-05  2:54                                   ` Clem cole
  1 sibling, 1 reply; 30+ messages in thread
From: Dave Horsfall @ 2017-12-05  2:15 UTC (permalink / raw)


On Tue, 5 Dec 2017, Dave Horsfall wrote:

> I've used at least one *ix OS that required it be both contiguous and 
> pre-allocated; Slowaris?

OK, likely not, but I've used literally dozens of systems[*] over the last 
40 years or so; this one was definitely Unix-ish (otherwise I wouldn't've 
used it) and it definitely required a contiguous file for ease of DMA.

Wish I could remember it, but the ol' memory is one of the first things to 
go when in your 60s, and I never could remember what comes afterwards...

[*]
Pretty much AIX to Xenix (when you pronounce it with a "Z").

-- 
Dave Horsfall DTM (VK2KFU)  "Those who don't understand security will suffer."



* [TUHS] signals and blocked in I/O
  2017-12-05  2:15                                 ` Dave Horsfall
@ 2017-12-05  2:54                                   ` Clem cole
  0 siblings, 0 replies; 30+ messages in thread
From: Clem cole @ 2017-12-05  2:54 UTC (permalink / raw)


I cannot speak for others, but both Masscomp’s Real Time Unix and Stellar’s Stellix supported contiguous files and could swap to them.  And yes, they were preallocated at creat(2) time.

Sent from my PDP-7 Running UNIX V0 expect things to be almost but not quite. 

> On Dec 4, 2017, at 9:15 PM, Dave Horsfall <dave at horsfall.org> wrote:
> 
>> On Tue, 5 Dec 2017, Dave Horsfall wrote:
>> 
>> I've used at least one *ix OS that required it be both contiguous and pre-allocated; Slowaris?
> 
> OK, likely not, but I've used literally dozens of systems[*] over the last 40 years or so; this one was definitely Unix-ish (otherwise I wouldn't've used it) and it definitely required a contiguous file for ease of DMA.
> 
> Wish I could remember it, but the ol' memory is one of the first things to go when in your 60s, and I never could remember what comes afterwards...
> 
> [*]
> Pretty much AIX to Xenix (when you pronounce it with a "Z").
> 
> -- 
> Dave Horsfall DTM (VK2KFU)  "Those who don't understand security will suffer."


