9fans - fans of the OS Plan 9 from Bell Labs
* Re: [9fans] Announcing: writable /proc/pid/ns
       [not found] <<fe41879c1001020924m7e964c42g3788eb7a8a5af982@mail.gmail.com>
@ 2010-01-02 17:28 ` erik quanstrom
From: erik quanstrom @ 2010-01-02 17:28 UTC
  To: 9fans

On Sat Jan  2 12:26:46 EST 2010, akumar@mail.nanosouffle.net wrote:
> > can you give an example of a use of this feature that can't be
> > accomplished by plumbing "Local 9fs $server"?
>
> Wanting to provide access to sources repo or data on external
> hdd, to a program running in background (say, httpd), where the
> connection to the respective server has recently gone down.
> Without having to restart the program.

why is restarting httpd a problem?  (httpd should be started
from listen(1) anyway.)

- erik




* Re: [9fans] Announcing: writable /proc/pid/ns
       [not found] <<b2d3848d1001021140n42a14c70vaa2d591479ad1df@mail.gmail.com>
@ 2010-01-02 19:54 ` erik quanstrom
From: erik quanstrom @ 2010-01-02 19:54 UTC
  To: 9fans

> A very good point, and I hope you don't think the response "trust the
> user to administer their system and accept that it is possible to do
> broken things" is trying to dodge the issue.

no, i don't.  i think that is a reasonable answer.  however,
it does change the use case considerably.  and it makes
me wary of building tools based on this feature, excepting
emergency tools meant to attempt system rescue.

to me, manipulating the ns through the fs would make the most
sense if one could also create a new process through /proc.
but the reasons for this depend on some wild and unfinished
ideas.

> to the CPU connection to another machine. The fact that rewriting a
> namespace doesn't change the chan associated with a currently open
> file descriptor imposes a bit of sanity assurance that standard
> filesystem operations won't go berserk just because they were in the
> middle of a write when you wrote a ns operation to their ns file.

this does have the potential to create vast confusion.  i should
have mentioned it before.

- erik




* [9fans] Announcing: writable /proc/pid/ns
@ 2010-01-02 19:40 mycroftiv 9gridchan
From: mycroftiv 9gridchan @ 2010-01-02 19:40 UTC
  To: 9fans

>i'm also not convinced that changing the namespace
>of a running process is a safe operation, in general.
>for example, wikifs creates lock files.  suppose you
>swap out the directory with someone holding the lock.

>what's the plan for dealing with this difficulty?

A very good point, and I hope you don't think the response "trust the
user to administer their system and accept that it is possible to do
broken things" is trying to dodge the issue. It is unquestionably true
that programs generally don't expect their namespace to be modified
underneath them, and doing so could cause them to lose track of files
they need, write data to the wrong place, etc. In my testing and work
I have mostly been concerned with causing problems a layer down -
inadvertently corrupting kernel memory structures, causing refcounting
to go awry, things of that nature. If the potential problems are
limited to the fact that rewriting process namespaces is sometimes a
mistake, I don't think that is different than the fact that you can
cause broken things to happen by doing other operations supported by
/proc.

For what it's worth, in my testing things have tended to work as I
hoped and expected, even when trying things that felt a bit 'risky'
like modifying the namespace of the exportfs process serving /mnt/term
to the CPU connection to another machine. The fact that rewriting a
namespace doesn't change the chan associated with a currently open
file descriptor imposes a bit of sanity assurance that standard
filesystem operations won't go berserk just because they were in the
middle of a write when you wrote a ns operation to their ns file.
However, I admit to not having the comprehensive knowledge of Plan 9
kernel internals I'd need to make a definitive statement that my
patches to this couldn't potentially be used to cause unintended
behavior inside the kernel.
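
A toy illustration of that point about open file descriptors, in Python with invented names (Proc, open and write are stand-ins, and a plain list plays the part of a file on some server). It is only a sketch of the idea, not kernel behavior: the name is resolved once at open time, so a later change to the namespace affects new opens but not a descriptor that is already open.

```python
class Proc:
    """Toy process: a namespace (path -> file) and a table of open chans."""

    def __init__(self):
        self.ns = {}     # path -> list standing in for a file on some server
        self.fds = []    # open chans; each keeps the file it resolved to

    def open(self, path):
        chan = self.ns[path]       # the name is resolved once, at open time
        self.fds.append(chan)
        return len(self.fds) - 1   # the new file descriptor

    def write(self, fd, data):
        self.fds[fd].append(data)  # goes through the chan, not the name


old_server, new_server = [], []
p = Proc()
p.ns["/n/data/log"] = old_server   # hypothetical path, for illustration only
fd = p.open("/n/data/log")
p.write(fd, "before")
p.ns["/n/data/log"] = new_server   # the namespace is rewritten from outside
p.write(fd, "after")               # still lands on old_server
```

After the rewrite, the open descriptor and the name refer to different files, which is the sanity guarantee described above and also the place where confusion can start.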

Again, thanks for the interest.
~Mycroftiv




* Re: [9fans] Announcing: writable /proc/pid/ns
       [not found] <<b2d3848d1001021052y48d098cdqeb0291be91aead0f@mail.gmail.com>
@ 2010-01-02 19:07 ` erik quanstrom
From: erik quanstrom @ 2010-01-02 19:07 UTC
  To: 9fans

i should have started out by stating that i find
what you've done here interesting.  i'm just not
sure how it's useful yet.  at least to me.

i'm also not convinced that changing the namespace
of a running process is a safe operation, in general.
for example, wikifs creates lock files.  suppose you
swap out the directory with someone holding the lock.

what's the plan for dealing with this difficulty?

- erik




* [9fans] Announcing: writable /proc/pid/ns
@ 2010-01-02 18:52 mycroftiv 9gridchan
From: mycroftiv 9gridchan @ 2010-01-02 18:52 UTC
  To: 9fans

>can you give an example of a use of this feature that can't be
>accomplished by plumbing "Local 9fs $server"?

Most obviously, plumbing a Local command can't rewrite the namespace
of processes on a remote server whose /proc you are importing. It also
cannot be used to make modifications targeted at a single specific
running process. It is a good mechanism, and one I make use of (a set
of scripts I wrote for grid resource indexing uses it heavily), but it
isn't a general purpose namespace manipulation tool. Processes like
service listeners started by cpurc are unlikely to have a plumber in
their namespace, and starting a lot of extra plumber processes to act
as namespace agents doesn't seem like a sensible approach.

In the context of a user using a single machine as a self-sufficient
environment the plumb Local trick is probably just fine for most of
their namespace manipulation needs, but in the context of trying to build
a larger grid where multiple machines are all providing services 'a la
carte' and a single cpu may be hosting processes with widely divergent
namespaces, more general and precise tools are useful. In the other
post I made today I discussed a modified boot system that creates both
a small self-sufficient ramdisk based environment and a standard
disk-fileserver based one - being able to shift which set of resources
the active processes on such a machine will reference has been useful to
me in making sure I can keep services available even if I need to
reboot a fileserver node.

Philosophically, I think that if the freedom of per process namespaces
is a good thing - which I certainly believe it is - then making
process namespaces as flexible and precisely controllable as possible
enhances that quality. Because this modification was only done very
recently, I haven't yet had time to start building some of the scripts
that can make use of it - but as an example, I think a script that can
'synchronize' the namespace of two given processes by finding binds
and mounts present in the ns of one but not in the other and then
issuing the commands to make the target process ns match the given
model would be a nice thing to have. I can supply more examples from
my own usage but I'm probably already past my word quota for the day.
To sum up, I think that the idea of controlling the ns of processes
spread across a multimachine grid via importing multiple /proc and
using scripted tools has obvious utility for dynamic grid computing
where service nodes can enter and leave the resource matrix freely.
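
A rough sketch of that 'synchronize' idea, written in Python rather than rc purely for illustration. It assumes each ns file reads as one bind or mount command per line, in the style ns(1) prints, and the function name missing_ns_ops is made up here. It ignores unmounts, ordering subtleties, and quoted names containing spaces.

```python
def missing_ns_ops(model_ns, target_ns):
    """Return bind/mount lines found in model_ns but not in target_ns.

    Both arguments are the text of an ns file. Order follows the model.
    """
    def ops(text):
        found = []
        for line in text.splitlines():
            fields = line.split()
            # Keep only bind and mount commands; skip the trailing 'cd' line.
            if fields and fields[0] in ("bind", "mount"):
                found.append(" ".join(fields))  # normalize spacing
        return found

    have = set(ops(target_ns))
    return [op for op in ops(model_ns) if op not in have]


# Illustrative ns contents, not taken from a real system.
model = "bind '#c' /dev\nmount /srv/sources /n/sources\ncd /usr/glenda\n"
target = "bind '#c' /dev\ncd /usr/glenda\n"
for op in missing_ns_ops(model, target):
    print(op)
```

On a kernel carrying the writeprocns changes, each returned line could then be echoed to the target process's ns file, as in the rio example from the original announcement.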

~mycroftiv




* Re: [9fans] Announcing: writable /proc/pid/ns
  2010-01-02 17:08 ` erik quanstrom
@ 2010-01-02 17:24   ` Akshat Kumar
From: Akshat Kumar @ 2010-01-02 17:24 UTC
  To: Fans of the OS Plan 9 from Bell Labs

> can you give an example of a use of this feature that can't be
> accomplished by plumbing "Local 9fs $server"?

Wanting to provide access to sources repo or data on external
hdd, to a program running in background (say, httpd), where the
connection to the respective server has recently gone down.
Without having to restart the program.


ak




* Re: [9fans] Announcing: writable /proc/pid/ns
       [not found] <<b2d3848d1001020612w51eb261ao5cd830e1dc5db8f3@mail.gmail.com>
@ 2010-01-02 17:08 ` erik quanstrom
  2010-01-02 17:24   ` Akshat Kumar
From: erik quanstrom @ 2010-01-02 17:08 UTC
  To: 9fans

> In addition to adding in new bindings to running processes like rio,
> plumber, dossrv, and exportfs, this mechanism is also fully network
> transparent and useful when importing /proc from remote machines.
> Rewriting the namespace of remote processes is a powerful mechanism
> for fine-grained interactive control. Aux/lines can be used for
> wholesale modifications to a namespace.

can you give an example of a use of this feature that can't be
accomplished by plumbing "Local 9fs $server"?

- erik




* [9fans] Announcing: writable /proc/pid/ns
@ 2010-01-02 14:12 mycroftiv 9gridchan
From: mycroftiv 9gridchan @ 2010-01-02 14:12 UTC
  To: 9fans

In celebration of the arrival of 2010, the 9gridchan.org community
gridding development team - aka one guy with a basement full of
ethernet cables - would like to announce several new tools for Plan 9.
In this post I'll talk about writable /proc/pid/ns, and in a later
message, "rootless" post-kernel load booting. Everything mentioned is
available on sources now in contrib/mycroftiv. All of this software
receives testing and use on three native hardware Plan 9 systems and a
swarm of qemu VMs. mycroftiv/writeprocns contains all files relevant
to this post, modified versions of 3 kernel source files in
/sys/src/9/port.

Motivation: Per process namespaces are one of the glories of Plan 9.
Getting the most out of Plan 9, especially a grid of machines,
requires fine-grained control of namespace construction. There are
some occasional inconveniences caused by the fact that currently
running processes other than the shell do not have a consistent
mechanism for acquiring newly made mounts or binds. Plan 9 already has
a representation of process namespace available in /proc and processes
may freely modify their own namespace at runtime. Making /proc/pid/ns
act as a control interface to trigger modifications to the namespace
of a running process seems consistent with the design.

Writable /proc/pid/ns is simple in usage: you can perform arbitrary
namespace operations on running processes you own just by echoing the
standard command to that process's ns file. This can be used for
purposes such as bringing newly mounted services into the namespace of
your running plumber, or adding a mount underneath your running rio.
Example:

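9fs sources      # connect to sources; this also posts /srv/sources
ps | grep rio    # find the pid of a rio process (863 in this example)
echo 'mount /srv/sources /n/sources' >/proc/863/ns   # first rio proc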

Open new windows within rio and the sources mount is in place.
Standard bind and mount flags and spec and unmount are all supported,
but all mounts are done without an auth file descriptor. This is not
as much of a limitation as it might seem because any external mount
requiring auth can be made available locally non-authed via /srv - and
in the most common case of a 9fs connection to a fossil server, fossil
will accept non-authed mounts of a previously authed file descriptor.
Import takes a flag (-s srvname) to post a /srv which will not require
additional reauthentication.

In addition to adding in new bindings to running processes like rio,
plumber, dossrv, and exportfs, this mechanism is also fully network
transparent and useful when importing /proc from remote machines.
Rewriting the namespace of remote processes is a powerful mechanism
for fine-grained interactive control. Aux/lines can be used for
wholesale modifications to a namespace.

Implementation: simple conceptually. Writing a namespace operation to
the ns file in /proc produces the same sequence of actions as that
process itself issuing the equivalent syscall. The existing routines
in 9/port/sysfile.c and 9/port/chan.c are all written to operate on
'up', the current process - so I created near-identical versions of
the syscalls and channel operations which take a Proc *targp parameter
and address resources via targp-> rather than up->. This does create a
bit of inelegant duplication but has the advantage of leaving all the
existing namespace operation code paths untouched.
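
To make the duplication concrete, here is a toy model in Python, not the kernel C. Proc, Kernel, bind and targbind are hypothetical names, and the mount table is reduced to a dictionary. The only point is that the second routine repeats the first with an explicit target in place of the implicit current process.

```python
class Proc:
    """Toy stand-in for a kernel Proc: a pid and a simplified mount table."""

    def __init__(self, pid):
        self.pid = pid
        self.mtab = {}   # mount point -> what is bound there


class Kernel:
    def __init__(self, up):
        self.up = up     # models 'up', the process currently running

    def bind(self, new, old):
        """Existing style: always acts on the current process."""
        self.up.mtab[old] = new

    def targbind(self, targp, new, old):
        """Near-identical copy that acts on an explicit target instead."""
        targp.mtab[old] = new


shell = Proc(900)   # the process doing the write to /proc/863/ns
rio = Proc(863)     # the process whose namespace should change
k = Kernel(up=shell)
k.targbind(rio, "/srv/sources", "/n/sources")   # shell's own table is untouched
```

The original bind path is never touched by the new routine, which is the trade-off described above: some duplication in exchange for leaving existing code alone.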

I hope this approach is fundamentally sound, and I have attempted to
test it extensively on my local grid of native and virtual machines. I
have not found any bugs or inconsistencies, but given the importance
of chan.c I think this code would need additional review and testing
before use on production machines. I would like to submit an evolved
version of these patches to the main distribution after some review
and testing by more experienced plan 9 kernel programmers, because I
believe the functionality of modifying the ns of processes you own is
useful and the mechanism of simply writing the standard ns commands to
the ns file is clear and in harmony with the overall system.

I would like to also acknowledge the work done on "namespace
crossings" as described by
http://www.cs.cmu.edu/~412/history/2006F/nscross/ - this differs in
purpose and implementation but springs from somewhat similar
motivations. I haven't investigated the code but I'm sure it's more
sophisticated than my snarf+paste based approach!

All the modifications are to files in /sys/src/9/port, so bind -b
/n/sources/contrib/mycroftiv/writeprocns /sys/src/9/port and then
compile the kernel of your choice from within that namespace to test
without modifying your original kernel source. A console message is
printed for each ns command as it is initiated from within devproc.c -
these are not error messages. If they irritate you, comment them out
in the new procnsreq function at the end of the modified devproc.c.

mycroftiv@sphericalharmony.com
Ben Kidwell
9gridchan.org provides a variety of public plan 9 services
project channel: #plan9chan on irc.freenode.net for 9gridchan
questions, tech support, suggestions
also in #plan9 for general Plan 9 discussion
Thanks as always to all other Plan 9 authors, developers, maintainers
and community for the world's best OS and software




