From: Zvi Gilboa <zg7s@eservices.virginia.edu>
To: <musl@lists.openwall.com>
Subject: Re: question: hard-coded file descriptors in stdin/stdout/stderr
Date: Thu, 14 Mar 2013 15:34:02 -0400
Message-ID: <5142262A.4060606@eservices.virginia.edu>
In-Reply-To: <20130314181759.GC19010@port70.net>
>> how do you implement fork?
My main objective was to be efficient and avoid "ugliness," that is, to
eliminate all uncertainty on the parent's side (and obviously also
the child's), and accordingly ensure that the state following each step
in the implementation is exactly what I expect it to be. If you are
familiar with the Cygwin implementation, then this must ring a bell with
respect to LoadLibrary (kernel32.dll) and the fact that it does not allow
you to specify the library's base address.
It is important to note that one main problem with the Cygwin
implementation (which I believe has been acknowledged by their
developers as well) is that it does not "disable" the loading of
hard-linked (compile-time linked) libraries, and thus has no control
over the base addresses of these modules. This forces the parent to
check "after the fact" whether the base addresses of modules in the
parent and child agree, and accordingly "die" if they don't. With
respect to dynamically-linked libraries, the Cygwin fork() tries very
hard to "help" Windows choose the right base address, yet with no
guarantee as to the outcome of that effort (so more uncertainty and
after-the-fact checking). For these and several other reasons, I have
taken a rather different approach.
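To make the "after the fact" check concrete, here is a minimal sketch
in portable C. It is not Cygwin's code: the struct sk_module record and
the sk_first_mismatch() helper are hypothetical names made up for this
illustration, and the module lists are assumed to be plain arrays in
load order rather than anything read from a real process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: one loaded module and the address it is mapped at. */
struct sk_module {
    const char *name;
    uintptr_t   base;
};

/*
 * Compare the parent's and the child's module lists entry by entry.
 * Returns the index of the first entry whose name or base address
 * differs, or -1 if the two lists agree. Both lists are assumed to
 * hold n entries in the same (load) order.
 */
static int sk_first_mismatch(const struct sk_module *parent,
                             const struct sk_module *child, size_t n)
{
    assert(parent != NULL && child != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(parent[i].name, child[i].name) != 0 ||
            parent[i].base != child[i].base)
            return (int)i;
    }
    return -1;
}
```

A fork() built on this kind of check can only detect the mismatch and
give up; it cannot prevent it, which is the limitation described above.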
Following that very long introduction, here is an outline of what I
do... quite a few steps, yet each of them is extremely fast and fully
deterministic (note that I'm only using the Native API, so "create
process" refers to NtCreateProcess in ntdll.dll, not to the various APIs
in kernel32.dll)
0) normalize the heap and perform some basic sanity checks
1) "backup" the parent's import table, mark all relevant handles as
inheritable
2) nullify the parent's import table, then obtain handle to its
executable section (ZwOpenSection)
3) create process with NO console attached --> using the above section,
then adjust necessary settings (process group [aka "job"], etc.)
4) clone process sections (enumerate the sections and map them to the
child; efficient crc32 checks then verify synchronization, which lets us
take advantage of the speed of mapping and use ZwWriteVirtualMemory
only where necessary)
--> the main advantage of using ZwOpenSection and ZwMapViewOfSection
(which lies at the heart of the whole thing) is that it allows us to
specify the section's (aka module's, dll's) base address.
5) clone process thread with new EIP (&child)
--> this clones the current thread's stack and sets the EIP in the child
to &child
6) resume thread:
--> the child is in child(), waiting for an event to be signaled (to be
clear, "signaled" here is in the NT sense, not a POSIX signal)
--> since child has resumed, we now have access to all features of
PEB_LDR_DATA
--> nonetheless, the child is waiting for us, and could wait forever --
until we either signal the event, or terminate it
--> which guarantees accurate knowledge of the parent about the state of
the child process
7) update child's PEB_LDR_DATA, and gracefully "restore" its import table
--> so that GetModuleHandle returns a valid module handle
--> so that resources can be retrieved
--> and so that DllMain gets called whenever a new thread is created
* the mapped sections (modules) are already there, so there is no need
to call LdrLoadDll :)
* as long as the parent and child have the same loaded modules, no true
knowledge of what LdrLoadDll does is necessary. This is true since
the entire PEB_LDR_DATA resides in the process's address space.
* in other words: we do not implement LdrLoadDll, but rather ensure
that at the end of the day, there is no difference between what we
did and what LdrLoadDll would have done...
8) clone process heap (not for the faint of heart)
9) re-attach to console (if needed)
10) signal event so that the child process continues to execute
11) child: set EAX to 0, restore esp, ret --> so that fork() returns zero
12) parent: restore the import table (this is not needed in and of
itself, but better safe than sorry)
13) parent: return child_pid as required by fork()
AND THAT'S IT! :) As said above, many steps, yet all are extremely fast
and fully deterministic.
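The verify-then-patch idea from step 4 (map first, checksum, and write
only where the contents differ) can be sketched in portable C. This is
an illustration under stated assumptions, not the psxcalls code: both
"address spaces" are ordinary buffers, crc32_buf() and sync_chunks()
are hypothetical helpers, and the memcpy stands in for a write into the
child's memory. Note that equal checksums make equal contents very
likely but do not prove them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); table-free, so slow. */
static uint32_t crc32_buf(const unsigned char *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/*
 * Hypothetical helper: bring dst in line with src, chunk by chunk.
 * A chunk is copied only when its checksum differs from the source's.
 * Returns the number of chunks that had to be copied.
 */
static size_t sync_chunks(unsigned char *dst, const unsigned char *src,
                          size_t len, size_t chunk)
{
    size_t copied = 0;
    assert(chunk > 0);
    for (size_t off = 0; off < len; off += chunk) {
        size_t n = (len - off < chunk) ? (len - off) : chunk;
        if (crc32_buf(dst + off, n) != crc32_buf(src + off, n)) {
            memcpy(dst + off, src + off, n); /* stand-in for a remote write */
            copied++;
        }
    }
    return copied;
}
```

When the mapped views already agree, sync_chunks() returns 0 and no
write is issued at all, which is where the speed advantage comes from.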
>> so you need to do hacks if you want posix
>> on windows
YES, indeed. Although I'd rather call them "translations" :) But again,
it would be nice if I didn't have to worry about hard-coded
numbers. Correct use of pipes is possible as well. It is actually
quite easy to duplicate handles on Windows, as well as to connect
multiple processes to the same console and/or console handles. One
simply needs to experiment with the low-level functions (for instance
WriteConsoleInput, which in fact has some basic documentation on MSDN).
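One way to picture such a "translation" is a small per-process table
that maps POSIX descriptor numbers to native identifiers. The sketch
below is only an illustration and not the psxcalls design: the SK_*
names, sk_native() and sk_dup2() are hypothetical, the -10/-11/-12
values are the Windows standard-device identifiers defined locally so
the code builds without <windows.h>, and a real dup2 would also have to
duplicate the underlying handle and release whatever newfd referred to.

```c
#include <assert.h>
#include <stddef.h>

/* Windows standard-device identifiers, defined locally for the sketch. */
enum { SK_STD_INPUT = -10, SK_STD_OUTPUT = -11, SK_STD_ERROR = -12 };
enum { SK_MAX_FD = 16 };
#define SK_CLOSED 0L /* marks an unused slot in the table */

/* Descriptors 0, 1, 2 start out bound to stdin, stdout, stderr. */
static long sk_fd_table[SK_MAX_FD] = {
    SK_STD_INPUT, SK_STD_OUTPUT, SK_STD_ERROR
};

/* Translate a POSIX descriptor; SK_CLOSED if unused or out of range. */
static long sk_native(int fd)
{
    if (fd < 0 || fd >= SK_MAX_FD)
        return SK_CLOSED;
    return sk_fd_table[fd];
}

/* Table-level dup2: make newfd refer to whatever oldfd refers to. */
static int sk_dup2(int oldfd, int newfd)
{
    if (sk_native(oldfd) == SK_CLOSED || newfd < 0 || newfd >= SK_MAX_FD)
        return -1;
    sk_fd_table[newfd] = sk_fd_table[oldfd];
    assert(sk_native(newfd) == sk_native(oldfd));
    return newfd;
}
```

With such a table, the libc side keeps using 0, 1 and 2, and only the
lookup layer needs to know the native values.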
So, all in all, does that mean my suggestion has been accepted?
On 03/14/2013 02:17 PM, Szabolcs Nagy wrote:
> * Zvi Gilboa <zg7s@eservices.virginia.edu> [2013-03-14 13:51:19 -0400]:
>> ... since you are asking... inspired by musl-libc, I am currently
>> writing a win32/win64 open-source library that implements/provides
>> POSIX system calls (see note below). I believe that having a
>> powerful libc with an MIT license available on Windows would
>> actually be of great value to the open source community for all
> ok
>
>> The main issue here is that the standard file descriptors on Windows
>> are -10 (STD_INPUT_HANDLE), -11 (STD_OUTPUT_HANDLE), and -12
> ouch
>
> i think windows file handles have very
> different semantics than posix fds
> (dup2, fork,..)
>
> so you need to do hacks if you want posix
> on windows
>
>> * as for psxcalls: this is a C library, that exclusively uses the
>> Native API. It attaches to ntdll.dll during run-time, and can thus
>> be compiled as a "native" static library with no external
>> dependencies. While it is currently in its very initial stage (and
>> not yet online), the major "obstacle" features -- including fork()
>> -- have already been implemented. To remove all doubts, I am aware
>> of Cygwin's existence, yet am looking for a high-performance
>> solution that would be both "clean"
>> (psxcalls-->libc-->user-library-or-application), and flexibly
>> licensed.
> how do you implement fork?
>
> (windows version of some tools actually allow redirecting
> io of child processes i wonder how your fork would interact
> with that, eg gawk can read/write to "/dev/fd/n" so what
> would happen if you fork such a child, can you communicate
> with it with pipes?)
>
> (at some point there was discussion about porting musl to
> various systems including windows and the conclusion was
> that it should be done on the syscall layer and fork would
> be ugly and slow no matter what you do)