Date: Mon, 12 Apr 2004 21:15:36 -0500
From: John Goerzen
To: Brian Hurt
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Threading: Using and Building
Message-ID: <20040413021536.GA969@complete.org>
References: <20040412151948.GA11370@excelhustler.com>

On Mon, Apr 12, 2004 at 07:37:09PM -0500, Brian Hurt wrote:
> The threading is all user-space threading. It doesn't take advantage of
> multiple CPUs, doesn't use system threads, if one thread blocks they all
> block, etc.

Which explains the existence of the separate Unix library for threading;
I'm assuming that this library uses non-blocking fd's...

> There are two problems with multithreading. First, it makes the GC more
> difficult and more costly. Currently, the GC runs in the same system
> thread as everything else, and thus it doesn't have synchronization
> issues. In a multi-threaded environment, you have synchronization issues
> which slow things down. Second, most people don't know how to write
> safe, correct, efficient multithreaded programs. It's harder than it
> looks. I think something like MPI between separate processes would be a
> better way to take advantage of multi-processors.

I know little about the GC issue, save that two other garbage-collected
languages I have used (Java and Python) are able to work around it.
Python does have a global interpreter lock that protects sections of
Python code in certain situations, I believe. However, Python uses true
threads, so it is fine to make blocking system calls in threads (they
will not block the entire program).

You are correct that there are gotchas with multithreaded programming.
However, there are some significant advantages. One is that there is no
need to establish lines of communication between one process and
another -- that saves a significant amount of work right there.
It's also pretty easy in most systems to fire up a thread to run a
function and check for (or be notified about) its result later. OCaml's
design, minimizing the need to update variables, seems to lend itself
nicely to threadsafe programming.

I think that, overall, the fact that "some people can't do x correctly"
is not a reason to make x unsupported in a language. After all, we've
seen programs all over that have buggy calls to open(2) -- especially
those writing in /tmp. That doesn't mean we have to turn off open(2) on
our systems :-)

The alternatives are not necessarily easier anyway. Forking and
communicating over pipes involves the development of a streaming
protocol and a way to send data across the pipe and decode it at the
other end; existing objects cannot just be reused. Synchronization can
still be an issue, especially regarding open file descriptors, user
interfaces, and disk files. Deadlock can occur on pipes for any number
of reasons as well.

> On Unix, you don't lose much doing multi-process (forks) instead of
> threads. Switching between processes in Unix isn't any slower than
> switching between threads within a process. But I've seen benchmarks

My primary concern here lies not with the performance difference between
forking and threads in the general case, but rather with the design
differences. For certain tasks, threading is just easier. And it looks
like OCaml has some serious limitations on threading.

-- John

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners