From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id 6DDCBBC57 for ; Wed, 17 Nov 2010 12:34:36 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Aq0DAP9L40xQW+UMgWdsb2JhbACiQRUBARYiIsBMhUsEilg X-IronPort-AV: E=Sophos;i="4.59,210,1288566000"; d="scan'208";a="78765630" Received: from lo.gmane.org ([80.91.229.12]) by mail4-smtp-sop.national.inria.fr with ESMTP; 17 Nov 2010 12:34:36 +0100 Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1PIgHZ-0005ES-FO for caml-list@inria.fr; Wed, 17 Nov 2010 12:34:33 +0100 Received: from avelizy-155-1-94-54.w90-35.abo.wanadoo.fr ([90.35.89.54]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 17 Nov 2010 12:34:33 +0100 Received: from sylvain by avelizy-155-1-94-54.w90-35.abo.wanadoo.fr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 17 Nov 2010 12:34:33 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: caml-list@inria.fr From: Sylvain Le Gall Subject: Re: SMP multithreading Date: Wed, 17 Nov 2010 11:34:19 +0000 (UTC) Message-ID: References: <20101115182737.42b8dcae@loki.yggdrasil.draxit.de> <8762vwdx09.fsf@frosties.localnet> X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: avelizy-155-1-94-54.w90-35.abo.wanadoo.fr User-Agent: slrn/pre1.0.0-18 (Linux) X-Spam: no; 0.00; le-gall:01 le-gall:01 speedups:01 ocaml:01 runtime:01 speedup:01 speedup:01 ocaml:01 cameleon:01 maillist:98 acb:98 15.:98 1.5:98 utilize:98 threads:01 On 17-11-2010, Goswin von Brederlow wrote: > Sylvain Le Gall writes: > >> Hi, >> >> On 15-11-2010, Wolfgang Draxinger wrote: >>> Hi, >>> >>> I've just read >>> http://caml.inria.fr/pub/ml-archives/caml-list/2002/11/64c14acb90cb14bedb2cacb73338fb15.en.html >>> in particular this paragraph: >>>| What about hyperthreading? Well, I believe it's the last convulsive >>>| movement of SMP's corpse :-) We'll see how it goes market-wise. At >>>| any rate, the speedups announced for hyperthreading in the Pentium 4 >>>| are below a factor of 1.5; probably not enough to offset the overhead >>>| of making the OCaml runtime system thread-safe. >>> >>> This reads just like the "640k ought be enough for everyone". Multicore >>> systems are the standard today. Even the cheapest consumer machines >>> come with at least two cores. Once can easily get 6 core machines today. >>> >>> Still thinking SMP was a niche and was dying? >>> >> >> Hyperthreading was never remarkable about performance or whatever and is >> probably not pure SMP (emulated SMP maybe?). > > Hyperthreading is a hack to better utilize idle cpu sub units. The CPU > has multiple complete sets of registers, one per hyper thread. Execution > of the threads is interleaved. Now when one thread is doing some > floating point operation the cpu switches over to another thread and > lets it do some integer aritmetic. But that assumes the threads are > using different sub units. If they are using the same unit then they > just block each other and no speedup occurs. > > The speedup of hyperthreading is purely from avoiding dead cycles when > one thread waits for something. On te other hand the cache is shared > between threads so per thread it is smaller and more easily > trashed. Hyperthreading can be much slower too. > Indeed, the HT extension was designed to reduce pipeline bubbles, which most of the time occurs when you need to load data from a slow memory (slow = RAM as opposed to L1/L2 cache). In the old time of my P4, ocaml was performing quite well on the processor. One story about it: while compiling cameleon on it, I often get into "thermal warning" (the CPU was overheating). I think it could have been related to the fact the CPU idle level was very low (e.g. no pipeline bubble). I always thought that this was related to the fact the minor heap can be stored inside the cache and that reduces the hit/miss factor (i.e. avoid fetching data in RAM). I have never really tested this hypothesis. Maybe you can tell me your opinion about this? Regards, Sylvain Le Gall