From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by yquem.inria.fr (Postfix) with ESMTP id 16B45BB84 for ; Sat, 19 Jul 2008 14:41:24 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AtsCALt9gUjAbSoIiGdsb2JhbACSPgEBAQ8gmyI X-IronPort-AV: E=Sophos;i="4.31,215,1215381600"; d="scan'208";a="13270970" Received: from einhorn.in-berlin.de ([192.109.42.8]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 19 Jul 2008 14:41:23 +0200 X-Envelope-From: oliver@first.in-berlin.de Received: from einhorn.in-berlin.de (localhost [127.0.0.1]) by einhorn.in-berlin.de (8.13.6/8.13.6/Debian-1) with ESMTP id m6JCfItn012241 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Sat, 19 Jul 2008 14:41:19 +0200 Received: (from www-data@localhost) by einhorn.in-berlin.de (8.13.6/8.13.6/Submit) id m6JCfI9o012239; Sat, 19 Jul 2008 14:41:18 +0200 X-Authentication-Warning: einhorn.in-berlin.de: www-data set sender to oliver@first.in-berlin.de using -f Received: from dslb-088-073-068-085.pools.arcor-ip.net (dslb-088-073-068-085.pools.arcor-ip.net [88.73.68.85]) by webmail.in-berlin.de (IMP) with HTTP for ; Sat, 19 Jul 2008 14:41:18 +0200 Message-ID: <1216471278.4881e0ee770e7@webmail.in-berlin.de> Date: Sat, 19 Jul 2008 14:41:18 +0200 From: Oliver Bandel To: Kuba Ober Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] thousands of CPU cores References: <200807100944.29221.peng.zang@gmail.com> <200807101500.03079.jon@ffconsultancy.com> <200807151039.51519.ober.14@osu.edu> In-Reply-To: <200807151039.51519.ober.14@osu.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.2.6 X-Scanned-By: MIMEDefang_at_IN-Berlin_e.V. on 192.109.42.8 X-Spam: no; 0.00; bandel:01 in-berlin:01 byte:01 wikipedia:01 wiki:01 speedup:01 amortize:01 byte:01 mmaped:01 mmaped:01 cheers:01 beginner's:01 ocaml:01 bug:01 sram:98 Zitat von Kuba Ober : [...] > If you count the "efficiency" of such out-of-the-blue uncached truly > random > access in terms of clock cycles, current hardware may be 1-2 orders > of > magnitude less efficient than almost any 8-bit microcontroller out > there... > On most MCUs you can read a random byte out of the SRAM in say 1-4 > clock > cycles. On your commonplace modern multicore CPU, it may take a > hundred clock > cycles to do the same, and essentially the same amount of time in > terms of the > wall clock (a 2GHz CPU has only 100 times faster clock than a run of > the mill > 20MHz MCU). [...] Given a RAM with a certain clock frequency, when the Microcontroller works at the same frequency as the RAM, and a CPU of a typical computer uses a much higher frequency, it's normal that the CPU has to wait longer. That's thze reasion why cache-RAM is used on more then one level. But there are also processors, that can lookup RAM while at the same time working on instructions that were fetched before. Some DSPs are really fast... for example Analog Devices' TigerShark can do between one and four operations in one clock cycle (on average two instructions per cycle). And itÄs a single-core DSP. When it runs at 600 MHz, it can do 32-Bit Floating-Point operations with 3.6 GFlops. http://www.analog.com/en/embedded-processing-dsp/tigersharc/content/tigersharc_benchmarks/fca.html OK, this is a very specific processor and comparing it with CPU's of the computers we are using today, is maybe a littlebid inproper. But I just wanted to say: a good design can do a lot things possible, which may not be used in many CPUs. For example the TigerShark also has links to other processors and therefore can be used in multi-processor systems. http://www.analog.com/en/embedded-processing-dsp/tigersharc/content/tigersharc_processor__architectural_features/fca.html http://www.analog.com/en/embedded-processing-dsp/tigersharc/content/tigersharc_architectural_backgrounder/fca.html And the idea of Links between processors was used in the 1980's by T-400 and T-800 from INMOS: http://en.wikipedia.org/wiki/Transputer It seems, they were too far ahead to be commercially successful. Ciao, Oliver P.S.: Remember the Altivec unit of G4-processor, for example... ...they also gave good speedup and Math-speed. So, a good CPU-design can give advanatges in speed. > > What I'm trying to say is that such random, small memory accesses > highlight > the inherent message passing / transactional overhead of the hardware > implementation. Those overheads amortize when you run real number > tasks, > not a made-up cold single byte access of course. But they are there. > > It's akin to mmaped file: you can use CPU's MMU to implement it in > the > usual OS/stock hardware framework, or you can have an FPGA handle > memory > transactions and talk directly to the hard drive. It doesn't change > the > fact that it's still a mmaped file :) > > Cheers, Kuba > > _______________________________________________ > Caml-list mailing list. Subscription management: > http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list > Archives: http://caml.inria.fr > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs >