To: Yotam Barnoy, Gerd Stolpmann
Cc: Frédéric Bour, "Soegtrop, Michael", Christoph Höger, "caml-list@inria.fr"
From: Alain Frisch
Date: Wed, 21 Dec 2016 18:35:42 +0100
Subject: Re: [Caml-list] Closing the performance gap to C

On 21/12/2016 17:39, Yotam Barnoy wrote:
> As far as I understand, the current calling convention has to do with
> generic functions.

Indeed. If you write:

    let f x = x +. 1.
    let g y = f (y +. 1.)

you cannot simply choose a specific calling convention better suited to
fp arguments/results, because the same function can also be used as:

    let h l = List.map f l

Currently, the code would be compiled as (pseudo code with explicit
boxing/unboxing, and explicit function symbols and closures):

    let f$direct x = box (unbox x +. 1.)
    let f = closure(f$direct)

    let g$direct y = f$direct (box (unbox y +. 1.))
    let g = closure(g$direct)

    let h l = List.map f l

> And Alain, even if we create these alternate calling conventions, the
> tricky part is still detecting when it's ok to call using direct
> calling conventions. That's the hard part, and it's best done by
> Flambda with its inlining.

The point is that it's not really hard: there is already some support in
the compiler to detect direct calls and apply a specific calling
convention (cf Clambda.Udirect_apply vs Clambda.Ugeneric_apply). It is
"just" that this specific calling convention does not do any
specialization (the same function body is used for the direct calling
convention and the generic one).

The suggested approach would compile the code above as (assuming no
inlining of f's or g's body):

    let f$unboxed x = x +. 1.
    let f$direct x = box (f$unboxed (unbox x))
    let f = closure(f$direct)

    let g$unboxed y = unbox (f$direct (box (y +. 1.)))
                    = unbox (box (f$unboxed (unbox (box (y +. 1.)))))
                      (* short-circuit the wrapper *)
                    = f$unboxed (y +. 1.)
                      (* collapse unbox(box(..)) -> .. *)
    let g$direct y = box (g$unboxed (unbox y))
    let g = closure(g$direct)

    let h l = List.map f l

The fact that g can be compiled to a direct call to f (and not to an
arbitrary function) is already known to the compiler. And at the level
of cmm, one can already describe functions that take unboxed floats (cf
Cmm.machtype, and the fun_args field in Cmm.fundecl). What is missing is
mostly just the plumbing of type information to know that f and g should
be compiled as above (i.e. keep track of the types of their arguments
and results), and arranging for direct calls to short-circuit the boxing
wrapper.

Now of course there are cases where this strategy would degrade
performance: imagine a function taking a float, but using it most often
as a boxed float (e.g. passing it to Printf.printf "%f"); all call sites
would still pass an unboxed float (even if the float naturally comes in
already boxed form) and that function would need to box the float again
(possibly several times). This is exactly the same story as for the
current unboxing strategy (described in
http://www.lexifi.com/blog/unboxed-floats-ocaml ), where the compiler
always decides to keep a local "float ref" variable in unboxed form,
sometimes leading to extra boxing. But in practice, this works well and
lets low-level numerical code run with no boxing at all except at
data-structure boundaries and function calls (for now).

Alain
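
P.S. The wrapper scheme above can be modelled in plain OCaml. This is only a rough sketch, not what the compiler emits: the `boxed` type, the `boxes` counter and `count_boxes` are hypothetical stand-ins for heap-allocated floats, and names such as `f_unboxed` replace the `f$unboxed` symbols of the pseudo code (`$` is not valid in OCaml identifiers). It compares how many simulated allocations a call to g costs under the current and the suggested scheme.

```ocaml
(* Hypothetical model of a heap-allocated float; [boxes] counts every
   simulated allocation performed through [box]. *)
type boxed = Box of float

let boxes = ref 0
let box (x : float) : boxed = incr boxes; Box x
let unbox (Box x : boxed) : float = x

(* Specialized body working on raw floats. *)
let f_unboxed x = x +. 1.

(* Generic entry point: a thin boxing wrapper around the specialized body. *)
let f_direct b = box (f_unboxed (unbox b))

(* Current scheme: g calls f through its boxed entry point. *)
let g_current b = f_direct (box (unbox b +. 1.))

(* Suggested scheme: the direct call short-circuits f's wrapper, so the
   intermediate box/unbox pair disappears. *)
let g_unboxed y = f_unboxed (y +. 1.)
let g_direct b = box (g_unboxed (unbox b))

(* Generic use: List.map still goes through the boxed entry point. *)
let h l = List.map f_direct l

(* Run [thunk] and report how many boxes it allocated. *)
let count_boxes thunk =
  boxes := 0;
  let result = thunk () in
  (result, !boxes)

let () =
  let arg = Box 1. in
  let (r1, n1) = count_boxes (fun () -> unbox (g_current arg)) in
  let (r2, n2) = count_boxes (fun () -> unbox (g_direct arg)) in
  Printf.printf "current: %.1f (%d boxes); suggested: %.1f (%d boxes)\n"
    r1 n1 r2 n2
```

In this model both versions compute the same result, but g_current boxes twice (once for the argument passed to f, once for f's result) while g_direct boxes only once, at its own boundary.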