From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id 45BCD7F615 for ; Mon, 19 Dec 2016 18:20:00 +0100 (CET) Authentication-Results: mail2-smtp-roc.national.inria.fr; spf=None smtp.pra=yotambarnoy@gmail.com; spf=Pass smtp.mailfrom=yotambarnoy@gmail.com; spf=None smtp.helo=postmaster@mail-yw0-f173.google.com Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of yotambarnoy@gmail.com) identity=pra; client-ip=209.85.161.173; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="yotambarnoy@gmail.com"; x-sender="yotambarnoy@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of yotambarnoy@gmail.com designates 209.85.161.173 as permitted sender) identity=mailfrom; client-ip=209.85.161.173; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="yotambarnoy@gmail.com"; x-sender="yotambarnoy@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-yw0-f173.google.com) identity=helo; client-ip=209.85.161.173; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="yotambarnoy@gmail.com"; x-sender="postmaster@mail-yw0-f173.google.com"; x-conformance=sidf_compatible IronPort-PHdr: =?us-ascii?q?9a23=3ALC2Fghdvv/vzTA3p5jgD2qvUlGMj4u6mDksu8pMi?= =?us-ascii?q?zoh2WeGdxc67bR7h7PlgxGXEQZ/co6odzbGH6Oa/BidZuc7J8ChbNscTB1ld0Y?= =?us-ascii?q?RetjdjKfDGIHWzFOTtYS0+EZYKf35e1Fb/D3JoHt3jbUbZuHy44G1aMBz+MQ1o?= =?us-ascii?q?Ora9QdaK3Iyfntq/8JzLYghOmCH1IfYrdE33/k3tsZw4mwpuq7wwwVPjpWZSM7?= =?us-ascii?q?BY325kKEiSlFD24dqq1Jpq8C1asvRn8cNcB/bUZaM9GI1fED0je0o8/svspFGX?= =?us-ascii?q?XAyT734WW38QlQtgDA3M7RW8VZD05Hip/tFh0TWXaJWlBYs/Xi6vuuIyEEfl?= X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: =?us-ascii?q?A0A5AACjFVhYhq2hVdFDGhoBAQEBAgEBA?= =?us-ascii?q?QEIAQEBARUBAQEBAgEBAQEIAQEBAYMMAQEBAQF5gQYHgn6KSpZZiGaMKIIKKoV?= =?us-ascii?q?4AoF7Bz8UAQEBAQEBAQEBAQESAQEBCAsLCR0wgjMEARUBBIIWAQEBAgEBDBcdA?= =?us-ascii?q?QgMBx0BAwwGAwILDQICJgICIgERAQUBDgENBhMJEog1AQMPCA4uiXiRCz+MAoI?= =?us-ascii?q?EBQEegw0Fg1IKGScNVIJ8AQEBAQEFAQEBAQEBAQEYAgYSeYUrg1GBCIMDgT0Mg?= =?us-ascii?q?niCXQWIYoccinKGUoMSg2GDb5BNjhmCSRQegRQfgTlOE4M7IIIGIDQBhkwqRIF?= =?us-ascii?q?PAQEB?= X-IPAS-Result: =?us-ascii?q?A0A5AACjFVhYhq2hVdFDGhoBAQEBAgEBAQEIAQEBARUBAQE?= =?us-ascii?q?BAgEBAQEIAQEBAYMMAQEBAQF5gQYHgn6KSpZZiGaMKIIKKoV4AoF7Bz8UAQEBA?= =?us-ascii?q?QEBAQEBAQESAQEBCAsLCR0wgjMEARUBBIIWAQEBAgEBDBcdAQgMBx0BAwwGAwI?= =?us-ascii?q?LDQICJgICIgERAQUBDgENBhMJEog1AQMPCA4uiXiRCz+MAoIEBQEegw0Fg1IKG?= =?us-ascii?q?ScNVIJ8AQEBAQEFAQEBAQEBAQEYAgYSeYUrg1GBCIMDgT0MgniCXQWIYoccinK?= =?us-ascii?q?GUoMSg2GDb5BNjhmCSRQegRQfgTlOE4M7IIIGIDQBhkwqRIFPAQEB?= X-IronPort-AV: E=Sophos;i="5.33,374,1477954800"; d="scan'208";a="250733568" Received: from mail-yw0-f173.google.com ([209.85.161.173]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/AES128-GCM-SHA256; 19 Dec 2016 18:19:59 +0100 Received: by mail-yw0-f173.google.com with SMTP id i145so69271059ywg.2 for ; Mon, 19 Dec 2016 09:19:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=Ty2CZe+RzbNu51k1xoGgUatXJxJ+vYphwNk++YMkWwE=; b=lqqk6ZiS+V/R+PLQoMA8TUpuP0+WXBivDabTdt4z0ka5VNzVDkl7mHYUtsOkOKuKwn Au05xVVveQevxNeXk+p7/BtAv7t684y8sKIUkaMy82RhHDGgP3oXnJXn9klrvN8rdkI6 X5QxwDuWylvaEEeBUUrTwxiso7YsCWWWpByd8mMf57osKkrLZeu5EVIne7A1QamcCtv/ U2hjGE5+UIkG9jXSCmeGhBRcH5LQz2Dlwhb0s84nyvt6jyUarCEaCej0oBcboNSnluRq 0AcVaacSOmAJpjf1/5JQTeDi9ctuV/5p/L3SDAvh5IwtN0dSAWMsoae/QmsFB8aUl4nZ Okww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=Ty2CZe+RzbNu51k1xoGgUatXJxJ+vYphwNk++YMkWwE=; b=VkVEPboKFY8ipjaor1EtJuZ5Zchrobf9QiDCK6pnPODRGfLZ3xzfHoW2oytrJjei2Q o/xjhTTFBZEwENNDYe0q/AZWUou8/WG7YfFeUxpFEqCxMl86eEKud1z0hjkO0eOKvMQW uD1qTqambHOlddiDj+BUkX88vPH4P1NoPc70DN6SC02R/A3fEUurgpc3JTpTpafFqgWN zD1IVAG+YRpLCsILgMjcFS5uwleHsP+uMfi1cnn3z8lPH9TLrtYkWAC2PBG9wFeQimXz vmjozBwPuayX/pf0KYcIJxNjifaNGusXJHXOoMVzc8hZ/RtUmE/Yf2aA7faFR3a+p8cL D8wg== X-Gm-Message-State: AKaTC02RYqttCALH1FIkIpUUk3Ws1NCVU4UrCV84vbszlWD/mPN+WzHiXBpa0l3iHEs9saP/ImQnW17e5PmrtA== X-Received: by 10.129.73.65 with SMTP id w62mr13625706ywa.80.1482167997958; Mon, 19 Dec 2016 09:19:57 -0800 (PST) MIME-Version: 1.0 Received: by 10.37.18.86 with HTTP; Mon, 19 Dec 2016 09:19:37 -0800 (PST) In-Reply-To: References: <7bc766a2-d460-524b-35ca-89609a34b719@tu-berlin.de> <1482148297.4629.19.camel@gerd-stolpmann.de> <0F7D3B1B3C4B894D824F5B822E3E5A172CFB9581@IRSMSX102.ger.corp.intel.com> <1482165686.4629.28.camel@gerd-stolpmann.de> From: Yotam Barnoy Date: Mon, 19 Dec 2016 12:19:37 -0500 Message-ID: To: =?UTF-8?B?RnLDqWTDqXJpYyBCb3Vy?= Cc: Gerd Stolpmann , "Soegtrop, Michael" , =?UTF-8?Q?Christoph_H=C3=B6ger?= , "caml-list@inria.fr" Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [Caml-list] Closing the performance gap to C On Mon, Dec 19, 2016 at 12:09 PM, Fr=C3=A9d=C3=A9ric Bour wrote: > The problem observed is not because of spilling but simply because of flo= at > boxing and compilation of recursive calls. > The loop seems to compile down to an efficient code ended by a jump, but > float unboxing is done in a much earlier pass in the compiler (cmm). > > Passing -dcmm to the compiler, we can see that before the recursive call = to > loop the float is boxed again. > At this point, it is just a regular ocaml function call, taking boxed flo= at. > > A simpler code to observe this behavior: > > let rec test f =3D > test (f +. 1.0) > > let () =3D test 0.0 > > will box at every iteration. Yes, this is a current weak point of OCaml compilation, which I've reported here (https://caml.inria.fr/mantis/view.php?id=3D7289). The only real solution to this currently is to improve Flambda to the point that unboxing is done at the Flambda level rather than at cmmgen. Unboxing decisions are currently extremely local, which isn't good enough for recursive functions, which cannot be inlined. Currently, the only solution to this is to use a. for/while loops or b. a global mutable value instead of a parameter. -Yotam > > Le 19 d=C3=A9c. 2016 =C3=A0 17:41, Gerd Stolpmann a =C3=A9crit : > > Michael, > > look here, it's the "definitive source": > https://github.com/ocaml/ocaml/blob/trunk/asmcomp/amd64/proc.ml > > Windows is in deed different. I don't have enough insight into the > register spilling algorithm to say whether this has a significant > effect, though. OCaml code never keeps registers alive, and thus I > would bet the spilling algorithm is designed for that, and it is > probably not tried to move values preferably to xmm6-15 in order to > avoid spilling during C calls. But that's just a hypothesis... Does > somebody know? > > Gerd > > > Am Montag, den 19.12.2016, 14:52 +0000 schrieb Soegtrop, Michael: > > Dear Gerd, > > > When you call a C function like cos it is likely that this > happens because the C calling conventions do not preserve the FP > registers > (xmm*). This could be improved if the OCaml compiler tried > alternate places for > temporarily storing FP values: > > For Windows this doesn't seem to be true. See e.g.: > > https://msdn.microsoft.com/en-us/library/ms235286.aspx > > which states that XMM0..XMM5 are volatile, while XMM6..XMM15 must be > preserved. > > I think for Linix you are right. I couldn't find a better reference > than Wikipedia: > > https://en.wikipedia.org/wiki/X86_calling_conventions > > see "System V AMD64 ABI" there. > > This reference contains a good overview, which matches the above data > in table 4: > > http://www.agner.org/optimize/calling_conventions.pdf > > So on Windows, there is for sure no need to save XMM6..XMM15, while > on Linux this seems to be an issue. > > Best regards, > > Michael > Intel Deutschland GmbH > Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany > Tel: +49 89 99 8853-0, www.intel.de > Managing Directors: Christin Eisenschmid, Christian Lamprechter > Chairperson of the Supervisory Board: Nicole Lau > Registered Office: Munich > Commercial Register: Amtsgericht Muenchen HRB 186928 > > -- > ------------------------------------------------------------ > Gerd Stolpmann, Darmstadt, Germany gerd@gerd-stolpmann.de > My OCaml site: http://www.camlcity.org > Contact details: http://www.camlcity.org/contact.html > Company homepage: http://www.gerd-stolpmann.de > ------------------------------------------------------------ > >