From: Szabolcs Nagy
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Model specific optimizations?
Date: Thu, 29 Sep 2016 16:57:51 +0200
Message-ID: <20160929145751.GH1280@port70.net>
References: <20160929142126.GB22343@voyager>
In-Reply-To: <20160929142126.GB22343@voyager>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

* Markus Wichmann [2016-09-29 16:21:26 +0200]:
> I wanted to ask if there is any wish for the near future to support
> model-specific optimizations. What I mean by that is multiple
> implementations of the same function, where the best implementation is
> decided at run-time.

musl already does some runtime selection based on hw/kernel
features (arm atomics, vdso).
it could use similar approaches for micro-architecture specific
optimizations.

this has a maintenance cost (hard to test, hard to benchmark),
code size cost (all variants have to be present at runtime) and
dispatch cost (it has to happen at startup or lazily).
these costs are rarely justified.

(there are secondary effects: glibc dispatches memcpy at runtime
so the compilers have a hard time deciding when to inline it,
as a consequence sometimes -O0 gives better performance than -O3
on x86 with glibc.)

> One simple example would be PowerPC's fsqrt instruction. The PowerPC
> Book 1 defines it as optional and provides no way to know specifically,
> if the currently running processor supports this instruction besides
> executing it and seeing if you get a SIGILL.

if there is no linux hwcap bit for this then we can't do much
about it.

runtime dispatch only works if there is a reasonable way to
detect hw features (hwcap, cpuid instruction, vdso something).
e.g. parsing /proc/cpuinfo to figure out the cpu and guessing
features from that, or registering sigill signal handlers, is
not ok.

> Then organization: Are we going the glibc route, which gathers all
> indirect functions in a single section and initializes all of the
> pointers at startup (__libc_init_main()), or do we put these checks
> separately in each function?

glibc uses ifunc for this, musl does not support ifunc at
this point.
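
to make the hwcap detection and the lazy dispatch cost mentioned
above concrete, here is a minimal sketch in plain c without ifunc.
it is not musl code: the names copy_bytes, copy_generic, copy_wide,
select_copy and the bit HYPOTHETICAL_HWCAP_WIDE_COPY are made up for
illustration, real hwcap bits are architecture specific and come
from the kernel headers. only getauxval(AT_HWCAP) is an existing
linux interface here.

```c
#include <stddef.h>
#include <string.h>
#include <sys/auxv.h>   /* getauxval, AT_HWCAP (linux) */

/* hypothetical feature bit, a placeholder only: real bits are
   per-architecture and defined by the kernel */
#define HYPOTHETICAL_HWCAP_WIDE_COPY (1UL << 0)

typedef void *(*copy_fn)(void *, const void *, size_t);

/* baseline variant that works on every cpu */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--) *d++ = *s++;
    return dst;
}

/* stand-in for a model specific variant, it just defers to memcpy */
static void *copy_wide(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* pure selection step: map hwcap bits to an implementation */
static copy_fn select_copy(unsigned long hwcap)
{
    return (hwcap & HYPOTHETICAL_HWCAP_WIDE_COPY) ? copy_wide : copy_generic;
}

static void *copy_resolve(void *dst, const void *src, size_t n);

/* the pointer starts at the resolver, so the first call pays the
   detection cost and later calls pay one indirect call */
static copy_fn copy_impl = copy_resolve;

static void *copy_resolve(void *dst, const void *src, size_t n)
{
    /* plain store: not thread-safe as written, a real libc would
       need an atomic store or would resolve once at startup */
    copy_impl = select_copy(getauxval(AT_HWCAP));
    return copy_impl(dst, src, n);
}

/* public entry point: always goes through the pointer */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    return copy_impl(dst, src, n);
}
```

a caller only uses copy_bytes(dst, src, n). both variants are
linked into the binary (the code size cost) and every call goes
through a pointer the compiler cannot see through (the inlining
problem noted for glibc memcpy).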