From: Szabolcs Nagy
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: x86[_64] memset and rep stos
Date: Wed, 25 Feb 2015 10:20:20 +0100
Message-ID: <20150225092020.GW32724@port70.net>
References: <20150225061204.GA25485@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

* ?????? [2015-02-25 15:54:31 +0800]:
> I'm not an expert on micro optimization, but why not use a dynamic
> routine selection system which would select the optimal routine for a
> given CPU during program initialization. The routine selection
> algorithm could simply be a predefined static table look up.
> IMO, only very small number of functions (like memset, memcpy) would
> benefit from such a system, so no code size overhead to worry about.

my guess is

- for static linking it adds at least an extra indirection
  and these functions are often used with small input
- code size overhead: now you have to include all possible
  versions in libc.so
- for dynamic linking there is a load time dispatch mechanism:
  STT_GNU_IFUNC, but it's broken due to lack of specs
- maintenance burden, hard to test
- selecting the right algorithm at runtime is not easy

but i guess eventually when more people use musl it will make
sense to add more target specific optimizations
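
to make the "static table look up" and the "extra indirection" concrete, here is a minimal sketch in C. it is not how musl does anything: the names (cpu_class, detect_cpu_class, dispatch_init, dispatch_memset) are all hypothetical, the detection stub returns a fixed value instead of querying the cpu, and the "tuned" variant just calls the libc memset so the sketch stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* hypothetical cpu classes; a real implementation would derive
   these from something like cpuid, which is omitted here */
enum cpu_class { CPU_GENERIC, CPU_FAST_REP_STOS, CPU_CLASS_COUNT };

typedef void *(*memset_fn)(void *, int, size_t);

/* plain byte loop standing in for a generic implementation */
static void *memset_bytes(void *dest, int c, size_t n)
{
    unsigned char *p = dest;
    while (n--)
        *p++ = (unsigned char)c;
    return dest;
}

/* placeholder for a tuned variant (e.g. one built on rep stos);
   it only forwards to the libc memset in this sketch */
static void *memset_tuned(void *dest, int c, size_t n)
{
    return memset(dest, c, n);
}

/* the predefined static table: one entry per cpu class.
   every variant listed here has to be linked in, which is the
   code size overhead from the list above */
static const memset_fn memset_table[CPU_CLASS_COUNT] = {
    [CPU_GENERIC]       = memset_bytes,
    [CPU_FAST_REP_STOS] = memset_tuned,
};

/* hypothetical detection stub: always reports the generic class */
static enum cpu_class detect_cpu_class(void)
{
    return CPU_GENERIC;
}

/* defaults to the generic routine so calls made before
   dispatch_init() still work */
static memset_fn selected_memset = memset_bytes;

/* run once during program initialization */
void dispatch_init(void)
{
    enum cpu_class cls = detect_cpu_class();
    assert((unsigned)cls < (unsigned)CPU_CLASS_COUNT);
    selected_memset = memset_table[cls];
}

/* every call goes through the function pointer: this is the
   extra indirection, paid even for very small n */
void *dispatch_memset(void *dest, int c, size_t n)
{
    return selected_memset(dest, c, n);
}
```

a caller would run dispatch_init() once at startup and then use dispatch_memset(buf, 0, sizeof buf) wherever it would have called memset. the indirect call in dispatch_memset is a fixed per-call cost, so it weighs most on the small inputs these functions usually see.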