From: Rich Felker
To: musl@lists.openwall.com
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Updating Unicode support
Date: Wed, 24 Jan 2018 16:48:53 -0500
Message-ID: <20180124214853.GZ1627@brightrain.aerifal.cx>
In-Reply-To: <20180124062602.3nn7xiwo4mgor57y@sinister.lan.codevat.com>
Reply-To: musl@lists.openwall.com

On Tue, Jan 23, 2018 at 10:26:02PM -0800, Eric Pruitt wrote:
> On Tue, Jan 23, 2018 at 04:51:33PM -0800, Eric Pruitt wrote:
> > On Tue, Jan 23, 2018 at 06:38:57PM -0500, Rich Felker wrote:
> > > OK. With this in mind, I hope you're also aware that musl's Unicode
> > > tables are all highly optimized for size and (aside from case mapping)
> > > very good speed relative to their size, and are generated mechanically
> > > from the UCD files via some ugly code here:
> > >
> > > https://github.com/richfelker/musl-chartable-tools
>
> I updated my copy of musl to 1.1.18 then recompiled it with and without
> my utf8proc changes using GCC 6.3.0 "-O3" targeting Linux 4.9.0 /
> x86_64:
>
> - Original implementation: 2,762,774B (musl-1.1.18/lib/libc.a)
> - utf8proc implementation: 3,055,954B (musl-1.1.18/lib/libc.a)
> - The utf8proc implementation is ~11% larger. I didn't do any
>   performance comparisons.

You're comparing the whole library, not character tables. If you
compare against all of ctype, it's a 15x size increase. If you compare
against just wcwidth, it's a 69x increase.

Rich
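To make the baseline argument concrete, here is a minimal sketch of the arithmetic. It uses only the two libc.a sizes quoted above. The function names are illustrative and do not come from musl or utf8proc. It assumes that the whole size difference comes from the replaced character-handling code. The ctype and wcwidth sizes are not given in this message, so the baseline is left as a parameter rather than guessed.

```python
# Sizes of musl-1.1.18/lib/libc.a as reported in the quoted message (bytes).
ORIGINAL_LIBC_A = 2_762_774
UTF8PROC_LIBC_A = 3_055_954


def added_bytes(before: int, after: int) -> int:
    """Absolute growth in bytes between two builds."""
    return after - before


def percent_increase(baseline: int, added: int) -> float:
    """Growth expressed as a percentage of the chosen baseline."""
    return 100.0 * added / baseline


def size_multiple(baseline: int, added: int) -> float:
    """New size as a multiple of the baseline it replaces.

    Assumption: all of `added` is attributed to the component whose
    original size is `baseline` (hypothetical parameter; measure it
    from the relevant object files before drawing conclusions).
    """
    return (baseline + added) / baseline


delta = added_bytes(ORIGINAL_LIBC_A, UTF8PROC_LIBC_A)
whole_library_pct = percent_increase(ORIGINAL_LIBC_A, delta)

if __name__ == "__main__":
    print(f"added: {delta} bytes")
    print(f"relative to whole libc.a: {whole_library_pct:.1f}%")
```

The same number of added bytes gives a small percentage against the whole archive and a large multiple against a small component, which is why the choice of baseline matters here.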