From: Harald Becker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: iconv Korean and Traditional Chinese research so far
Date: Mon, 5 Aug 2013 00:39:43 +0200
Message-ID: <20130805003943.050fc58e@ralda.gmx.de>
References: <20130804165152.GA32076@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
Cc: musl@lists.openwall.com, dalias@aerifal.cx

Hi Rich!

> Worst-case, adding Korean and Traditional Chinese tables will
> roughly double the size of iconv.o to around 150k.
> This will noticeably enlarge libc.so, but will make no difference to
> static-linked programs except those using iconv. I'm hoping we
> can make these additions less expensive, but I don't see a good
> way yet.

Oh nooo, do you really want to build all of this statically into
iconv?

Why can't we drive all these character conversions from a state
machine that loads its information from an external configuration
file? That way we can have any conversion someone likes, just by
adding the configuration file for the required Unicode to X and X to
Unicode conversions.

State-driven FSM interpreters are really small and fast, and can read
their complete configuration from a file ... an
architecture-independent file, so we can have the same character
conversion files for all architectures.

> At some point, especially if the cost is not reduced, I will
> probably add build-time options to exclude a configurable
> subset of the supported character encodings.

All of this would go away if the character conversions did not come
from static tables.

Why not load the conversion file for a given character set from a
predefined or configurable directory, with the name of the character
set as the filename?

If you want the file to be directly readable and editable, you need
to add a minimal parser; otherwise the file contents can be treated
as binary data, and you can just fread or mmap the file and use the
data to control the character set conversion.

Most conversions need only minimal space; only some require bigger
conversion routines.

... and if you dislike some of them, you simply don't install the
conversion files you do not want.

--
Harald
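
To make the binary-file variant concrete, here is a rough sketch in Python rather than C, just to keep it short. It covers only the simplest case, a stateless 8-bit charset, and not the state machine that multi-byte encodings would need. Everything in it is an assumption made for illustration: the file layout (256 little-endian 16-bit code points, indexed by byte value), the 0xFFFF "unmapped" marker, the function names `write_table` and `decode`, and the toy charset name. None of it is an existing musl format or interface.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout: 256 entries, one per input byte value.
TABLE_ENTRIES = 256
# Fixed little-endian 16-bit entries, so the same file works on every architecture.
ENTRY = struct.Struct("<H")
# Hypothetical marker meaning "this byte has no mapping".
UNMAPPED = 0xFFFF


def table_path(directory, charset):
    """Map a charset name to a file in the table directory (name == filename)."""
    if not charset or not all(c.isalnum() or c in "-_" for c in charset):
        raise ValueError("unsupported charset name: %r" % charset)
    return os.path.join(directory, charset)


def write_table(directory, charset, mapping):
    """Write a table file; `mapping` is {byte value: Unicode code point}."""
    path = table_path(directory, charset)
    with open(path, "wb") as f:
        for byte in range(TABLE_ENTRIES):
            f.write(ENTRY.pack(mapping.get(byte, UNMAPPED)))
    return path


def decode(directory, charset, data):
    """Convert `data` (bytes in `charset`) to str using the mmapped table."""
    path = table_path(directory, charset)
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as table:
        if len(table) != TABLE_ENTRIES * ENTRY.size:
            raise ValueError("unexpected table size for %r" % charset)
        out = []
        for byte in data:
            # The table is used directly as mapped; nothing is parsed or copied.
            (code_point,) = ENTRY.unpack_from(table, byte * ENTRY.size)
            if code_point == UNMAPPED:
                raise ValueError("byte 0x%02x has no mapping" % byte)
            out.append(chr(code_point))
    return "".join(out)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        # Toy charset: ASCII plus 0xA4 -> U+20AC (EURO SIGN).
        toy = {b: b for b in range(128)}
        toy[0xA4] = 0x20AC
        write_table(directory, "TOY-8BIT", toy)
        print(decode(directory, "TOY-8BIT", b"price: 5 \xa4"))
```

With this layout an 8-bit table costs 512 bytes on disk, and a charset whose file is not installed simply fails to open, which is the "don't install what you don't want" behaviour described above.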