From: Rich Felker
To: musl@lists.openwall.com
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Iconv and old codepages
Date: Wed, 26 Jun 2013 14:34:32 -0400
Message-ID: <20130626183432.GQ29800@brightrain.aerifal.cx>
In-Reply-To: <20130627021539.76b69eea@sibserver.ru>

On Thu, Jun 27, 2013 at 02:15:39AM +0800, orc wrote:
> Hi,
>
> How many codepages does in-musl iconv supports?
> Currently I'm trying converting from "utf8" to "cp1251" and iconv()
> only gives me a number of "*"'s matching the utf8 input. Is this
> correct behavior and iconv() currently does not support non-UTF legacy
> codepages? Even so, I still see many of them in src/locale/codepages.h
> The (dirty) test program attached.
>
> I also noticed alternative libs thread and corresponding wiki page.
> Does someone know lightweight iconv replacement as a temporary measure
> (other than libiconv for example)?

Should be fixed in git. In general, the state of musl's iconv is that
the following charsets are supported:

utf8
wchart
ucs2 ucs2be
ucs2le
utf16 utf16be
utf16le
ucs4 ucs4be utf32 utf32be
ucs4le utf32le
ascii usascii iso646 iso646us
eucjp
shiftjis sjis
gb18030
gbk
gb2312
iso88591 latin1
iso88592
iso88593
iso88594
iso88595
iso88596
iso88597
iso88598
iso88599
iso885910
iso885911 tis620
iso885913
iso885914
iso885915 latin9
iso885916
cp1250 windows1250
cp1251 windows1251
cp1252 windows1252
cp1253 windows1253
cp1254 windows1254
cp1255 windows1255
cp1256 windows1256
cp1257 windows1257
cp1258 windows1258
koi8r
koi8u

Non-alphanumeric characters are ignored in matching charset names, so
all combinations of hyphens and underscores are also supported with
these.

One caveat which should not affect your usage is that the following
charsets are only supported as the "from" charset, not the "to"
charset:

eucjp
shiftjis sjis
gb18030
gbk
gb2312

Until the latest commit, the legacy 8bit codepages were also broken as
the "to" charset, but this breakage was unintentional.

Rich
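
For illustration, the name-matching rule described above (non-alphanumeric
characters are ignored) could be sketched roughly as follows. This is not
musl's actual code: `charset_name_eq` is a hypothetical helper, and the
case folding it does is an extra assumption that the message does not
state.

```c
#include <ctype.h>

/* Hypothetical helper, NOT musl's implementation: compare two charset
 * names while skipping every non-alphanumeric character, so that
 * "ISO_8859-1" and "iso88591" compare equal.
 * Assumption: letters are also compared case-insensitively.
 * Returns 1 if the names match, 0 otherwise. */
static int charset_name_eq(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    for (;;) {
        /* Skip hyphens, underscores and any other punctuation. */
        while (*p && !isalnum(*p)) p++;
        while (*q && !isalnum(*q)) q++;

        /* If either name is exhausted, they match only if both are. */
        if (!*p || !*q) return *p == *q;

        if (tolower(*p) != tolower(*q)) return 0;
        p++;
        q++;
    }
}
```

Under these assumptions, `charset_name_eq("windows-1251", "windows1251")`
returns 1, while `charset_name_eq("cp1251", "cp1252")` returns 0.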