From: orc <orc@sibserver.ru>
To: musl@lists.openwall.com
Subject: Re: Iconv and old codepages
Date: Thu, 27 Jun 2013 02:56:43 +0800
Message-ID: <20130627025643.242152cf@sibserver.ru>
In-Reply-To: <20130626183432.GQ29800@brightrain.aerifal.cx>
Thanks, Rich, for your quick answer!
On Wed, 26 Jun 2013 14:34:32 -0400
Rich Felker <dalias@aerifal.cx> wrote:
> On Thu, Jun 27, 2013 at 02:15:39AM +0800, orc wrote:
> > Hi,
> >
> > How many codepages does the in-musl iconv support?
> > Currently I'm trying to convert from "utf8" to "cp1251", and iconv()
> > only gives me a number of "*" characters matching the utf8 input. Is
> > this the correct behavior, i.e. does iconv() currently not support
> > non-UTF legacy codepages? Even so, I still see many of them in
> > src/locale/codepages.h. The (dirty) test program is attached.
> >
> > I also noticed the alternative libs thread and the corresponding
> > wiki page. Does anyone know of a lightweight iconv replacement as a
> > temporary measure (other than libiconv, for example)?
>
> Should be fixed in git. In general, the state of musl's iconv is that
> the following charsets are supported:
>
> utf8
> wchart
> ucs2
> ucs2be
> ucs2le
> utf16
> utf16be
> utf16le
> ucs4
> ucs4be
> utf32
> utf32be
> ucs4le
> utf32le
> ascii
> usascii
> iso646
> iso646us
> eucjp
> shiftjis
> sjis
> gb18030
> gbk
> gb2312
> iso88591
> latin1
> iso88592
> iso88593
> iso88594
> iso88595
> iso88596
> iso88597
> iso88598
> iso88599
> iso885910
> iso885911
> tis620
> iso885913
> iso885914
> iso885915
> latin9
> iso885916
> cp1250
> windows1250
> cp1251
> windows1251
> cp1252
> windows1252
> cp1253
> windows1253
> cp1254
> windows1254
> cp1255
> windows1255
> cp1256
> windows1256
> cp1257
> windows1257
> cp1258
> windows1258
> koi8r
> koi8u
So "most major encodings", yep.
Thanks, it is fixed and works now.
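
For reference, here is a minimal sketch of the kind of UTF-8 to CP1251
conversion discussed above. It is not the attached test program;
convert_string is a hypothetical helper name, and the sketch assumes the
whole input fits in the caller's output buffer in a single iconv() call.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the attached test program): convert the
 * NUL-terminated string `in` from charset `from` to charset `to`.
 * Returns the number of bytes written to `out`, or (size_t)-1 on failure. */
size_t convert_string(const char *to, const char *from,
                      const char *in, char *out, size_t outsize)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;      /* this charset pair is not supported */

    char *inp = (char *)in;     /* iconv() takes char **, input is only read */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsize;

    /* On success iconv() returns the count of non-reversible conversions;
     * (size_t)-1 signals an error such as E2BIG, EILSEQ or EINVAL. */
    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    return outsize - outleft;
}
```

Calling convert_string("CP1251", "UTF-8", text, buf, sizeof buf) on a
Cyrillic string should then yield one output byte per letter rather than
a run of "*" characters. A caller that cares about lossy output could
additionally inspect the iconv() return value, which this sketch discards
on success.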
>
> Non-alphanumeric characters are ignored in matching charset names, so
> all combinations of hyphens and underscores are also supported with
> these.
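
The matching rule described above can be illustrated with a small sketch.
This is not musl's actual code: charset_name_eq is a hypothetical helper,
and ignoring ASCII case is an extra assumption that the message above does
not state.

```c
#include <ctype.h>

/* Hypothetical helper, not musl's implementation: compare two charset
 * names while skipping every non-alphanumeric character. ASCII case is
 * also ignored here, which is an assumption of this sketch.
 * Returns 1 if the names match, 0 otherwise. */
int charset_name_eq(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    for (;;) {
        /* Skip hyphens, underscores and any other punctuation. */
        while (*p && !isalnum(*p)) p++;
        while (*q && !isalnum(*q)) q++;

        if (!*p || !*q)
            return *p == *q;    /* match only if both names end together */
        if (tolower(*p) != tolower(*q))
            return 0;
        p++;
        q++;
    }
}
```

With this rule "UTF-8", "utf_8" and "utf8" all compare equal, while
"cp1251" and "cp1252" do not.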
>
> One caveat which should not affect your usage is that the following
> charsets are only supported as the "from" charset, not the "to"
> charset:
>
> eucjp
> shiftjis
> sjis
> gb18030
> gbk
> gb2312
>
> Until the latest commit, the legacy 8bit codepages were also broken as
> the "to" charset, but this breakage was unintentional.
While digging through the code I did not notice that either.
>
>
> Rich