mailing list of musl libc
* CP850 & IBM850 codepages
@ 2014-02-25 21:44 Alan Hourihane
  2014-02-25 22:25 ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: Alan Hourihane @ 2014-02-25 21:44 UTC (permalink / raw)
  To: musl

Hi all,

First post to the list, so thank you for musl.

I was just compiling samba against musl and got this:

checking for iconv in /usr/lib... yes
checking can we convert from CP850 to UCS2-LE?... no
checking can we convert from IBM850 to UCS2-LE?... no
checking can we convert from ASCII to UCS2-LE?... ASCII
checking can we convert from UTF-8 to UCS2-LE?... UTF-8
checking for iconv in /usr/local/lib... yes
checking can we convert from CP850 to UCS2-LE?... no
checking can we convert from IBM850 to UCS2-LE?... no
checking can we convert from ASCII to UCS2-LE?... ASCII
checking can we convert from UTF-8 to UCS2-LE?... UTF-8
configure: WARNING: Sufficient support for iconv function was not found.
     Install libiconv from http://freshmeat.net/projects/libiconv/ for better charset compatibility!

Looking at the code we're missing the "cp850" and "ibm850" codepages.
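
For reference, the configure checks above can be approximated outside of configure with a small probe. This is only a sketch: `can_convert` is a hypothetical helper name, and it assumes that a successful `iconv_open` is a good enough signal that the conversion is available (the real samba test may check more than that).

```c
#include <iconv.h>

/* Hypothetical helper: return 1 if this libc's iconv accepts a
 * conversion from `from` to `to`, and 0 otherwise. */
static int can_convert(const char *from, const char *to)
{
    /* Note the argument order: iconv_open(tocode, fromcode). */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return 0;   /* conversion not supported (errno is EINVAL) */
    iconv_close(cd);
    return 1;
}
```

Calling `can_convert("CP850", "UCS2-LE")` and `can_convert("IBM850", "UCS2-LE")` from a small test program should show whether a given libc recognises those names; the result depends on the iconv implementation in use.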

I'm not sure how they are derived in the musl source though.

Any help appreciated.

Thanks,

Alan.



* Re: CP850 & IBM850 codepages
  2014-02-25 21:44 CP850 & IBM850 codepages Alan Hourihane
@ 2014-02-25 22:25 ` Rich Felker
  2014-02-25 22:31   ` Alan Hourihane
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2014-02-25 22:25 UTC (permalink / raw)
  To: musl

For now, I think your best course of action would be to see
if you can just override these tests. It seems unlikely to me that you
would really need conversion from these legacy codepages for normal
usage of samba. (BTW I'm surprised nobody else has reported this
before... does anybody else know why it hasn't come up..?)

Adding cp850 and other DOS codepages should not be hard and should not
take up much additional size in iconv, but it's also nontrivial to do
without my tools to generate the tables, which are not published.
Publishing them is something I should really get around to doing,
since their absence affects the ability of others to modify the code
in meaningful ways; I need to apologize for not doing so already.

Rich



* Re: CP850 & IBM850 codepages
  2014-02-25 22:25 ` Rich Felker
@ 2014-02-25 22:31   ` Alan Hourihane
  2014-02-25 22:39     ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: Alan Hourihane @ 2014-02-25 22:31 UTC (permalink / raw)
  To: musl

On 02/25/14 22:25, Rich Felker wrote:
> For now, I think your best course of action would be to see
> if you can just override these tests. It seems unlikely to me that you
> would really need conversion from these legacy codepages for normal
> usage of samba. (BTW I'm surprised nobody else has reported this
> before... does anybody else know why it hasn't come up..?)

That's easy enough.

> Adding cp850 and other DOS codepages should not be hard and should not
> take up much additional size in iconv, but it's also nontrivial to do
> without my tools to generate the tables, which are not published.
> Publishing them is something I should really get around to doing,
> since their absence affects the ability of others to modify the code
> in meaningful ways; I need to apologize for not doing so already.
>

O.k. that makes sense as I couldn't understand the format. :-)

Thanks,

Alan.



* Re: CP850 & IBM850 codepages
  2014-02-25 22:31   ` Alan Hourihane
@ 2014-02-25 22:39     ` Rich Felker
  2014-02-26 11:58       ` Alan Hourihane
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2014-02-25 22:39 UTC (permalink / raw)
  To: musl

On Tue, Feb 25, 2014 at 10:31:46PM +0000, Alan Hourihane wrote:
> >Adding cp850 and other DOS codepages should not be hard and should not
> >take up much additional size in iconv, but it's also nontrivial to do
> >without my tools to generate the tables, which are not published.
> >Publishing them is something I should really get around to doing,
> >since their absence affects the ability of others to modify the code
> >in meaningful ways; I need to apologize for not doing so already.
> >
> 
> O.k. that makes sense as I couldn't understand the format. :-)

The format is basically this: legacy_chars is a table of all
codepoints that ever appear in a supported legacy codepage, with a
limit of 1024 total codepoints. The individual codepage tables are 10
bits per entry and map into this table, and they omit the initial
subrange that's identical to Latin-1 (and thus a one-to-one mapping to
Unicode). I have tools that automatically generate these from the
Unicode txt files containing the mappings.
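
To make the description above concrete, here is a minimal sketch of such a layout. It is not musl's actual code: the table contents, the names `shared_chars`, `put10`, `get10` and `decode`, the `first` parameter (the first byte value that differs from Latin-1), and the little-endian bit packing are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared table of code points, standing in for a table
 * like legacy_chars. With 10-bit indices it can hold at most 1024. */
static const uint16_t shared_chars[] = { 0x00C7, 0x00FC, 0x00E9, 0x2591 };

/* Store the 10-bit value v as entry i of a zero-initialised packed array.
 * Entry i occupies bits [10*i, 10*i + 9], least significant bit first. */
static void put10(uint8_t *packed, size_t i, unsigned v)
{
    size_t bit = i * 10;
    size_t byte = bit / 8;
    unsigned shift = bit % 8;            /* always 0, 2, 4 or 6 */
    unsigned w = (v & 0x3FF) << shift;   /* fits in 16 bits */
    packed[byte] |= w & 0xFF;
    packed[byte + 1] |= w >> 8;
}

/* Read entry i back out of the packed array. */
static unsigned get10(const uint8_t *packed, size_t i)
{
    size_t bit = i * 10;
    size_t byte = bit / 8;
    unsigned shift = bit % 8;
    unsigned w = packed[byte] | ((unsigned)packed[byte + 1] << 8);
    return (w >> shift) & 0x3FF;
}

/* Map one codepage byte to a Unicode code point. Bytes below `first`
 * are identical to Latin-1 and are not stored in the packed table. */
static unsigned decode(const uint8_t *packed, unsigned first, unsigned char c)
{
    if (c < first)
        return c;
    return shared_chars[get10(packed, c - first)];
}
```

With 10 bits per entry, four entries fit in exactly five bytes, so a table covering the upper 128 bytes of a codepage would take 160 bytes under this assumed packing.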

Rich



* Re: CP850 & IBM850 codepages
  2014-02-25 22:39     ` Rich Felker
@ 2014-02-26 11:58       ` Alan Hourihane
  0 siblings, 0 replies; 5+ messages in thread
From: Alan Hourihane @ 2014-02-26 11:58 UTC (permalink / raw)
  To: musl

On 02/25/14 22:39, Rich Felker wrote:
> On Tue, Feb 25, 2014 at 10:31:46PM +0000, Alan Hourihane wrote:
>>> Adding cp850 and other DOS codepages should not be hard and should not
>>> take up much additional size in iconv, but it's also nontrivial to do
>>> without my tools to generate the tables, which are not published.
>>> Publishing them is something I should really get around to doing,
>>> since their absence affects the ability of others to modify the code
>>> in meaningful ways; I need to apologize for not doing so already.
>>>
>> O.k. that makes sense as I couldn't understand the format. :-)
> The format is basically this: legacy_chars is a table of all
> codepoints that ever appear in a supported legacy codepage, with a
> limit of 1024 total codepoints. The individual codepage tables are 10
> bits per entry and map into this table, and they omit the initial
> subrange that's identical to latin1 (and thus a one-to-one mapping to
> unicode). I have tools that automatically generate these from the
> unicode txt files containing the mappings.
>

Thanks, Rich. I'll keep an eye out for the cp850/ibm850 tables to land
once you've had a chance to run your tools.

Alan.

