From: "Konstantin P." <ria.freelander@gmail.com>
To: musl@lists.openwall.com
Subject: Re: Draft proposed locale changes
Date: Mon, 5 Mar 2018 21:42:49 +0300
Message-ID: <CAF1WSuzYsm81k=oCz9_O9+2_h5nLuQGzuZCEMb3Hg=dzb6tt5A@mail.gmail.com>
In-Reply-To: <20180305183950.GA17616@brightrain.aerifal.cx>
Can you publish an official po file for musl once the proposed changes are in?
On Mon, Mar 5, 2018 at 9:39 PM, Rich Felker <dalias@libc.org> wrote:
>
> localeconv/LC_NUMERIC/LC_MONETARY
>
> Each loaded locale needs an immutable lconv structure to represent
> this data. It needs to be allocated with the locale (at locale loading
> time) since localeconv() has no provision for failure, but we can wait
> to populate it lazily, and we can put the code to populate it in
> localeconv.c so that static-linked programs that don't use this
> rarely-used interface don't have to pay for it. We could also omit
> even allocating it (56/96 bytes) if localeconv.o is not linked, but
> it's probably not worth the special-casing code to do that.
>
> The localeconv structure should be part of struct __locale_map, not
> struct __locale_struct, since it's a pure function of the data in the
> memory-mapped locale file and not a function of how that data is
> linked to a specific locale category. Putting it in __locale_struct
> would just complicate setlocale and newlocale.
>
> The obvious (but not terribly efficient) form for the data in the
> locale file is to have each lconv field as a mo-level key, as in:
>
> msgid "int_frac_digits"
> msgstr "2"
>
> A more compact form could pack them all into one, but then the order
> becomes a hidden locale-file interface boundary/ABI.
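To make the one-key-per-field form concrete, here is a rough sketch of how a lazily populated lconv could be filled from such keys. Everything named demo_* is hypothetical: the flat key/value table only stands in for the mo file, only three fields are handled, and thread safety of the lazy fill is ignored.

```c
#include <limits.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a loaded locale file: a flat key/value table.
 * A real implementation would look keys up in the mapped mo file. */
struct kv { const char *key, *val; };

struct demo_locale_map {
	const struct kv *entries;
	size_t n;
	int lconv_ready;   /* 0 until the first localeconv-style call */
	struct lconv lc;   /* allocated with the map, filled lazily */
};

static const char *demo_lookup(const struct demo_locale_map *m, const char *key)
{
	for (size_t i = 0; i < m->n; i++)
		if (!strcmp(m->entries[i].key, key)) return m->entries[i].val;
	return 0;
}

/* Numeric fields fall back to CHAR_MAX ("not available") if the key is absent. */
static char demo_char_field(const struct demo_locale_map *m, const char *key)
{
	const char *s = demo_lookup(m, key);
	return s ? (char)atoi(s) : CHAR_MAX;
}

/* Fills the structure on first use; later calls return the same pointer. */
struct lconv *demo_localeconv(struct demo_locale_map *m)
{
	if (!m->lconv_ready) {
		const char *s = demo_lookup(m, "currency_symbol");
		m->lc.int_frac_digits = demo_char_field(m, "int_frac_digits");
		m->lc.frac_digits = demo_char_field(m, "frac_digits");
		/* String fields point straight at the in-place string. */
		m->lc.currency_symbol = (char *)(s ? s : "");
		m->lconv_ready = 1;
	}
	return &m->lc;
}
```

With a table containing only int_frac_digits and currency_symbol, frac_digits would come back as CHAR_MAX, which is what the C locale reports for unavailable values.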
>
> For the string fields it's necessary that they each be in-place
> strings in the mo file. grouping and mon_grouping also have the
> special constraint that they need to vary by whether the arch uses
> signed or unsigned plain-char (since CHAR_MAX has special meaning) so
> the mo file needs to store both versions. That's ugly but I don't see
> any good way around it. We can probably punt on this for now just by
> not supporting grouping (i.e. only supporting locale definitions that
> don't do grouping), since it's not implemented anyway.
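A small sketch of why the grouping strings depend on plain-char signedness, and what "store both versions" might look like at lookup time. The struct and function names are made up for illustration, and the two-string layout is an assumption, not a proposed file format.

```c
#include <limits.h>

/* Hypothetical pair of grouping strings as a locale file might store them:
 * one for archs where plain char is signed (CHAR_MAX == 127) and one for
 * archs where it is unsigned (CHAR_MAX == 255). */
struct demo_grouping {
	const char *for_signed_char;
	const char *for_unsigned_char;
};

/* Pick the variant whose "stop grouping" byte equals this arch's CHAR_MAX. */
const char *demo_pick_grouping(const struct demo_grouping *g)
{
	return CHAR_MAX == SCHAR_MAX ? g->for_signed_char : g->for_unsigned_char;
}

/* Size of digit group i (0 = rightmost) following the C rules for
 * lconv.grouping: the last element repeats, and an element equal to
 * CHAR_MAX means no further grouping (reported here as 0). */
int demo_group_size(const char *grouping, int i)
{
	int k = 0;
	if (!grouping[0]) return 0;   /* empty string: no grouping at all */
	while (k < i && grouping[k] != CHAR_MAX && grouping[k + 1]) k++;
	return grouping[k] == CHAR_MAX ? 0 : grouping[k];
}
```

For a definition such as { "\3\x7f", "\3\xff" } this gives one group of three digits and then no further grouping on either kind of arch, while a plain "\3" repeats groups of three indefinitely.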
>
> If we support decimal_point, it should not go through the localeconv
> mechanism since it would always be needed by printf and strtod.
> Instead __get_locale should probe it right away and set a 1-bit flag
> in the __locale_map structure for these functions to consume (1-bit
> based on previous research that [.,] are the only values).
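As I read it, the 1-bit flag could look roughly like the following. The struct and function names are hypothetical and this is not musl's actual __locale_map layout; it only shows the probe-once, consume-cheaply idea.

```c
#include <string.h>

/* Hypothetical slice of a locale map: one bit is enough if '.' and ','
 * are the only decimal points that need to be supported. */
struct demo_numeric_map {
	unsigned decimal_comma : 1;   /* 0: '.', 1: ',' */
};

/* Called once at locale load time with the value of the decimal_point
 * key, or a null pointer if the locale file does not define it. */
void demo_probe_decimal_point(struct demo_numeric_map *m, const char *val)
{
	m->decimal_comma = val && !strcmp(val, ",");
}

/* What printf/strtod-style code would consume: no lookup, just a bit test. */
int demo_decimal_point(const struct demo_numeric_map *m)
{
	return m->decimal_comma ? ',' : '.';
}
```

Any value other than "," (including a missing key) falls back to '.', so a malformed locale file cannot produce a third radix character.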
>
>
>
> nl_langinfo/LC_TIME/etc.
>
> Eliminate the currently-present wrong values for ERA* and related
> LC_TIME stuff; that gets rid of all ambiguous translation keys except
> "May". Bikeshed up some alternate key for May.
>
>
>
> strerror/LC_MESSAGES
>
> Not sure yet. One radical idea I kinda like is removing all the
> English-phrase messages from libc core and just having strerror
> produce strings like "ENOENT", "EPERM", etc. in the C locale. This
> seems to be the only option that wouldn't either moderately increase
> libc size or require translation files to match the exact current text
> in the builtin English libc messages. Users who want the current
> messages would then need an "en" locale with contents like:
>
> msgid "ENOENT"
> msgstr "No such file or directory"
>
> If we don't want this, the possible solutions look like one of:
>
> 1. Prepending the error code and a null byte (e.g. "ENOENT\0") to all
> the existing error strings, then skipping past it if the translation
> was not found.
>
> 2. Putting a second version of strerror in locale_map.c with the E*
> names in it, so it's only linked if you use locale. I strongly dislike
> this approach because it greatly increases the marginal size cost of
> doing the right thing (calling setlocale) and imposes the cost even if
> you don't use strerror at all (only setlocale).
>
> 3. Accepting that translations need to match (and perpetually be
> updated to match) error strings in musl __strerror.h. I don't like
> this much either.
>
> So I think it should be between options 1 and "zero" above. Option
> zero decreases the size of libc by nearly 1k (removing messages) but
> changes the behavior. Option 1 increases the size of libc by about 1k.
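For option 1, a minimal sketch of the "CODE\0text" packing and the skip-on-miss step. The lookup callback and the toy translation table are assumptions made for illustration; they are not the interface musl uses internally.

```c
#include <string.h>

/* Builtin messages in the "CODE\0English text" form of option 1. */
static const char demo_enoent[] = "ENOENT\0No such file or directory";
static const char demo_eperm[]  = "EPERM\0Operation not permitted";

/* Hypothetical translation lookup: returns the translation or a null pointer. */
typedef const char *(*demo_lookup_fn)(const char *msgid);

/* Toy table standing in for a loaded locale; only ENOENT is translated. */
const char *demo_lookup_toy(const char *msgid)
{
	return !strcmp(msgid, "ENOENT")
		? "Datei oder Verzeichnis nicht gefunden" : 0;
}

/* The packed string doubles as the msgid, since C string functions stop
 * at the embedded null byte. On a miss, skip past "CODE\0" to the text. */
const char *demo_strerror_msg(const char *packed, demo_lookup_fn lookup)
{
	const char *trans = lookup ? lookup(packed) : 0;
	if (trans) return trans;
	return packed + strlen(packed) + 1;
}
```

Here demo_strerror_msg(demo_enoent, demo_lookup_toy) returns the translated text, while demo_eperm falls back to the builtin English string. The translation file only ever needs the stable E* names as keys.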
>
>
>
> LC_COLLATE
>
> No specific proposal yet. We need a data structure to map characters
> and sequences of characters to collating elements. Obviously the mo
> file's lookups could be used directly (O(log n), improved avg case if
> we ever add hash table support) but they might be heavier than we
> want. The alternative would be having a gigantic string in the mo file
> that's just "compiled" collation table data, but unless it's
> well-designed that seems like an undesirable permanent interface
> boundary.
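To illustrate the direct-lookup variant: a toy sorted table searched with bsearch(), plus a longest-match step for multi-character collating elements. The table contents, weights and demo_* names are invented for the example, and a real mo-backed lookup would search the file's sorted string table instead.

```c
#include <stdlib.h>
#include <string.h>

/* Toy collation entry: a byte sequence and an invented weight. */
struct demo_coll { const char *seq; int weight; };

/* Must stay sorted by seq in strcmp order for bsearch to work. */
static const struct demo_coll demo_table[] = {
	{ "a", 1 }, { "b", 2 }, { "c", 3 }, { "ch", 4 }, { "d", 5 },
};

static int demo_cmp(const void *key, const void *elem)
{
	return strcmp(key, ((const struct demo_coll *)elem)->seq);
}

/* O(log n) exact lookup; returns -1 if the sequence has no entry. */
int demo_weight(const char *seq)
{
	const struct demo_coll *e = bsearch(seq, demo_table,
		sizeof demo_table / sizeof *demo_table,
		sizeof *demo_table, demo_cmp);
	return e ? e->weight : -1;
}

/* Longest match at s: try the 2-byte sequence first, then the single
 * byte. Stores the number of bytes consumed in *len. */
int demo_next_element(const char *s, size_t *len)
{
	char buf[3] = { 0 };
	for (size_t n = 2; n >= 1; n--) {
		int w;
		if (strlen(s) < n) continue;
		memcpy(buf, s, n);
		buf[n] = 0;
		w = demo_weight(buf);
		if (w >= 0) { *len = n; return w; }
	}
	*len = s[0] ? 1 : 0;
	return -1;
}
```

The cost concern is visible even here: each collating element needs up to one lookup per candidate length, which is where a precompiled table might win.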
>
>