From: enh
Date: Fri, 28 Jan 2022 10:33:53 -0800
To: Rich Felker
Cc: musl@lists.openwall.com
Subject: Re: [musl] A journey of weird file sorting and desktop systems
References: <20220128141049.GI7074@brightrain.aerifal.cx> <20220128180103.GJ7074@brightrain.aerifal.cx>
In-Reply-To: <20220128180103.GJ7074@brightrain.aerifal.cx>

On Fri, Jan 28, 2022 at 10:01 AM Rich Felker wrote:
>
> On Fri, Jan 28, 2022 at 08:58:30AM -0800, enh wrote:
> > (Android's libc maintainer here...)
> >
> > i'd argue this isn't a musl bug. on Android we make a clear distinction between:
> >
> > 1. libc's responsibilities which, to paraphrase rich, are basically
> > "be unsurprising because your audience is OS/app developers who don't
> > speak all the languages their users use anyway". that is: "code point
> > order".
>
> That's not what I said. I speculated that part of the difficulty with
> getting people to care is that a large number of users personally
> prefer LC_COLLATE=C. Not that we should punt because of that.
>
> > 2. icu's responsibilities which cover all the user-facing (as opposed
> > to developer-facing) stuff. i18n is *hard* and the C/POSIX APIs are,
> > to be blunt, not fit for *that* purpose. there's a reason why all of
> > Android/macOS/Windows (and all the browsers) ship copies of icu.
>
> ICU is really, *really* bad. I don't want to be encouraging people to
> use it because basic functionality is missing from libc.

human languages are really really messy.
a lot of the complexity is inherent. as for the non-inherent,
https://github.com/unicode-org/icu4x seems like a good start.

> > the bug here is that a desktop file manager is assuming "i just want
> > telephone book order --- how hard can it be?". the answer turns out to
> > be "hard". especially when you get into fun stuff like users who *do*
> > speak multiple languages and have strong expectations for how they
> > sort. or places where there are multiple sort orders in common use.
>
> Absolutely. That's why I don't want to treat the problem half-assedly,

but that's my point --- it's not the *implementation* that's the
issue, it's that the C/POSIX *interfaces* are insufficient. the bar on
how good a job you _can_ do within those constraints is horribly low.

> but make sure we design or choose a format for the collation tables
> that's simultaneously (1) efficient, (2) sufficiently expressive to
> give the behaviors users may want, and (3) easy enough to understand
> that users can customize it if needed. The POSIX localedef format (an
> option group musl intentionally does not support) does not have any of
> those properties except maybe #2. The standard Unicode format may
> translate directly into something that can meet all 3; I'm not sure.
>
> Rich
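
For readers following the thread, here is a minimal sketch of the two orderings being argued about: "code point order" (plain strcmp, which for valid UTF-8 matches Unicode code point order) versus whatever the current LC_COLLATE locale provides through strcoll. The helper names sort_code_point and sort_locale, and both comparators, are hypothetical and exist only for illustration. Nothing here is taken from musl, bionic, or any file manager.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator: byte-wise comparison via strcmp. strcmp compares
   bytes as unsigned char, so for valid UTF-8 input the result is the
   same as Unicode code point order. */
static int cmp_code_point(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* qsort comparator: defers to the current LC_COLLATE locale. In the
   "C" locale this is expected to give the same order as strcmp; in
   other locales the result depends on the libc and its locale data. */
static int cmp_locale(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcoll(*sa, *sb);
}

/* Hypothetical helper: sort an array of strings in code point order. */
void sort_code_point(const char **names, size_t n)
{
    qsort(names, n, sizeof *names, cmp_code_point);
}

/* Hypothetical helper: sort an array of strings in locale order.
   The caller is assumed to have called setlocale(LC_COLLATE, ...)
   beforehand if anything other than the "C" locale is wanted. */
void sort_locale(const char **names, size_t n)
{
    qsort(names, n, sizeof *names, cmp_locale);
}
```

Sorting "apple", "Zebra", "Émile", "banana" with sort_code_point gives Zebra, apple, banana, Émile: every uppercase ASCII letter sorts before every lowercase one, and anything outside ASCII sorts last. That is predictable for developers but is not telephone book order in any language. sort_locale can only do as well as the strcoll interface and the installed locale data allow.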