mailing list of musl libc
From: "A. Wilcox" <awilfox@adelielinux.org>
To: musl@lists.openwall.com
Subject: wcscoll does not collate properly, even en_US
Date: Sun, 26 Nov 2017 15:33:00 -0600
Message-ID: <5A1B330C.80107@adelielinux.org>


[-- Attachment #1.1: Type: text/plain, Size: 1022 bytes --]

Hi.

My understanding is that musl does not want to support collation in
non-English languages (at least, not yet), but collation is supported in
American English.

GLib's test suite now fails on musl because the locale code is just
functional enough that GLib no longer skips the tests entirely (1.1.16
failed the 'setlocale is giving us the locale we set back' test), yet
collation doesn't work.  wcscoll is giving the same result as wcscmp.
This is wrong; a simple test case is attached.  Run on a glibc machine,
a FreeBSD machine, and a Solaris machine, it will output:

Amy
bug
cat
Gaz
Tom

On musl it currently outputs the following, which is incorrect:

Amy
Gaz
Tom
bug
cat

Does this mean my understanding was wrong and musl does not even support
AmE collation?  This is going to affect everything from `ls` to GUI file
managers like Dolphin or Nautilus to email software sorting by sender or
subject.

Regards,
--arw

-- 
A. Wilcox (awilfox)
Project Lead, Adélie Linux
http://adelielinux.org

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.2: wcscoll-test.c --]
[-- Type: text/x-csrc; name="wcscoll-test.c", Size: 900 bytes --]

#include <locale.h>	/* setlocale */
#include <stdio.h>	/* perror */
#include <stdlib.h>	/* calloc, free, mbstowcs, qsort, EXIT_* */
#include <string.h>	/* strcmp */
#include <wchar.h>	/* wcscoll, wprintf */

static int my_collate(const void *p1, const void *p2)
{
	return wcscoll(*(const wchar_t **)p1, *(const wchar_t **)p2);
}

int main(void)
{
	char *loc;
	const char *stuff[5] = { "bug", "Amy", "Tom", "Gaz", "cat" };
	wchar_t *strs[5];
	setlocale(LC_ALL, "en_US.UTF-8");
	loc = setlocale(LC_ALL, NULL);
	if(loc == NULL || strcmp(loc, "en_US.UTF-8") != 0)
	{
		perror("setlocale");
		return EXIT_FAILURE;
	}

	for(int i = 0; i < 5; i++)
	{
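		/* 3 characters plus the terminating null */
		strs[i] = calloc(4, sizeof(wchar_t));
		if(strs[i] == NULL)
		{
			perror("calloc");
			return EXIT_FAILURE;
		}
		mbstowcs(strs[i], stuff[i], 3);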
	}

	qsort(strs, 5, sizeof(wchar_t *), my_collate);

	for(int i = 0; i < 5; i++)
	{
		wprintf(L"%ls\n", strs[i]);
		free(strs[i]);
	}
	return EXIT_SUCCESS;
}

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

Thread overview: 2+ messages
2017-11-26 21:33 A. Wilcox [this message]
2017-11-26 22:32 ` Rich Felker
