zsh-workers
* UTF-8 locales on BSDs do not support collation correctly
@ 2017-01-25 14:27 Jun T.
  2017-01-25 18:02 ` Mikael Magnusson
  2017-01-30  3:59 ` Bart Schaefer
  0 siblings, 2 replies; 13+ messages in thread
From: Jun T. @ 2017-01-25 14:27 UTC
  To: zsh-workers

After the commit:

commit 0e33ebc6514c8719513f3f20161274f6af2caffc
Author: Mikael Magnusson <mikachu@gmail.com>
Date:   Tue Jan 24 12:01:57 2017 +0100

    posted: Make D07 recognize more spellings of pl_PL.UTF-8

diff --git a/Test/D07multibyte.ztst b/Test/D07multibyte.ztst
(snip) 
-  if [[ -n ${$(locale -a 2>/dev/null)[(R)pl_PL.utf8]} ]]; then
+  if [[ -n ${$(locale -a 2>/dev/null)[(R)pl_PL.(utf8|UTF-8)]} ]]; then


the test D07multibyte.ztst fails on macOS and FreeBSD.
On these OSs (and maybe on other BSDs), the locale pl_PL.UTF-8 exists
but it does not support collation correctly (it just uses ASCII collation).

Is there any OS that uses UTF-8 (instead of utf8) in the locale name
and supports collation correctly?



* Re: UTF-8 locales on BSDs do not support collation correctly
  2017-01-25 14:27 UTF-8 locales on BSDs do not support collation correctly Jun T.
@ 2017-01-25 18:02 ` Mikael Magnusson
  2017-01-26 17:57   ` Peter Stephenson
  2017-01-30  3:59 ` Bart Schaefer
  1 sibling, 1 reply; 13+ messages in thread
From: Mikael Magnusson @ 2017-01-25 18:02 UTC
  To: Jun T.; +Cc: zsh workers, Jens Elkner, Peter Stephenson

On Wed, Jan 25, 2017 at 3:27 PM, Jun T. <takimoto-j@kba.biglobe.ne.jp> wrote:
> After the commit:
>
> commit 0e33ebc6514c8719513f3f20161274f6af2caffc
> Author: Mikael Magnusson <mikachu@gmail.com>
> Date:   Tue Jan 24 12:01:57 2017 +0100
>
>     posted: Make D07 recognize more spellings of pl_PL.UTF-8
>
> diff --git a/Test/D07multibyte.ztst b/Test/D07multibyte.ztst
> (snip)
> -  if [[ -n ${$(locale -a 2>/dev/null)[(R)pl_PL.utf8]} ]]; then
> +  if [[ -n ${$(locale -a 2>/dev/null)[(R)pl_PL.(utf8|UTF-8)]} ]]; then
>
>
> the test D07multibyte.ztst fails on macOS and FreeBSD.
> On these OSs (and maybe on other BSDs), the locale pl_PL.UTF-8 exists
> but it does not support collation correctly (it just uses ASCII collation).
>
> Is there any OS that uses UTF-8 (instead of utf8) in the locale name
> and supports collation correctly?

It works fine on OpenBSD. However, I had to revert 40333 for it to
compile there. Adding some extra CCs since the mailing list probably
still doesn't work.

gmake[2]: Entering directory '/home/mikachu/code/zsh/Src'
gcc -c -I. -I../Src -I../Src -I../Src/Zle -I.  -DHAVE_CONFIG_H -Wall
-Wmissing-prototypes -O2  -o watch.o watch.c
watch.c: In function 'readwtab':
watch.c:488: warning: implicit declaration of function 'setutent'
watch.c:489: warning: implicit declaration of function 'getutent'
watch.c:489: warning: assignment makes pointer from integer without a cast
watch.c:512: warning: implicit declaration of function 'endutent'
gmake[3]: Entering directory '/home/mikachu/code/zsh/Src/Builtins'
gmake[3]: Leaving directory '/home/mikachu/code/zsh/Src/Builtins'
gmake[3]: Entering directory '/home/mikachu/code/zsh/Src/Modules'
gmake[3]: Leaving directory '/home/mikachu/code/zsh/Src/Modules'
gmake[3]: Entering directory '/home/mikachu/code/zsh/Src/Zle'
gmake[3]: Leaving directory '/home/mikachu/code/zsh/Src/Zle'
gmake[2]: Leaving directory '/home/mikachu/code/zsh/Src'
Updated `stamp-modobjs'.
rm -f zsh
gcc  -s -Wl,-E -o zsh main.o  `cat stamp-modobjs`   -lncursesw -lm  -lc
string.o: In function `wcs_ztrdup':
string.c:(.text+0x534): warning: warning: wcscpy() is almost always
misused, please use wcslcpy()
builtin.o: In function `cd_try_chdir':
builtin.c:(.text+0x64c3): warning: warning: strcpy() is almost always
misused, please use strlcpy()
params.o: In function `randomgetfn':
params.c:(.text+0x23eb): warning: warning: rand() may return
deterministic values, is that what you want?
builtin.o: In function `fclist':
builtin.c:(.text+0x1054c): warning: warning: strcat() is almost always
misused, please use strlcat()
builtin.o: In function `bin_print':
builtin.c:(.text+0xa7e0): warning: warning: sprintf() is often
misused, please use snprintf()
watch.o: In function `readwtab':
watch.c:(.text+0x43): undefined reference to `setutent'
watch.c:(.text+0x48): undefined reference to `getutent'
watch.c:(.text+0x73): undefined reference to `getutent'
watch.c:(.text+0x83): undefined reference to `endutent'
collect2: ld returned 1 exit status
Makefile:227: recipe for target 'zsh' failed
gmake[1]: *** [zsh] Error 1
gmake[1]: Leaving directory '/home/mikachu/code/zsh/Src'
Makefile:188: recipe for target 'all' failed
gmake: *** [all] Error 1


-- 
Mikael Magnusson



* Re: UTF-8 locales on BSDs do not support collation correctly
  2017-01-25 18:02 ` Mikael Magnusson
@ 2017-01-26 17:57   ` Peter Stephenson
  2017-01-26 19:30     ` Jens Elkner
  2017-01-27  9:41     ` Peter Stephenson
  0 siblings, 2 replies; 13+ messages in thread
From: Peter Stephenson @ 2017-01-26 17:57 UTC
  To: zsh workers

On Wed, 25 Jan 2017 19:02:29 +0100
Mikael Magnusson <mikachu@gmail.com> wrote:
> It works fine on OpenBSD. However, I had to revert 40333 for it to
> compile there. Adding some extra CCs since the mailing list probably
> still doesn't work.
> 
> gmake[2]: Entering directory '/home/mikachu/code/zsh/Src'
> gcc -c -I. -I../Src -I../Src -I../Src/Zle -I.  -DHAVE_CONFIG_H -Wall
> -Wmissing-prototypes -O2  -o watch.o watch.c
> watch.c: In function 'readwtab':
> watch.c:488: warning: implicit declaration of function 'setutent'
> watch.c:489: warning: implicit declaration of function 'getutent'
> watch.c:489: warning: assignment makes pointer from integer without a cast
> watch.c:512: warning: implicit declaration of function 'endutent'
>...
> watch.c:(.text+0x43): undefined reference to `setutent'
> watch.c:(.text+0x48): undefined reference to `getutent'
> watch.c:(.text+0x73): undefined reference to `getutent'
> watch.c:(.text+0x83): undefined reference to `endutent'

This is obscure: the preprocessor appears to be both replacing and not
replacing getutent and setutent.  I wonder if there are already
definitions that are being stomped on?  Or should the code go
through a different branch entirely?

Evidently this is going to stay broken until someone with access to
OpenBSD looks at it.

(I will supply a patch that checks for setutxent etc., now I've noticed
there aren't any yet, but that doesn't appear to be the problem here,
modulo obscurities.)

pws



* Re: UTF-8 locales on BSDs do not support collation correctly
  2017-01-26 17:57   ` Peter Stephenson
@ 2017-01-26 19:30     ` Jens Elkner
  2017-01-27  9:41     ` Peter Stephenson
  1 sibling, 0 replies; 13+ messages in thread
From: Jens Elkner @ 2017-01-26 19:30 UTC
  To: zsh-workers

On Thu, Jan 26, 2017 at 05:57:17PM +0000, Peter Stephenson wrote:
> On Wed, 25 Jan 2017 19:02:29 +0100
> Mikael Magnusson <mikachu@gmail.com> wrote:
> > It works fine on OpenBSD. However, I had to revert 40333 for it to
> > compile there. Adding some extra CCs since the mailing list probably
> > still doesn't work.
> > 
> > gmake[2]: Entering directory '/home/mikachu/code/zsh/Src'
> > gcc -c -I. -I../Src -I../Src -I../Src/Zle -I.  -DHAVE_CONFIG_H -Wall
> > -Wmissing-prototypes -O2  -o watch.o watch.c
> > watch.c: In function 'readwtab':
> > watch.c:488: warning: implicit declaration of function 'setutent'
> > watch.c:489: warning: implicit declaration of function 'getutent'
> > watch.c:489: warning: assignment makes pointer from integer without a cast
> > watch.c:512: warning: implicit declaration of function 'endutent'
> >...
> > watch.c:(.text+0x43): undefined reference to `setutent'
> > watch.c:(.text+0x48): undefined reference to `getutent'
> > watch.c:(.text+0x73): undefined reference to `getutent'
> > watch.c:(.text+0x83): undefined reference to `endutent'
> 
> This is obscure: the preprocessor appears to be both replacing and not
> replacing getutent and setutent.  I wonder if there are already
> definitions that are being stomped on?  Or should the code go
> through a different branch entirely?

Ohh, FreeBSD/NetBSD/Dragonfly seem to have it - OpenBSD surprisingly
not. So old code similar to http://bxr.su/OpenBSD/usr.bin/w/w.c is
probably required then.
 
Have fun,
jel.
-- 
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 52768



* Re: UTF-8 locales on BSDs do not support collation correctly
  2017-01-26 17:57   ` Peter Stephenson
  2017-01-26 19:30     ` Jens Elkner
@ 2017-01-27  9:41     ` Peter Stephenson
  2017-01-28 20:26       ` Bart Schaefer
  1 sibling, 1 reply; 13+ messages in thread
From: Peter Stephenson @ 2017-01-27  9:41 UTC
  To: zsh workers

On Thu, 26 Jan 2017 17:57:17 +0000
Peter Stephenson <p.stephenson@samsung.com> wrote:
> (I will supply a patch that checks for setutxent etc., now I've noticed
> there aren't any yet, but that doesn't appear to be the problem here,
> modulo obscurities.)

Can't see any reason not to commit it anyway.

Are there systems that use utmpx but getutent, not getutxent (which I
think was the implication of the old code)?  If so, we need the
following; otherwise, we need the #ifdef higher up.

pws

diff --git a/Src/watch.c b/Src/watch.c
index 7a6b930..6103ef1 100644
--- a/Src/watch.c
+++ b/Src/watch.c
@@ -87,9 +87,12 @@
 
 #if !defined(WATCH_STRUCT_UTMP) && defined(HAVE_STRUCT_UTMPX) && defined(REAL_UTMPX_FILE)
 # define WATCH_STRUCT_UTMP struct utmpx
-# define setutent setutxent
-# define getutent getutxent
-# define endutent endutxent 
+# if defined(HAVE_SETUTXENT) && defined(HAVE_GETUTXENT) && defined(HAVE_ENDUTXENT)
+#  define setutent setutxent
+#  define getutent getutxent
+#  define endutent endutxent
+# endif
+
 /*
  * In utmpx, the ut_name field is replaced by ut_user.
  * Howver, on some systems ut_name may already be defined this
diff --git a/configure.ac b/configure.ac
index dda52bc..c6ece67 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1324,7 +1324,8 @@ AC_CHECK_FUNCS(strftime strptime mktime timelocal \
 	       symlink getcwd \
 	       cygwin_conv_path \
 	       nanosleep \
-	       srand_deterministic)
+	       srand_deterministic \
+	       setutxent getutxent endutxent)
 AC_FUNC_STRCOLL
 
 AH_TEMPLATE([REALPATH_ACCEPTS_NULL],



* Re: UTF-8 locales on BSDs do not support collation correctly
  2017-01-27  9:41     ` Peter Stephenson
@ 2017-01-28 20:26       ` Bart Schaefer
  2017-01-28 20:42         ` Peter Stephenson
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Schaefer @ 2017-01-28 20:26 UTC
  To: zsh workers

On Fri, 27 Jan 2017, Peter Stephenson wrote:

> On Thu, 26 Jan 2017 17:57:17 +0000
> Peter Stephenson <p.stephenson@samsung.com> wrote:
> > (I will supply a patch that checks for setutxent etc., now I've noticed
> > there aren't any yet, but that doesn't appear to be the problem here,
> > modulo obscurities.)
>
> Can't see any reason not to commit it anyway.

I'm now getting

Src/watch.c: In function `readwtab':
Src/watch.c:497: warning: assignment from incompatible pointer type


That's:

    while ((tmp = getutent()) != NULL) {



* Re: UTF-8 locales on BSDs do not support collation correctly
  2017-01-28 20:26       ` Bart Schaefer
@ 2017-01-28 20:42         ` Peter Stephenson
  2017-01-28 23:27           ` Bart Schaefer
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Stephenson @ 2017-01-28 20:42 UTC
  To: zsh workers

On Sat, 28 Jan 2017 12:26:30 -0800 (PST)
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Fri, 27 Jan 2017, Peter Stephenson wrote:
> 
> > On Thu, 26 Jan 2017 17:57:17 +0000
> > Peter Stephenson <p.stephenson@samsung.com> wrote:
> > > (I will supply a patch that checks for setutxent etc., now I've noticed
> > > there aren't any yet, but that doesn't appear to be the problem here,
> > > modulo obscurities.)
> >
> > Can't see any reason not to commit it anyway.
> 
> I'm now getting
> 
> Src/watch.c: In function `readwtab':
> Src/watch.c:497: warning: assignment from incompatible pointer type
> 
> 
> That's:
> 
>     while ((tmp = getutent()) != NULL) {

I'm not.  This appears to be another system specific issue that needs to
be investigated by whoever is able.  It could be compiler specific, I
suppose, but that's a basic enough error it doesn't seem likely.
As all I've done is not use functions that aren't present,
this suggests a pre-existing problem.  I guess this is similar to
Mikael's problem?

Just to be clear (hope this doesn't seem rude, I just don't want anyone
making untoward assumptions):  I am not going to do anything at all to
investigate errors I'm not seeing myself.  The original patch is clearly
warranted (as the feature was not working at all on a number of
systems), so unless someone does investigate the problems they are seeing,
this will simply not get fixed.

If the original code *was* working, introducing an appropriate #ifdef is
an appropriate fix.  (But I am *still* not going to do the guesswork
myself as I don't know whether that's the right thing to do or not.)

As this is obviously a system-dependent minefield, it would really
help if we could identify owners for the problems on the various systems
involved.

pws



* Re: UTF-8 locales on BSDs do not support collation correctly
  2017-01-28 20:42         ` Peter Stephenson
@ 2017-01-28 23:27           ` Bart Schaefer
  2017-01-30 10:46             ` Peter Stephenson
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Schaefer @ 2017-01-28 23:27 UTC
  To: zsh workers

On Sat, 28 Jan 2017, Peter Stephenson wrote:

> On Sat, 28 Jan 2017 12:26:30 -0800 (PST)
> Bart Schaefer <schaefer@brasslantern.com> wrote:
> > I'm now getting
> >
> > Src/watch.c: In function `readwtab':
> > Src/watch.c:497: warning: assignment from incompatible pointer type
> >
> >
> > That's:
> >
> >     while ((tmp = getutent()) != NULL) {
>
> Just to be clear (hope this doesn't seem rude, I just don't want anyone
> making untoward assumptions):  I am not going to do anything at all to
> investigate errors I'm not seeing myself.

Fair enough.

In my case the problem seems to be that config.h.in was somehow never
regenerated with the HAVE_*UTXENT entries.  I'm not sure how that was
possible ... forcibly removing and re-creating it seems to have fixed
the issue, so, sorry for the false alarm (unless somebody has an idea
why config.h.in would have gone awry).



* Re: UTF-8 locales on BSDs do not support collation correctly
  2017-01-25 14:27 UTF-8 locales on BSDs do not support collation correctly Jun T.
  2017-01-25 18:02 ` Mikael Magnusson
@ 2017-01-30  3:59 ` Bart Schaefer
  2017-01-30  9:49   ` Peter Stephenson
  2017-01-31 10:09   ` Jun T.
  1 sibling, 2 replies; 13+ messages in thread
From: Bart Schaefer @ 2017-01-30  3:59 UTC
  To: zsh-workers

This thread sort of got hijacked by the utmpx issue, so we never got
to a resolution on the collation question that started it.  Is this
just an issue with the test or is there a real problem here?

Testing multibyte with locale en_US.UTF-8
--- /tmp/zsh.ztst.16197/ztst.out    2017-01-29 19:41:31.000000000 -0800
+++ /tmp/zsh.ztst.16197/ztst.tout    2017-01-29 19:41:31.000000000 -0800
@@ -1,2 +1,2 @@
-a ą b c ć d e ę f
-a ą b c ć d e ę f
+a b c d e f ą ć ę
+a b c d e f ą ć ę
Test ./D07multibyte.ztst failed: output differs from expected as shown above

On Wed, Jan 25, 2017 at 6:27 AM, Jun T. <takimoto-j@kba.biglobe.ne.jp> wrote:
> After the commit:
>
> commit 0e33ebc6514c8719513f3f20161274f6af2caffc
> Author: Mikael Magnusson <mikachu@gmail.com>
> Date:   Tue Jan 24 12:01:57 2017 +0100
>
>     posted: Make D07 recognize more spellings of pl_PL.UTF-8
>
> diff --git a/Test/D07multibyte.ztst b/Test/D07multibyte.ztst
> (snip)
> -  if [[ -n ${$(locale -a 2>/dev/null)[(R)pl_PL.utf8]} ]]; then
> +  if [[ -n ${$(locale -a 2>/dev/null)[(R)pl_PL.(utf8|UTF-8)]} ]]; then
>
>
> the test D07multibyte.ztst fails on macOS and FreeBSD.
> On these OSs (and maybe on other BSDs), the locale pl_PL.UTF-8 exists
> but it does not support collation correctly (it just uses ASCII collation).
>
> Is there any OS that uses UTF-8 (instead of utf8) in the locale name
> and supports collation correctly?



* Re: UTF-8 locales on BSDs do not support collation correctly
  2017-01-30  3:59 ` Bart Schaefer
@ 2017-01-30  9:49   ` Peter Stephenson
  2017-01-31 10:09   ` Jun T.
  1 sibling, 0 replies; 13+ messages in thread
From: Peter Stephenson @ 2017-01-30  9:49 UTC
  To: zsh-workers

On Sun, 29 Jan 2017 19:59:21 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> This thread sort of got hijacked by the utmpx issue, so we never got
> to a resolution on the collation question that started it.  Is this
> just an issue with the test or is there a real problem here?

I'm not an expert on this area but I'd be 99% sure it's something funny
about the collation sequence, not the shell: we've seen such things
before and they do appear to be gratuitously different (well, to be
fair, in roughly the sense that Polish is "gratuitously different" from
French Canadian).  We're going to have to be even more careful picking
them, I suppose.

pws



* Re: UTF-8 locales on BSDs do not support collation correctly
  2017-01-28 23:27           ` Bart Schaefer
@ 2017-01-30 10:46             ` Peter Stephenson
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Stephenson @ 2017-01-30 10:46 UTC
  To: zsh workers

On Sat, 28 Jan 2017 15:27:28 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> In my case the problem seems to be that config.h.in was somehow never
> regenerated with the HAVE_*UTXENT entries.  I'm not sure how that was
> possible ... forcibly removing and re-creating it seems to have fixed
> the issue, so, sorry for the false alarm (unless somebody has an idea
> why config.h.in would have gone awry).

Hmm... autoheader always needs to run after autoconf, right?  I don't
think the current dependencies ensure that...

I usually get Makefile tweaks wrong, however.

pws

diff --git a/Makefile.in b/Makefile.in
index cb74e94..ae18855 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -151,8 +151,7 @@ config.modules: $(sdir)/config.h.in config.status config.modules.sh
 	$(SHELL) ./config.modules.sh
 
 $(sdir)/config.h.in: $(sdir)/stamp-h.in
-$(sdir)/stamp-h.in: $(sdir)/configure.ac \
-		$(sdir)/aclocal.m4 $(sdir)/aczsh.m4
+$(sdir)/stamp-h.in: $(sdir)/configure
 	cd $(sdir) && autoheader
 	echo > $(sdir)/stamp-h.in
 



* Re: UTF-8 locales on BSDs do not support collation correctly
  2017-01-30  3:59 ` Bart Schaefer
  2017-01-30  9:49   ` Peter Stephenson
@ 2017-01-31 10:09   ` Jun T.
  2017-01-31 11:19     ` Peter Stephenson
  1 sibling, 1 reply; 13+ messages in thread
From: Jun T. @ 2017-01-31 10:09 UTC
  To: zsh-workers


On 2017/01/30, at 12:59, Bart Schaefer <schaefer@brasslantern.com> wrote:

> Is this
> just an issue with the test or is there a real problem here?

I believe there is no problem in zsh.
The problem is that macOS does not support UTF-8 collation at all.
For example, on macOS,
  /usr/share/locale/pl_PL.UTF-8/LC_COLLATE
is a symlink to
  /usr/share/locale/la_LN.US-ASCII/LC_COLLATE
and strcoll(3) always uses ASCII collation.

The commit 72e5fe7 modifies glob.c so that unmetafied file names are
(correctly) used for glob sorting. In order to test this on both Linux
and macOS, we need two characters (or strings) c1 and c2 which satisfy

  c1 < c2    and    metafy(c1) > metafy(c2)

in both UTF-8 and ASCII collations. It seems the following two
characters can be used:

       Unicode      UTF-8      metafied
---------------------------------------
c1  Ą   U+0104      c4 84      c4 83 a4
c2  Ġ   U+0120      c4 a0      c4 83 80

So how about the following patch? With this patch, the test fails
without the commit 72e5fe7 but succeeds with it, on both Linux and macOS.


diff --git a/Test/D07multibyte.ztst b/Test/D07multibyte.ztst
index 0ff65c7..e203153 100644
--- a/Test/D07multibyte.ztst
+++ b/Test/D07multibyte.ztst
@@ -551,22 +551,20 @@
   : $functions)
 0:Multibyte handling of functions parameter
 
-  if [[ -n ${$(locale -a 2>/dev/null)[(R)pl_PL.(utf8|UTF-8)]} ]]; then
-  (
-    export LC_ALL=pl_PL.UTF-8
-    local -a names=(a b c d e f $'\u0105' $'\u0107' $'\u0119')
-    print -o $names
-    mkdir -p plchars
-    cd plchars
-    touch $names
-    print ?
-  )
-  else
-    ZTST_skip="No Polish UTF-8 locale found, skipping sort test"
-  fi
-0:Sorting of metafied Polish characters
->a ą b c ć d e ę f
->a ą b c ć d e ę f
+# c1=U+0104 (Ą) and c2=U+0120 (Ġ) are chosen so that
+#   u1 = utf8(c1) = c4 84  <  u2 = utf8(c2) = c4 a0
+#   metafy(u1) = c4 83 a4  >  metafy(u2) = c4 83 80
+# in both UTF-8 and ASCII collations (the latter is used in macOS
+# and some versions of BSDs).
+  local -a names=( $'\u0104' $'\u0120' )
+  print -o $names
+  mkdir -p colltest
+  cd colltest
+  touch $names
+  print ?
+0:Sorting of metafied characters
+>Ą Ġ
+>Ą Ġ
 
   printf '%q%q\n' 你你
 0:printf %q and quotestring and general metafy / token madness





* Re: UTF-8 locales on BSDs do not support collation correctly
  2017-01-31 10:09   ` Jun T.
@ 2017-01-31 11:19     ` Peter Stephenson
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Stephenson @ 2017-01-31 11:19 UTC
  To: zsh-workers

On Tue, 31 Jan 2017 19:09:58 +0900
Jun T. <takimoto-j@kba.biglobe.ne.jp> wrote:
> So how about the following patch? With this patch, the test fails
> without the commit 72e5fe7 but succeeds with it, on both Linux and macOS.

Thanks, that does sound about as good as we're going to get given the
limitations.  It's not worth a huge effort testing the system library,
just zsh's hooks into it, which this does.

pws



end of thread, other threads:[~2017-01-31 11:19 UTC | newest]

Thread overview: 13+ messages
2017-01-25 14:27 UTF-8 locales on BSDs do not support collation correctly Jun T.
2017-01-25 18:02 ` Mikael Magnusson
2017-01-26 17:57   ` Peter Stephenson
2017-01-26 19:30     ` Jens Elkner
2017-01-27  9:41     ` Peter Stephenson
2017-01-28 20:26       ` Bart Schaefer
2017-01-28 20:42         ` Peter Stephenson
2017-01-28 23:27           ` Bart Schaefer
2017-01-30 10:46             ` Peter Stephenson
2017-01-30  3:59 ` Bart Schaefer
2017-01-30  9:49   ` Peter Stephenson
2017-01-31 10:09   ` Jun T.
2017-01-31 11:19     ` Peter Stephenson

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/
