mailing list of musl libc
From: Laine Gholson <laine.gholson@gmail.com>
To: musl@lists.openwall.com
Subject: Re: [PATCH] bind_textdomain_codeset: don't return failure unless encoding isn't UTF-8
Date: Fri, 30 Dec 2016 16:13:44 -0600
Message-ID: <3446e663-1252-bb02-4248-2132cfc4d086@gmail.com>
In-Reply-To: <20161230031450.GQ1555@brightrain.aerifal.cx>

Option 1 is the only sane choice. I don't see how anything could break unless a program explicitly checks for the GNU behavior and fails when it doesn't get it, in which case it is the program's fault anyway.
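
For concreteness, here is a minimal sketch of what option 1 (as described in the quoted message below) might look like. The function name sketch_bind_textdomain_codeset is hypothetical, chosen so it does not clash with the real libintl symbol; this is an illustration under that assumption, not the code that was or will be committed to musl.

```c
#include <errno.h>
#include <stddef.h>
#include <strings.h>	/* strcasecmp */

/* Hypothetical name; a sketch of option 1, not musl's actual code. */
char *sketch_bind_textdomain_codeset(const char *domainname, const char *codeset)
{
	/* Every textdomain is treated as always bound to UTF-8,
	 * so the domain name does not influence the result. */
	(void)domainname;

	/* An explicit codeset other than "UTF-8"/"UTF8" is an error. */
	if (codeset && strcasecmp(codeset, "UTF-8") && strcasecmp(codeset, "UTF8")) {
		errno = EINVAL;
		return NULL;
	}

	/* Valid or null-pointer argument: report the fixed binding. */
	return (char *)"UTF-8";
}
```

With this shape, a caller that treats a null return as failure (as VLC reportedly does) sees "UTF-8" both when it queries with a null codeset and when it requests UTF-8 explicitly.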

On 12/29/16 21:14, Rich Felker wrote:
> On Fri, Dec 16, 2016 at 10:59:54PM -0500, Rich Felker wrote:
>> On Sat, Dec 03, 2016 at 09:04:42PM -0600, Laine Gholson wrote:
>>> returning null broke a vlc media player built with gettext support
>>
>>> From 2f79aa294db5d9230ad71298e3de4b5561b441be Mon Sep 17 00:00:00 2001
>>> From: Laine Gholson <laine.gholson@gmail.com>
>>> Date: Wed, 9 Nov 2016 20:19:00 -0600
>>> Subject: [PATCH] bind_textdomain_codeset: don't return failure unless encoding isn't UTF-8
>>>
>>> VLC isn't happy when bind_textdomain_codeset returns NULL
>>> ---
>>>  src/locale/bind_textdomain_codeset.c | 4 +++-
>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/locale/bind_textdomain_codeset.c b/src/locale/bind_textdomain_codeset.c
>>> index 5ebfd5e..e5f3f52 100644
>>> --- a/src/locale/bind_textdomain_codeset.c
>>> +++ b/src/locale/bind_textdomain_codeset.c
>>> @@ -5,7 +5,9 @@
>>>  
>>>  char *bind_textdomain_codeset(const char *domainname, const char *codeset)
>>>  {
>>> -	if (codeset && strcasecmp(codeset, "UTF-8"))
>>> +	if (codeset && ((strcasecmp(codeset, "UTF-8") == 0) || (strcasecmp(codeset, "UTF8") == 0))) {
>>> +		return "UTF-8";
>>> +	} else if (codeset)
>>>  		errno = EINVAL;
>>>  	return NULL;
>>>  }
>>> --
>>> 2.10.2
>>
>> I think this needs some more thought. The documentation of the API is
>> that a null pointer argument/result means "the locale's character
>> encoding", and that the default is null; presumably even when the
>> locale's codeset is "foo", null (default) and "foo" are still
>> different states.
>>
>> I don't actually like that, and don't think we should copy it --
>> especially since, now that we also have a C locale with "ASCII" as the
>> codeset, we _can't_ provide a codeset matching the locale in all cases
>> -- but I also don't think it's right for the return value (null or
>> "UTF-8") to depend on the argument rather than on the "previous state"
>> like it's documented to.
>>
>> There seem to be two possible reasonable behaviors:
>>
>> 1. Diverge from the GNU behavior and treat textdomains as always-bound
>>    to "UTF-8", regardless of whether bind_textdomain_codeset has been
>>    called. The function would then return a null pointer with EINVAL
>>    set for strings other than "UTF-8"/"UTF8", and would return "UTF-8"
>>    for a valid or null-pointer argument.
>>
>> 2. Keep a 1-bit state for each textdomain reflecting whether it's
>>    nominally in "default" mode or "UTF-8" mode. Either way the
>>    original UTF-8 string would be returned; the only point of the
>>    state would be providing a return value for bind_textdomain_codeset
>>    that reflects how it was previously called.
>>
>> Being that 2 is gratuitous complexity to do something stupid and
>> meaningless, I'd lean towards 1, but I don't want to break anything
>> that works. Does this seem safe to do?
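
For comparison, a rough sketch of what option 2 would involve. Everything here is hypothetical: the names (sketch_domain, sketch_lookup, sketch_bind_textdomain_codeset2), the fixed-size table, and the ENOMEM error when the table is full are assumptions made for the sketch. It also assumes the GNU-documented reading that a null codeset argument queries the current state and yields a null pointer if nothing was selected yet.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strcasecmp */

#define SKETCH_MAX_DOMAINS 8	/* arbitrary limit for the sketch */

struct sketch_domain {
	char name[64];
	unsigned utf8_bound : 1;	/* the 1-bit per-domain state */
};

static struct sketch_domain sketch_domains[SKETCH_MAX_DOMAINS];
static size_t sketch_ndomains;

/* Find a domain by name; optionally create it in "default" mode. */
static struct sketch_domain *sketch_lookup(const char *name, int create)
{
	for (size_t i = 0; i < sketch_ndomains; i++)
		if (!strcmp(sketch_domains[i].name, name))
			return &sketch_domains[i];
	if (!create || sketch_ndomains == SKETCH_MAX_DOMAINS
	    || strlen(name) >= sizeof sketch_domains[0].name)
		return NULL;
	strcpy(sketch_domains[sketch_ndomains].name, name);
	return &sketch_domains[sketch_ndomains++];
}

char *sketch_bind_textdomain_codeset2(const char *domainname, const char *codeset)
{
	struct sketch_domain *d;

	if (!domainname)
		return NULL;

	/* Null codeset: only report how the domain was previously bound. */
	if (!codeset) {
		d = sketch_lookup(domainname, 0);
		return d && d->utf8_bound ? (char *)"UTF-8" : NULL;
	}

	if (strcasecmp(codeset, "UTF-8") && strcasecmp(codeset, "UTF8")) {
		errno = EINVAL;
		return NULL;
	}

	d = sketch_lookup(domainname, 1);
	if (!d) {
		errno = ENOMEM;	/* sketch-only choice for a full table */
		return NULL;
	}
	d->utf8_bound = 1;
	return (char *)"UTF-8";
}
```

The extra table and lookup exist only to remember whether the function was called before; the translated strings would be UTF-8 either way, which is the complexity objection raised above.
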
>
> Ping. Anyone else have thoughts on this?
>
> Rich
>



Thread overview: 5+ messages
2016-12-04  3:04 Laine Gholson
2016-12-17  3:59 ` Rich Felker
2016-12-30  3:14   ` Rich Felker
2016-12-30 22:13     ` Laine Gholson [this message]
2016-12-30 22:22       ` Rich Felker
