From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jason at zx2c4.com (Jason A. Donenfeld)
Date: Fri, 17 Mar 2017 19:07:02 +0100
Subject: [PATCH] filter: set environment variable PYTHONIOENCODING to utf-8
In-Reply-To: <20170312175153.GM2102@john.keeping.me.uk>
References: <20170223154823.18206-1-roy@marples.name>
 <20170304123521.GC2102@john.keeping.me.uk>
 <20170309001002.GF2102@john.keeping.me.uk>
 <20170312175153.GM2102@john.keeping.me.uk>
Message-ID:

On Sun, Mar 12, 2017 at 6:51 PM, John Keeping wrote:
> While I'm inclined to agree with this, in this particular case we
> explicitly encode pages as UTF-8 so there is an argument that we should
> be telling child processes that UTF-8 is the correct encoding.

That's a compelling argument, actually.

>
> Maybe we should be looking to change LANG instead, but I'm not sure how
> reliably we can do that.

I'm more on board with that. Does changing LANG implicitly have the same
effect on Python as setting PYTHONIOENCODING?

> Is it safe to do something like:
>
>         const char *lang = getenv("LANG");
>         struct strbuf sb = STRBUF_INIT;
>
>         if (!lang)
>                 lang = "C";
>         strbuf_addf(&sb, "%.*s.UTF-8",
>                         (int) (strchrnul(lang, '.') - lang), lang);
>         setenv("LANG", sb.buf);

That's probably not too bad, though I wonder if we could get away with
just explicitly setting a more generic UTF-8 locale instead of trying to
read the user's language preferences.
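
For reference, here is a minimal standalone sketch of the LANG rewrite
discussed above, using only libc rather than strbuf. The helper names
utf8_lang() and set_utf8_lang() are hypothetical and not part of cgit.
Two assumptions differ from the quoted snippet: the language part is cut
at the first '.' or '@', so a modifier such as "@euro" is dropped rather
than ending up in front of the codeset, and an empty LANG is treated like
an unset one. Note that POSIX setenv() takes a third "overwrite" argument.

```c
#define _POSIX_C_SOURCE 200112L	/* for setenv() under strict -std= modes */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: write "<language>.UTF-8" into out, where
 * <language> is lang up to the first '.' or '@'.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int utf8_lang(const char *lang, char *out, size_t outlen)
{
	size_t n;
	int len;

	if (!lang || !*lang)
		lang = "C";		/* unset or empty LANG: fall back to "C" */
	n = strcspn(lang, ".@");	/* length before any codeset or modifier */
	len = snprintf(out, outlen, "%.*s.UTF-8", (int)n, lang);
	return (len < 0 || (size_t)len >= outlen) ? -1 : 0;
}

/*
 * Hypothetical helper: replace LANG in the current environment with its
 * UTF-8 variant. Returns 0 on success, -1 on failure.
 */
int set_utf8_lang(void)
{
	char buf[64];

	if (utf8_lang(getenv("LANG"), buf, sizeof(buf)) != 0)
		return -1;
	/* Third argument non-zero: overwrite an existing LANG. */
	return setenv("LANG", buf, 1);
}
```

With LANG=en_GB.ISO-8859-1 this leaves LANG=en_GB.UTF-8, and with LANG
unset it produces C.UTF-8, which may not exist as a locale on every
system. Setting LANG alone also does not change the character encoding
when LC_ALL or LC_CTYPE is set in the environment, since both take
precedence over LANG.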