mailing list of musl libc
 help / color / mirror / code / Atom feed
* getenv_r
@ 2015-03-04 23:09 William Ahern
  2015-03-05  0:41 ` getenv_r Szabolcs Nagy
  0 siblings, 1 reply; 5+ messages in thread
From: William Ahern @ 2015-03-04 23:09 UTC (permalink / raw)
  To: musl

I noticed that getenv is not thread-safe. Would there be any interest in
accepting a patch to implement getenv_r (a NetBSD function) and internal
locking? Other than leaving getenv, setenv, and putenv unsafe in threaded
environments, the only other alternative is the ugliness that glibc,
Solaris, and some others implement, which is basically to leak environ
memory.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: getenv_r
  2015-03-04 23:09 getenv_r William Ahern
@ 2015-03-05  0:41 ` Szabolcs Nagy
  2015-03-05  1:34   ` getenv_r William Ahern
  0 siblings, 1 reply; 5+ messages in thread
From: Szabolcs Nagy @ 2015-03-05  0:41 UTC (permalink / raw)
  To: musl

* William Ahern <william@25thandClement.com> [2015-03-04 15:09:20 -0800]:
> I noticed that getenv is not thread-safe. Would there be any interest in
> accepting a patch to implement getenv_r (a NetBSD function) and internal
> locking? Other than leaving getenv, setenv, and putenv unsafe in threaded
> environments, the only other alternative is the ugliness that glibc,
> Solaris, and some others implement, which is basically to leak environ
> memory.

getenv is thread-safe if the environ is not modified

in a multi-threaded application environ modification
is unsafe and problematic even if you do the locks:
different threads may want different value for an env
var and when you read an env var you cannot know if
it is up to date when you want to use it.

does netbsd use getenv_r somewhere to solve some issue?


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: getenv_r
  2015-03-05  0:41 ` getenv_r Szabolcs Nagy
@ 2015-03-05  1:34   ` William Ahern
  2015-03-05  2:44     ` getenv_r Rich Felker
  2015-03-05  8:59     ` getenv_r Szabolcs Nagy
  0 siblings, 2 replies; 5+ messages in thread
From: William Ahern @ 2015-03-05  1:34 UTC (permalink / raw)
  To: musl

On Thu, Mar 05, 2015 at 01:41:33AM +0100, Szabolcs Nagy wrote:
> * William Ahern <william@25thandClement.com> [2015-03-04 15:09:20 -0800]:
> > I noticed that getenv is not thread-safe. Would there be any interest in
> > accepting a patch to implement getenv_r (a NetBSD function) and internal
> > locking? Other than leaving getenv, setenv, and putenv unsafe in threaded
> > environments, the only other alternative is the ugliness that glibc,
> > Solaris, and some others implement, which is basically to leak environ
> > memory.
> 
> getenv is thread-safe if the environ is not modified
> 
> in a multi-threaded application environ modification
> is unsafe and problematic even if you do the locks:
> different threads may want different value for an env
> var and when you read an env var you cannot know if
> it is up to date when you want to use it.

That criticism applies to almost any software, no matter the interface, with
or without locks. Locks don't solve all TOCTTOU bugs, either.

Anyhow, any use of setenv is unsafe in a multi-threaded environment, even
for different variables. In musl the __environ array is invalidated by a
setenv call that needs to grow it. The system(3) implementation in musl
passes __environ to posix_spawn.

> does netbsd use getenv_r somewhere to solve some issue?

There are a few uses in NetBSD, such as in librump, but AFAICT getenv_r
isn't widely used. It just seems a much nicer interface than the contortions
other libc libraries go through.

One obvious use for a thread-safe setenv and getenv is when trying to
generate a time_t timestamp from a UTC struct tm. timegm is not standard.
The glibc manual page suggests to instead set the TZ environment variable to
UTC, call tzset, call mktime, restore TZ, and call tzset again.

You can't use that technique portably from multiple threads unless all code
that might call setenv for any reason synchronizes on the same lock you use
to implement timegm (including any locale code that needs to read LC_
values). Whereas the only code _setting_ TZ is likely to be the timegm code.

But really the issue is most intractable in my position, where I'm trying to
implement a bindings module. In Lua especially, where the Lua VM has no
global state, it's simple to run Lua scripts in a multi-threaded environment
and have them mostly just work without issue. But those scripts are
exceedingly unlikely to have been concerned with thread-safety, and with
setenv being unsafe to use. And they can't be expected to use
synchronization primatives because module A might have no relationship to
module B.

I can't paper-over all thread issues, but I still think it worthwhile to
make it as safe as possible. No amount of finger pointing would ever fix the
problem.

Granted, at the end of the day this may be none of musl's concern. Adding
getenv_r certainly won't solve all issues. That would require other
measures. But it does seem like a worthwhile QoI issue to address. I mean,
presumably the system implementation uses posix_spawn rather out of concern
for QoI. musl could have punted and simply disclaimed any support for
threaded environments.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: getenv_r
  2015-03-05  1:34   ` getenv_r William Ahern
@ 2015-03-05  2:44     ` Rich Felker
  2015-03-05  8:59     ` getenv_r Szabolcs Nagy
  1 sibling, 0 replies; 5+ messages in thread
From: Rich Felker @ 2015-03-05  2:44 UTC (permalink / raw)
  To: musl

On Wed, Mar 04, 2015 at 05:34:57PM -0800, William Ahern wrote:
> On Thu, Mar 05, 2015 at 01:41:33AM +0100, Szabolcs Nagy wrote:
> > * William Ahern <william@25thandClement.com> [2015-03-04 15:09:20 -0800]:
> > > I noticed that getenv is not thread-safe. Would there be any interest in
> > > accepting a patch to implement getenv_r (a NetBSD function) and internal
> > > locking? Other than leaving getenv, setenv, and putenv unsafe in threaded
> > > environments, the only other alternative is the ugliness that glibc,
> > > Solaris, and some others implement, which is basically to leak environ
> > > memory.
> > 
> > getenv is thread-safe if the environ is not modified
> > 
> > in a multi-threaded application environ modification
> > is unsafe and problematic even if you do the locks:
> > different threads may want different value for an env
> > var and when you read an env var you cannot know if
> > it is up to date when you want to use it.
> 
> That criticism applies to almost any software, no matter the interface, with
> or without locks. Locks don't solve all TOCTTOU bugs, either.
> 
> Anyhow, any use of setenv is unsafe in a multi-threaded environment, even
> for different variables. In musl the __environ array is invalidated by a
> setenv call that needs to grow it. The system(3) implementation in musl
> passes __environ to posix_spawn.

Indeed. Multi-threaded programs cannot modify the environment.

> > does netbsd use getenv_r somewhere to solve some issue?
> 
> There are a few uses in NetBSD, such as in librump, but AFAICT getenv_r
> isn't widely used. It just seems a much nicer interface than the contortions
> other libc libraries go through.
> 
> One obvious use for a thread-safe setenv and getenv is when trying to
> generate a time_t timestamp from a UTC struct tm. timegm is not standard.
> The glibc manual page suggests to instead set the TZ environment variable to
> UTC, call tzset, call mktime, restore TZ, and call tzset again.

This is very bad advice. The idiom is completely unsafe. As you noted,
glibc and musl both provide timegm() which is analogous to mktime but
works in UTC. You're right that it's non-standard, but neither is
getenv_r, and timegm actually is present on existing systems whereas
getenv_r is not.

Alternatively, you can just use the formula in POSIX XBD 4.15 Seconds
Since the Epoch:

tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 +
    (tm_year-70)*31536000 + ((tm_year-69)/4)*86400 -
    ((tm_year-1)/100)*86400 + ((tm_year+299)/400)*86400

> You can't use that technique portably from multiple threads unless all code
> that might call setenv for any reason synchronizes on the same lock you use
> to implement timegm (including any locale code that needs to read LC_
> values). Whereas the only code _setting_ TZ is likely to be the timegm code.
> 
> But really the issue is most intractable in my position, where I'm trying to
> implement a bindings module. In Lua especially, where the Lua VM has no
> global state, it's simple to run Lua scripts in a multi-threaded environment
> and have them mostly just work without issue. But those scripts are
> exceedingly unlikely to have been concerned with thread-safety, and with
> setenv being unsafe to use. And they can't be expected to use
> synchronization primatives because module A might have no relationship to
> module B.

Why don't you just bind the Lua setenv function to work with your own
environment strings list that has nothing to do with the actual
process's? This is the correct way to write a shell too -- it should
never touch its own environment, but instead maintain shell variables
itself and export them as needed to external programs it invokes.

> I can't paper-over all thread issues, but I still think it worthwhile to
> make it as safe as possible. No amount of finger pointing would ever fix the
> problem.
> 
> Granted, at the end of the day this may be none of musl's concern. Adding
> getenv_r certainly won't solve all issues. That would require other
> measures. But it does seem like a worthwhile QoI issue to address. I mean,
> presumably the system implementation uses posix_spawn rather out of concern
> for QoI. musl could have punted and simply disclaimed any support for
> threaded environments.

I'm not sure I follow. If you're talking about system(), that function
itself is (specified to be) non-thread-safe and should not be used.
Use of posix_spawn in other places is a QoI consideration, but in
system it was more an exercise and an example of how code like this
should be written. The old version used vfork and was buggy, so it
needed to be replaced anyway, and using posix_spawn seemed natural.
Also, I suppose eventually some people might care about NOMMU systems.

In any case, I'm not sure what QoI issue you think should be addressed
here. The fact that modifying the environment is not thread-safe does
not seem to be fixable by something like getenv_r. And I'm mostly
convinced that the only right solution is to treat the environment as
immutable.

Rich


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: getenv_r
  2015-03-05  1:34   ` getenv_r William Ahern
  2015-03-05  2:44     ` getenv_r Rich Felker
@ 2015-03-05  8:59     ` Szabolcs Nagy
  1 sibling, 0 replies; 5+ messages in thread
From: Szabolcs Nagy @ 2015-03-05  8:59 UTC (permalink / raw)
  To: musl

* William Ahern <william@25thandClement.com> [2015-03-04 17:34:57 -0800]:
> On Thu, Mar 05, 2015 at 01:41:33AM +0100, Szabolcs Nagy wrote:
> > in a multi-threaded application environ modification
> > is unsafe and problematic even if you do the locks:
> > different threads may want different value for an env
> > var and when you read an env var you cannot know if
> > it is up to date when you want to use it.

> 
> One obvious use for a thread-safe setenv and getenv is when trying to
> generate a time_t timestamp from a UTC struct tm. timegm is not standard.
> The glibc manual page suggests to instead set the TZ environment variable to
> UTC, call tzset, call mktime, restore TZ, and call tzset again.
> 

this is exactly the use-case i warned against:
in a multi-threaded application where libraries do this
TZ will be clobbered concurrently

> But really the issue is most intractable in my position, where I'm trying to
> implement a bindings module. In Lua especially, where the Lua VM has no
> global state, it's simple to run Lua scripts in a multi-threaded environment
> and have them mostly just work without issue. But those scripts are
> exceedingly unlikely to have been concerned with thread-safety, and with
> setenv being unsafe to use. And they can't be expected to use
> synchronization primatives because module A might have no relationship to
> module B.

as i said multi-threaded code should not call setenv

that fixes all issues

the locking fixes no issues (at least you could not
demonstrate a valid use-case yet)


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-03-05  8:59 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-04 23:09 getenv_r William Ahern
2015-03-05  0:41 ` getenv_r Szabolcs Nagy
2015-03-05  1:34   ` getenv_r William Ahern
2015-03-05  2:44     ` getenv_r Rich Felker
2015-03-05  8:59     ` getenv_r Szabolcs Nagy

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).