mailing list of musl libc
* Nonstandard functions with callbacks
@ 2014-11-11  4:50 Rich Felker
  2014-11-11  7:22 ` Timo Teras
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2014-11-11  4:50 UTC (permalink / raw)
  To: musl

The topic of fopencookie has just come up on #musl again in regards to
Asterisk's dependence on it (or the similar funopen function), and in
response I'd like to offer a few ideas on this kind of function in
general.

Supporting nonstandard functions is always a potential pitfall. Unlike
functions with a rigorous or semi-rigorous specification in one or
more standards documents, they inevitably have all sorts of
underspecified or unspecified corner cases that some software ends up
depending on. And when they come from a single origin (e.g. glibc)
rather than various historical systems that all had their own quirks,
it's arguably reasonable for applications to expect the exact behavior
of the original (e.g. glibc) implementation.

In this regard, functions that take a caller-provided callback
function are just about the worst possible. The documentation (e.g.
man pages or glibc manual) for the original version rarely specifies
constraints on what the callback can do, such as:

- Does it have to return?
- Can it call longjmp?
- What happens if it causes pthread cancellation to be acted upon?
- Can it change the floating point environment?
- Does it run in the same thread as the caller?
- Does it need to take special precautions to avoid deadlock?
- What reentrancy requirements does it have?
- Are there specific standard library functions it can't call?
- Can it unwind or backtrace out of the callback context?
- Etc.

One feature musl intentionally does not yet support is "IFUNC"
resolvers in the dynamic linker. These are a feature by which a
program can arrange for a symbol to resolve to different versions of a
function at runtime depending on cpu capabilities or similar. What's
blocking IFUNC support in musl is a specification for the constraints
on the resolver function. Since it runs in a context that happens
prior to execution of global ctors for the containing module, in a
context where relocations have not yet completed, and with locks held
in the dynamic linker, there are A LOT of implementation-imposed
constraints on what such a function can do. And sweeping those under
the rug and just saying "you can do whatever seems to work" is not a
reasonable approach for a high-quality implementation; from my
perspective, it's better not to have the feature at all than to have a
version where internal changes might break something that seems
legitimate.

The fopencookie situation is fairly similar. The callbacks to provide
a custom FILE type would run in a context with the relevant FILE
locked. It's likely that the underlying reads or writes performed on
the 'cookie' would be somewhat different under musl than under glibc
due to musl's readv/writev type model for stdio buffering, and these
differences could potentially break applications' assumptions about
how the underlying operations would be performed. Such assumptions in
turn may be valid if glibc is considered 'authoritative' for the
fopencookie function. There are a number of other considerations too
with regard to the FILE locking. In a naive fopencookie implementation
for musl, the callbacks would run with the FILE lock potentially held
in musl's light-locking mode rather than with a true recursive lock,
meaning that strange things could happen if the callback tries to
perform any operation on its own FILE. Avoiding that seems like it
would require some complex dance to 'upgrade' the lock type, which in
turn might impose long-term costs on maintaining the FILE locking
system.
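
For readers who have not used the interface, here is a minimal sketch
of how an application hands callbacks to fopencookie, following the
prototype documented in the Linux man page. It assumes a libc that
provides fopencookie; the membuf type and the membuf_write and
membuf_open names are hypothetical, invented for illustration. How
often the write callback runs, and with what sizes, depends on the
stdio buffering of the implementation, which is the kind of detail
discussed above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical cookie: a fixed-size in-memory sink. */
struct membuf {
    char data[256];
    size_t len;
};

/* Write callback: append to the cookie. Returns the number of bytes
 * consumed; 0 signals failure (here: the buffer is full). */
static ssize_t membuf_write(void *cookie, const char *buf, size_t size)
{
    struct membuf *m = cookie;
    size_t room = sizeof m->data - m->len;

    if (size > room)
        size = room;            /* short write when nearly full */
    memcpy(m->data + m->len, buf, size);
    m->len += size;
    return (ssize_t)size;
}

/* Open a write-only stream backed by m. Only the write callback is
 * supplied; the read, seek and close callbacks are left NULL. */
static FILE *membuf_open(struct membuf *m)
{
    cookie_io_functions_t io = { .write = membuf_write };

    assert(m != NULL);
    m->len = 0;
    return fopencookie(m, "w", io);
}
```

The data written with fputs or fprintf on the returned stream reaches
membuf_write only when stdio decides to flush, for example at fclose.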

My feeling is that "involves callbacks" should be an indication for
exclusion of nonstandard functions. In terms of what I've written
above, I think this follows from the existing principles of exclusion
based on cost of implementation complexity and high risk of
compatibility issues with applications.

Rich



* Re: Nonstandard functions with callbacks
  2014-11-11  4:50 Nonstandard functions with callbacks Rich Felker
@ 2014-11-11  7:22 ` Timo Teras
  2014-11-11 13:46   ` Rich Felker
  2014-11-12 14:27   ` Justin Cormack
  0 siblings, 2 replies; 7+ messages in thread
From: Timo Teras @ 2014-11-11  7:22 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

On Mon, 10 Nov 2014 23:50:46 -0500
Rich Felker <dalias@libc.org> wrote:

> The topic of fopencookie has just come up on #musl again in regards to
> Asterisk's dependence on it (or the similar funopen function), and in
> response I'd like to offer a few ideas on this kind of function in
> general.
> 
> Supporting nonstandard functions is always a potential pitfall. Unlike
> functions with a rigorous or semi-rigorous specification in one or
> more standards documents, they inevitably have all sorts of
> underspecified or unspecified corner cases that some software ends up
> depending on. And when they come from a single origin (e.g. glibc)
> rather than various historical systems that all had their own quirks,
> it's arguably reasonable for applications to expect the exact behavior
> of the original (e.g. glibc) implementation.

Understood. Especially for musl, which aims to be mostly ABI compatible.

Apparently both fopencookie() and funopen() have had ABI issues. Worst
of all, fopencookie() changed the layout/contents of the callback
pointer struct.

funopen() seems to have some versions with fpos_t for seek; and others
use off_t.

NetBSD current seems to have funopen2() with alternative signatures for
the callbacks using ssize_t/size_t for read/write instead of int.

So yeah, it's a royal mess.

> In this regard, functions that take a caller-provided callback
> function are just about the worst possible. The documentation (e.g.
> man pages or glibc manual) for the original version rarely specifies
> constraints on what the callback can do, such as:
> 
> - Does it have to return?
> - Can it call longjmp?
> - What happens if it causes pthread cancellation to be acted upon?
> - Can it change the floating point environment?
> - Does it run in the same thread as the caller?
> - Does it need to take special precautions to avoid deadlock?
> - What reentrancy requirements does it have?
> - Are there specific standard library functions it can't call?
> - Can it unwind or backtrace out of the callback context?
> - Etc.
>[snip]
> The fopencookie situation is fairly similar. The callbacks to provide
> a custom FILE type would run in a context with the relevant FILE
> locked. It's likely that the underlying reads or writes performed on
> the 'cookie' would be somewhat different under musl than under glibc
> due to musl's readv/writev type model for stdio buffering, and these
> differences could potentially break applications' assumptions about
> how the underlying operations would be performed. Such assumptions in
> turn may be valid if glibc is considered 'authoritative' for the
> fopencookie function. There are a number of other considerations too
> with regard to the FILE locking. In a naive fopencookie implementation
> for musl, the callbacks would run with the FILE lock potentially held
> in musl's light-locking mode rather than with a true recursive lock,
> meaning that strange things could happen if the callback tries to
> perform any operation on its own FILE. Avoiding that seems like it
> would require some complex dance to 'upgrade' the lock type, which in
> turn might impose long-term costs on maintaining the FILE locking
> system.

Generally applications seem to be written to work with either funopen()
or fopencookie(), so they do not make too many assumptions. But yes,
identical emulation seems hard, if not infeasible, to accomplish. Given
the vagueness of the situation and the above-mentioned hassle, I
perfectly understand why it's not suitable for musl upstream, at least
without additional specification.

Though the readv/writev architecture is no excuse not to implement
them. It can always be split into individual read()/write() callbacks.
But yes, the sizes (and alignment of sizes) may vary. And yes, it might
mean the performance will be suboptimal.
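
A rough sketch of that splitting, not taken from musl: each segment of
a vectored write is passed to the callback in turn, with short writes
retried. The write_cb_t type and the writev_via_callback, sink and
sink_write names are hypothetical, and the callback is assumed to
follow the fopencookie convention of returning the number of bytes
consumed, with 0 meaning failure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical callback type, modeled on the fopencookie write callback. */
typedef ssize_t (*write_cb_t)(void *cookie, const char *buf, size_t size);

/* Feed each iovec segment to the callback in order, retrying after
 * short writes. Returns the total number of bytes consumed; stops
 * early if the callback reports failure. */
static size_t writev_via_callback(write_cb_t cb, void *cookie,
                                  const struct iovec *iov, int iovcnt)
{
    size_t total = 0;

    assert(cb != NULL);
    for (int i = 0; i < iovcnt; i++) {
        const char *p = iov[i].iov_base;
        size_t left = iov[i].iov_len;

        while (left > 0) {
            ssize_t n = cb(cookie, p, left);
            if (n <= 0)
                return total;
            p += n;
            left -= (size_t)n;
            total += (size_t)n;
        }
    }
    return total;
}

/* Demo sink for illustration: accepts at most 3 bytes per call, so
 * the caller has to cope with short writes. */
struct sink {
    char data[64];
    size_t len;
    int calls;
};

static ssize_t sink_write(void *cookie, const char *buf, size_t size)
{
    struct sink *s = cookie;

    if (size > 3)
        size = 3;
    if (size > sizeof s->data - s->len)
        return 0;
    memcpy(s->data + s->len, buf, size);
    s->len += size;
    s->calls++;
    return (ssize_t)size;
}
```

The callback sees more, smaller calls than a single writev would make,
which is where the size and performance differences come from.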

> My feeling is that "involves callbacks" should be an indication for
> exclusion of nonstandard functions. In terms of what I've written
> above, I think this follows from the existing principles of exclusion
> based on cost of implementation complexity and high risk of
> compatibility issues with applications.

As a distribution, we want things to work. And we can limit support to
certain applications that use the specific API. And as we compile
everything against musl, we can also make additional ABI
considerations. So I think we'll still consider doing fopencookie() or
funopen() as a distro-maintained patch. These are additional constraints
we can live with, compared to musl upstream, which is committed to
maintaining ABI between releases.

Though, we'll probably also file portability bugs against those apps
that rely on this.

/Timo



* Re: Nonstandard functions with callbacks
  2014-11-11  7:22 ` Timo Teras
@ 2014-11-11 13:46   ` Rich Felker
  2014-11-11 13:56     ` Timo Teras
  2014-11-11 13:56     ` Timo Teras
  2014-11-12 14:27   ` Justin Cormack
  1 sibling, 2 replies; 7+ messages in thread
From: Rich Felker @ 2014-11-11 13:46 UTC (permalink / raw)
  To: musl

On Tue, Nov 11, 2014 at 09:22:53AM +0200, Timo Teras wrote:
> On Mon, 10 Nov 2014 23:50:46 -0500
> Rich Felker <dalias@libc.org> wrote:
> 
> > The topic of fopencookie has just come up on #musl again in regards to
> > Asterisk's dependence on it (or the similar funopen function), and in
> > response I'd like to offer a few ideas on this kind of function in
> > general.
> > 
> > Supporting nonstandard functions is always a potential pitfall. Unlike
> > functions with a rigorous or semi-rigorous specification in one or
> > more standards documents, they inevitably have all sorts of
> > underspecified or unspecified corner cases that some software ends up
> > depending on. And when they come from a single origin (e.g. glibc)
> > rather than various historical systems that all had their own quirks,
> > it's arguably reasonable for applications to expect the exact behavior
> > of the original (e.g. glibc) implementation.
> 
> Understood. Especially for musl, which aims to be mostly ABI compatible.

IMO ABI is only a small part of the issue; the issue of source-level
compatibility when an apparently API-compatible function is provided
matters more.

> > My feeling is that "involves callbacks" should be an indication for
> > exclusion of nonstandard functions. In terms of what I've written
> > above, I think this follows from the existing principles of exclusion
> > based on cost of implementation complexity and high risk of
> > compatibility issues with applications.
> 
> As a distribution, we want things to work. And we can limit support to
> certain applications that use the specific API. And as we compile
> everything against musl, we can also make additional ABI
> considerations. So I think we'll still consider doing fopencookie() or
> funopen() as a distro-maintained patch. These are additional constraints
> we can live with, compared to musl upstream, which is committed to
> maintaining ABI between releases.
> 
> Though, we'll probably also file portability bugs against those apps
> that rely on this.

While of course I can't make the decision for you, I'd really
encourage distros not to add to or change the public API in musl in
ways that are not expected to ever make it upstream. The result is
application binaries that have an undocumented dependency on your
distro-specific version of musl and that won't work elsewhere, which
is not so much of an issue for your own distro binaries, but is a
potential gotcha for anyone compiling their own binaries on your distro
and expecting them to work on other musl-based systems. Up to now I've
been trying to reduce and hopefully eliminate the Alpine patches to
musl that affect public interfaces and I think we're almost there.

Have you looked at what other programs do as a fallback when there's
no fopencookie? I don't think it's possible to match the API 100% with
this approach (seeking won't work, for instance), but it seems
possible to do a fallback implementation based on pthread_create and
socketpair that doesn't depend on implementation internals of stdio,
so that it could go into the application or a shim library.

Rich



* Re: Nonstandard functions with callbacks
  2014-11-11 13:46   ` Rich Felker
@ 2014-11-11 13:56     ` Timo Teras
  2014-11-11 14:28       ` Rich Felker
  2014-11-11 13:56     ` Timo Teras
  1 sibling, 1 reply; 7+ messages in thread
From: Timo Teras @ 2014-11-11 13:56 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

On Tue, 11 Nov 2014 08:46:53 -0500
Rich Felker <dalias@libc.org> wrote:

> While of course I can't make the decision for you, I'd really
> encourage distros not to add to or change the public API in musl in
> ways that are not expected to ever make it upstream. The result is
> application binaries that have an undocumented dependency on your
> distro-specific version of musl and that won't work elsewhere, which
> is not so much of an issue for your own distro binaries, but is a
> potential gotcha for anyone compiling their own binaries on your distro
> and expecting them to work on other musl-based systems. Up to now I've
> been trying to reduce and hopefully eliminate the Alpine patches to
> musl that affect public interfaces and I think we're almost there.

Yes, we are not making hasty decisions. So it's on hold for the time
being. I filed a bug against Asterisk to see what happens. They are
supposed to support Solaris and MinGW, which both lack this
functionality. It seems to have broken in Asterisk 13. Well, TLS will
not work without that, but in Asterisk 13 they broke regular non-TLS
TCP connections...

> Have you looked at what other programs do as a fallback when there's
> no fopencookie? I don't think it's possible to match the API 100% with
> this approach (seeking won't work, for instance), but it seems
> possible to do a fallback implementation based on pthread_create and
> socketpair that doesn't depend on implementation internals of stdio,
> so that it could go into the application or a shim library.

No. This is not simple. The read and write callbacks take a timer
value from the cookie, and they block until a specific timeout. So
yeah, it gets even weirder. They modify the semantics of fread() and
fwrite(), which is kinda nasty.

Either way it's a can of worms. So we are stopping to reconsider
options.

/Timo




* Re: Nonstandard functions with callbacks
  2014-11-11 13:56     ` Timo Teras
@ 2014-11-11 14:28       ` Rich Felker
  0 siblings, 0 replies; 7+ messages in thread
From: Rich Felker @ 2014-11-11 14:28 UTC (permalink / raw)
  To: musl

On Tue, Nov 11, 2014 at 03:56:12PM +0200, Timo Teras wrote:
> On Tue, 11 Nov 2014 08:46:53 -0500
> Rich Felker <dalias@libc.org> wrote:
> 
> > While of course I can't make the decision for you, I'd really
> > encourage distros not to add to or change the public API in musl in
> > ways that are not expected to ever make it upstream. The result is
> > application binaries that have an undocumented dependency on your
> > distro-specific version of musl and that won't work elsewhere, which
> > is not so much of an issue for your own distro binaries, but is a
> > potential gotcha for anyone compiling their own binaries on your distro
> > and expecting them to work on other musl-based systems. Up to now I've
> > been trying to reduce and hopefully eliminate the Alpine patches to
> > musl that affect public interfaces and I think we're almost there.
> 
> Yes, we are not making hasty decisions. So it's on hold for the time
> being. I filed a bug against Asterisk to see what happens. They are
> supposed to support Solaris and MinGW, which both lack this
> functionality. It seems to have broken in Asterisk 13. Well, TLS will
> not work without that, but in Asterisk 13 they broke regular non-TLS
> TCP connections...

Sounds like a mess, but this is good from our standpoint in that it
means they have a real problem they need to solve (regressions on
their side) rather than it just being compatibility with musl (which
they might not particularly care about).

> > Have you looked at what other programs do as a fallback when there's
> > no fopencookie? I don't think it's possible to match the API 100% with
> > this approach (seeking won't work, for instance), but it seems
> > possible to do a fallback implementation based on pthread_create and
> > socketpair that doesn't depend on implementation internals of stdio,
> > so that it could go into the application or a shim library.
> 
> No. This is not simple. The read and write callbacks take a timer
> value from the cookie, and they block until a specific timeout. So
> yeah, it gets even weirder.

That's not a problem. You can do timeouts in many other ways, like
applying the timeout to the socketpair with setsockopt or simply
having the helper thread close the socket when you want it to time out.
However, this kind of thing is likely best done in application-specific
code rather than via a fake fopencookie.
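
As a rough illustration of the setsockopt variant, the sketch below
sets SO_RCVTIMEO on one end of a socketpair so that a blocking read
gives up after the timeout. The set_recv_timeout_ms and
idle_read_errno names are hypothetical; only the socket calls
themselves are standard.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Apply a receive timeout, given in milliseconds, to a socket.
 * Returns 0 on success, -1 on failure (as setsockopt does). */
static int set_recv_timeout_ms(int fd, long ms)
{
    struct timeval tv;

    assert(ms >= 0);
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Demonstration: read from an idle socketpair with a timeout set.
 * Returns the errno left by the timed-out read, or 0 if the setup
 * failed or the read did not fail. */
static int idle_read_errno(long ms)
{
    int sv[2], err = 0;
    char ch;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return 0;
    if (set_recv_timeout_ms(sv[0], ms) == 0 && read(sv[0], &ch, 1) < 0)
        err = errno;
    close(sv[0]);
    close(sv[1]);
    return err;
}
```

When the descriptor is wrapped with fdopen, a timed-out read shows up
to the application as a failed fread with the stream's error indicator
set, so the caller still has to decide how to recover.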

> They modify the semantics of fread() and
> fwrite(), which is kinda nasty.

How so? Are you talking about the timeouts or something else?

Rich



* Re: Nonstandard functions with callbacks
  2014-11-11  7:22 ` Timo Teras
  2014-11-11 13:46   ` Rich Felker
@ 2014-11-12 14:27   ` Justin Cormack
  1 sibling, 0 replies; 7+ messages in thread
From: Justin Cormack @ 2014-11-12 14:27 UTC (permalink / raw)
  To: musl; +Cc: Rich Felker

On Tue, Nov 11, 2014 at 7:22 AM, Timo Teras <timo.teras@iki.fi> wrote:
> NetBSD current seems to have funopen2() with alternative signatures for
> the callbacks using ssize_t/size_t for read/write instead of int.
>
> So yeah, it's a royal mess.

I looked at NetBSD, and it is used internally for implementing
open_memstream; otherwise I would be tempted to remove it. The man
page says it is not portable and should not be used. It might still be
worth making it an internal libc function instead, though.

Justin

