mailing list of musl libc
* Broken silent glibc-specific assumptions uncovered by musl
From: Rich Felker @ 2013-05-17 17:37 UTC
  To: musl

Hi all,

There's been at least one request for a list of "silent" application
bugs uncovered by building, against musl, applications that were
previously used mainly or only with glibc. By silent, I mean bugs that
are not easily caught as configure-time or compile-time errors, but
that cause the application to misbehave at runtime.

I'm writing down here what I can think of off-hand. This list should
probably be expanded by the community and perhaps put on the wiki.
Here's what I have so far:



Assuming that dlerror is thread-local. (POSIX previously required it
to be global; as of 2008-TC1, either behavior is allowed.)

Assuming dlclose actually unloads a library (and calls dtors), so that
a future dlopen will reset static objects to their initial state (and
re-run ctors). (POSIX leaves this implementation-defined, and
unloading is impossible to do safely in general, so robust
implementations will not do it.)

Making wrong assumptions about fsync and fdatasync. (I'm not familiar
with this issue so somebody else will have to fill it in.)

Calling exit from global destructors. (If an application calls exit
more than once, the behavior is undefined.)

Assuming pthread_cancel unwinds and calls destructors. (Interaction
between cancellation and C++ is undefined.)

Use of GNU extensions in regular expressions, especially
backslash-prefixed versions of ERE operators in BRE. (Undefined.)
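
A minimal sketch of the portable alternative, assuming the pattern can
be rewritten as an ERE: alternation and repetition are spelled with
plain `|` and `+` and compiled with REG_EXTENDED, instead of relying on
`\|` or `\+` in a BRE. The helper name `matches_ere` is hypothetical;
only regcomp, regexec and regfree come from POSIX.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 on match, 0 on no match, -1 if the pattern fails to compile. */
int matches_ere(const char *pattern, const char *text)
{
    regex_t re;

    /* REG_EXTENDED selects ERE syntax, where | + ? are ordinary operators. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

For example, `matches_ere("^(foo|bar)+$", "foobar")` is expected to
match without depending on any GNU-specific BRE behavior.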

Assuming iconv reports characters that cannot be represented in the
dest charset via EILSEQ. (This behavior is non-conforming; POSIX
requires an implementation-defined replacement and positive return
value in this case.)
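
A rough sketch of a caller that copes with both behaviors, assuming a
UTF-8 to ASCII conversion: it treats a positive return value and an
EILSEQ failure alike as "the conversion was lossy". The helper name
`utf8_to_ascii` and its return convention are made up for illustration.
Note that EILSEQ is also what iconv reports for invalid input, so this
sketch cannot tell the two cases apart.

```c
#include <errno.h>
#include <iconv.h>
#include <string.h>

/* Convert a UTF-8 string to ASCII.
 * Returns 0 if every character converted exactly, 1 if the conversion
 * was lossy (positive iconv result, or EILSEQ), -1 on any other error. */
int utf8_to_ascii(const char *in, char *out, size_t outsize)
{
    if (outsize == 0)
        return -1;

    iconv_t cd = iconv_open("ASCII", "UTF-8");
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;        /* iconv's prototype takes char ** */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsize - 1;  /* reserve room for the terminator */

    size_t r = iconv(cd, &inp, &inleft, &outp, &outleft);
    int saved = errno;             /* iconv_close may clobber errno */
    iconv_close(cd);
    *outp = '\0';

    if (r == (size_t)-1)
        return saved == EILSEQ ? 1 : -1;
    return r > 0 ? 1 : 0;
}
```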

Use of deprecated charset aliases with iconv_open, for example, using
"UNICODE" to mean UCS-2. (The list of charsets is
implementation-defined, but common sense dictates using the IANA
preferred MIME charset names, and especially not misleading names.)




* Re: Broken silent glibc-specific assumptions uncovered by musl
From: Szabolcs Nagy @ 2013-05-18  9:18 UTC
  To: musl

* Rich Felker <dalias@aerifal.cx> [2013-05-17 13:37:10 -0400]:
> Making wrong assumptions about fsync and fdatasync. (I'm not familiar
> with this issue so somebody else will have to fill it in.)

apparently i didn't remember this correctly; it's
just an old-linux vs new-linux break:
linux used to implement fsync and O_SYNC with the
same guarantees as fdatasync and O_DSYNC, so many
applications use fsync when they actually mean
fdatasync (faster, no mtime sync)

then some filesystems started to support O_SYNC
properly with a new flag, but it took some time
to trickle down, as distros often use an older libc
where the O_DSYNC and O_SYNC definitions were the
same, so even on a new kernel the applications kept
getting the old flag. hence i remembered it as
a libc issue: when i first tried musl on some
storage application it was significantly slower
due to this difference
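
A small sketch of saying what is meant, assuming the application only
needs the file data to be durable and does not care about timestamps:
it calls fdatasync explicitly rather than fsync. The function name
`save_data` and the error convention are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* for fdatasync under strict -std modes */
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Write buf to path and flush the data to stable storage.
 * Returns 0 on success, -1 on error. */
int save_data(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted before writing: retry */
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* fdatasync flushes the data plus only the metadata needed to read
     * it back; fsync would also flush metadata such as mtime. */
    if (fdatasync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```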

other runtime differences:
- "%Ld" instead of "%lld" as mentioned by sh4rm4 on irc
- lfs64 problems: eg printing off_t with "%d"
- serializing abi incompatible structures with (char*) cast
- relying on some locale specific behaviour (LC_NUMERIC)
- /proc fs issue with writev in musl stdio
- relying on LD_* or other env vars for glibc or the loader
- relying on /etc/* files used by glibc or the loader
- dlopen with RTLD_LAZY
- timezone files are not yet supported in musl
- crypt sha2 with long key input
- using constructors with priority gcc extension
- relying on the random generator algorithm to be the same
- musl's err does not print __progname, it might annoy one
- musl has some stubs
- ppc double-double long double
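
For the first two items, a minimal sketch of format strings that do not
depend on any one libc: ISO C defines the `L` length modifier only for
floating-point conversions, so a long long is printed with "%lld", and
an off_t, whose width varies between builds, is widened to intmax_t and
printed with "%jd". The helper names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Format a long long with the standard "ll" length modifier. */
int format_ll(char *buf, size_t size, long long v)
{
    return snprintf(buf, size, "%lld", v);
}

/* Format an off_t portably: widen to intmax_t and use "%jd", so the
 * format string is correct whether off_t is 32 or 64 bits wide. */
int format_off(char *buf, size_t size, off_t off)
{
    return snprintf(buf, size, "%jd", (intmax_t)off);
}
```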



* Re: Broken silent glibc-specific assumptions uncovered by musl
From: Daniel Cegiełka @ 2013-05-18  9:31 UTC
  To: musl

2013/5/18 Szabolcs Nagy <nsz@port70.net>:

> - musl's err does not print __progname, it might annoy one

http://git.musl-libc.org/cgit/musl/commit/?id=b4ea63856a6af3d1bcc2db12537785371ac2024c

Daniel



* Re: Broken silent glibc-specific assumptions uncovered by musl
From: Rich Felker @ 2013-05-18 14:15 UTC
  To: musl

On Sat, May 18, 2013 at 11:18:20AM +0200, Szabolcs Nagy wrote:
> other runtime differences:
> - "%Ld" instead of "%lld" as mentioned by sh4rm4 on irc

Yes.

> - lfs64 problems: eg printing off_t with "%d"

Is this common? I would think most apps now are built with 64-bit
off_t, at least in distros, since mixing can be dangerous.

> - serializing abi incompatible structures with (char*) cast

Have you seen examples of this?

> - relying on some locale specific behaviour (LC_NUMERIC)

Do you mean relying on being able to request specific non-default
behavior through a hard-coded locale name? Or something else?

> - /proc fs issue with writev in musl stdio

Yes, basically this is "assumptions about how stdio writes translate
into underlying writes to the file descriptor".
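
As an illustration only: when the target is a file where each write()
is a separate operation (control files under /proc are the usual
example), one way to avoid depending on how stdio splits or batches
output is to issue a single write() of the complete record. The helper
name `write_record` is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Send one record with exactly one successful write() call.
 * Returns 0 if the whole record was accepted, -1 otherwise. */
int write_record(int fd, const void *rec, size_t len)
{
    ssize_t n;

    do {
        n = write(fd, rec, len);
    } while (n < 0 && errno == EINTR);   /* retry only if nothing was written */

    if (n < 0)
        return -1;

    /* A short write would split the record, so report it as a failure
     * instead of silently sending the remainder in a second write. */
    return (size_t)n == len ? 0 : -1;
}
```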

We might also add the issue that glibc incorrectly allows reads after
the EOF flag is set, and some apps might depend on this. I don't think
we've encountered any that do, but the glibc excuse for not fixing the
bug is that some might.

> - relying on LD_* or other env vars for glibc or the loader

We do support the main ones that could be used reasonably by
applications: LD_PRELOAD and LD_LIBRARY_PATH. Most of the others are
for debugging/"audit" stuff, I think.

> - relying on /etc/* files used by glibc or the loader

Examples?

> - dlopen with RTLD_LAZY

"Assuming that undefined function references in loaded libraries will
not produce an error as long as another library is loaded to satisfy
the reference before the first use, or the function is never used."

> - timezone files are not yet supported in musl

Yes, this is not so much relying on a glibc bug/quirk though, just
musl being incomplete in this area.

> - crypt sha2 with long key input

Have you seen examples of this? Or is it just theoretical?

> - using constructors with priority gcc extension

Do you know if musl just ignores the order, or fails to run them at
all?

> - relying on the random generator algorithm to be the same

I doubt applications directly make this assumption, but for programs
that let you generate random images/sounds/etc. and give you the seed
as a way of reproducing the same output again, seeds would not
necessarily be compatible between different systems. Such programs
really should be using their own prngs, however.
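
A sketch of what "their own prng" could look like, using Marsaglia's
32-bit xorshift as a stand-in; any fixed, documented algorithm would
do, and this one is not suitable for cryptographic use. The `app_rng`
names are hypothetical. Because the algorithm lives in the application,
a saved seed reproduces the same sequence on any libc.

```c
#include <stdint.h>

/* Generator state; for xorshift it must never be zero. */
typedef struct {
    uint32_t state;
} app_rng;

void app_rng_seed(app_rng *r, uint32_t seed)
{
    r->state = seed ? seed : 1u;   /* map the invalid zero seed to 1 */
}

/* One step of the 13/17/5 xorshift generator. */
uint32_t app_rng_next(app_rng *r)
{
    uint32_t x = r->state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    r->state = x;
    return x;
}
```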

> - musl's err does not print __progname, it might annoy one

This should be fixed now that we have __progname, but I don't think it
_breaks_ anything.

> - musl has some stubs

Yes, this too falls under musl deficiencies, though, at least in most
cases. I wonder if anyone feels up to making a list of stubs to
discuss which ones should be de-stub-ified.

> - ppc double-double long double

I really doubt anyone depends on this or even wants it.. but is ppc
really using double-double still? The gcc docs make it sound like they
switched to IEEE quad when they made long double 128-bit, ignoring
what IBM did, but the glibc people seem to consider it double-double.

Rich



* Re: Broken silent glibc-specific assumptions uncovered by musl
From: Szabolcs Nagy @ 2013-05-18 22:51 UTC
  To: musl

* Rich Felker <dalias@aerifal.cx> [2013-05-18 10:15:51 -0400]:
> On Sat, May 18, 2013 at 11:18:20AM +0200, Szabolcs Nagy wrote:
> > - lfs64 problems: eg printing off_t with "%d"
> 
> Is this common? I would think most apps now are built with 64-bit
> off_t, at least in distros, since mixing can be dangerous.
> 

no, but when i checked the sabotage patches i saw a lot
of printf fixes (none of them are lfs64 related though).
i was thinking about legitimate reasons why printf
would break, and this came to mind

> > - serializing abi incompatible structures with (char*) cast
> 
> Have you seen examples of this?

i've seen a lot of fragile serialization code,
but it did not operate on obscure libc structures,
so it is probably not relevant

> > - relying on some locale specific behaviour (LC_NUMERIC)
> 
> Do you mean relying on being able to request specific non-default
> behavior through a hard-coded locale name? Or something else?
> 

i mean behaviour of shell utils:
eg. sort -n checks for the locale specific decimal and thousand
separators so it can behave differently with glibc vs musl
in the same environment

it shouldn't break things and is not unexpected, but it is
an observable runtime behaviour difference

> > - relying on LD_* or other env vars for glibc or the loader
> 
> We do support the main ones that could be used reasonably by
> applications: LD_PRELOAD and LD_LIBRARY_PATH. Most of the others are
> for debugging/"audit" stuff, I think.
> 

yes other ld_ stuff does not seem useful

there are many envvars used by glibc, eg now i found these:
TMPDIR (tempname)
MALLOC_* (malloc debug)
CHARSET (toutf8)
LANGUAGE (gettext)
DATEMSK (getdate)
MSGVERB (fmtmsg)
SEV_LEVEL (fmtmsg)
LOCPATH (newlocale)
_<PID>_GNU_nonoption_argv_flags_ (getopt)
POSIXLY_CORRECT (getopt, fnmatch)
ARGP_HELP_FMT (argp-help)
RESOLV_* (resolv)
RES_OPTIONS (resolv)
HOSTALIASES (resolv)
LOCALDOMAIN (resolv)
NLSPATH (catgets)
GCONV_PATH (gconv)
GETCONF_DIR (sysconf)
TZDIR (tzfile)
LIBC_FATAL_STDERR_ (libc_fatal)

none of them seem very interesting, but the list
shows some functionality that one may rely upon
(eg musl has no complex resolv thing and that
is a runtime behaviour difference)

> > - relying on /etc/* files used by glibc or the loader
> 
> Examples?
> 

some paths glibc references but musl does not:

/etc/ld.so.preload (rtld)
/etc/.pwd.lock (shadow)
/etc/{host.conf,networks,protocols,..} (resolv)
/etc/gai.conf (getaddrinfo)
/dev/console (syslog)
/usr/lib/pt_chown (grantpt)
/usr/local/etc/zoneinfo (timezone)

> > - crypt sha2 with long key input
> 
> Have you seen examples of this? Or is it just theoretical?
> 

theoretical

iirc solardiz said at some point that there might
be crypt based benchmarks which use larger keybuf

> > - using constructors with priority gcc extension
> 
> Do you know if musl just ignores the order, or fails to run them at
> all?
> 

ok, the priority is solved by the linker and musl runs them

here (i386) constructors are put into .ctors.* sections
which get sorted by the linker

on arm they are put into .init_array.*

it seems the linker and glibc support mixing these:
the order in which init things are run is

 preinit_array
 ctors (priority sorted by the linker)
 init_array (priority sorted by the linker)

on i386 i have to explicitly request something to get
into an .init_array section, and then it will be run
by glibc but not by musl

i think musl does not support .preinit_array at all

these are probably rarely used features
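
For reference, a minimal sketch of the gcc extension in question,
assuming GCC or Clang and both constructors living in one translation
unit: lower priority numbers run earlier, and priorities up to 100 are
reserved for the implementation. All names here are hypothetical, and
the sketch says nothing about ordering across separate shared
libraries.

```c
/* Records the order in which the constructors below ran. */
static int ctor_order[2];
static int ctor_count;

/* Priority 101 is the lowest value available to applications. */
__attribute__((constructor(101)))
static void init_first(void)
{
    ctor_order[ctor_count++] = 1;
}

__attribute__((constructor(102)))
static void init_second(void)
{
    ctor_order[ctor_count++] = 2;
}

/* Returns 1 if both constructors ran, in priority order, before main. */
int ctor_ran_in_priority_order(void)
{
    return ctor_count == 2 && ctor_order[0] == 1 && ctor_order[1] == 2;
}
```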

> > - ppc double-double long double
> 
> I really doubt anyone depends on this or even wants it.. but is ppc
> really using double-double still? The gcc docs make it sound like they
> switched to IEEE quad when they made long double 128-bit, ignoring
> what IBM did, but the glibc people seem to consider it double-double.
> 

yes, i dont think ppl depend on this but you can never know

some applications may have different behaviour under
a glibc ld128ibm/ld128ieee toolchain vs a musl ld64 toolchain



* Re: Broken silent glibc-specific assumptions uncovered by musl
From: Rich Felker @ 2013-05-19 22:08 UTC
  To: musl

On Sun, May 19, 2013 at 12:51:47AM +0200, Szabolcs Nagy wrote:
> > > - relying on LD_* or other env vars for glibc or the loader
> > 
> > We do support the main ones that could be used reasonably by
> > applications: LD_PRELOAD and LD_LIBRARY_PATH. Most of the others are
> > for debugging/"audit" stuff, I think.
> > 
> 
> yes other ld_ stuff does not seem useful
> 
> there are many envvars used by glibc, eg now i found these:
> [...]
> CHARSET (toutf8)

What is toutf8? (Just curious)

> DATEMSK (getdate)

This is POSIX and supported by musl. :)

> MSGVERB (fmtmsg)
> SEV_LEVEL (fmtmsg)

I believe these are standard too, but we presently don't have fmtmsg.
It's one of the few missing XSI interfaces.

> RESOLV_* (resolv)
> RES_OPTIONS (resolv)
> HOSTALIASES (resolv)
> LOCALDOMAIN (resolv)

Some of these may be desirable at some point.

> NLSPATH (catgets)

I believe this is standard too.

> > > - using constructors with priority gcc extension
> > 
> > Do you know if musl just ignores the order, or fails to run them at
> > all?
> > 
> 
> ok, the priority is solved by the linker and musl runs them
> 
> here (i386) constructors are put into .ctors.* sections
> which get sorted by the linker

How does this work for dynamic linking? Is priority only respected
within a single DSO, and not between multiple DSOs?

> on arm they are put into .init_array.*
> 
> it seems the linker and glibc support mixing these:
> the order in which init things are run is
> 
>  preinit_array
>  ctors (priority sorted by the linker)
>  init_array (priority sorted by the linker)
> 
> on i386 i have to explicitly request something to get
> into an .init_array section, and then it will be run
> by glibc but not by musl
> 
> i think musl does not support .preinit_array at all
> 
> these are probably rarely used features

Yes, at some point we should probably revisit this. In addition, it
seems that the init_array stuff might eventually be used on more and
more archs, so we might need to investigate whether there's a way to
write the code for calling it in C rather than asm, and then somehow
merge the C and asm object files when generating crti.o and crtn.o...

Unfortunately, however, I'm skeptical of whether this can give
reasonable code generation that works for both PIC(PIE) and non-PIC
cases...

> > > - ppc double-double long double
> > 
> > I really doubt anyone depends on this or even wants it.. but is ppc
> > really using double-double still? The gcc docs make it sound like they
> > switched to IEEE quad when they made long double 128-bit, ignoring
> > what IBM did, but the glibc people seem to consider it double-double.
> 
> yes, i dont think ppl depend on this but you can never know
> 
> some applications may have different behaviour under
> a glibc ld128ibm/ld128ieee toolchain vs a musl ld64 toolchain

I think for this list, if we're going to publish it, we should focus
on glibc-specific assumptions that were actually found in practice.
Bringing in lots of theoretical ones just adds doubt to whether musl
will meet people's needs, and unless they're clearly marked and
separated from the issues we actually found, I think it makes the list
less informative -- the fact that we actually found certain types of
problems, rather than just reasoning about ones that might arise, is
in many ways the most informative aspect of the list. Of course, it
could be supplemented by an additional list of more theoretical
considerations.

Rich



* Re: Broken silent glibc-specific assumptions uncovered by musl
From: Szabolcs Nagy @ 2013-05-20  0:17 UTC
  To: musl

* Rich Felker <dalias@aerifal.cx> [2013-05-19 18:08:32 -0400]:
> On Sun, May 19, 2013 at 12:51:47AM +0200, Szabolcs Nagy wrote:
> > CHARSET (toutf8)
> 
> What is toutf8? (Just curious)
> 

it is in libidn
"toutf8.c --- Convert strings from system locale into UTF-8."

> > here (i386) constructors are put into .ctors.* sections
> > which get sorted by the linker
> 
> How does this work for dynamic linking? Is priority only respected
> within a single DSO, and not between multiple DSOs?
> 

i think ordering is only guaranteed within a single dso
and this is not clearly documented

ld -verbose

shows the actual linker script that merges the
relevant sections




* Re: Broken silent glibc-specific assumptions uncovered by musl
From: Rich Felker @ 2013-05-20  0:23 UTC
  To: musl

On Mon, May 20, 2013 at 02:17:05AM +0200, Szabolcs Nagy wrote:
> * Rich Felker <dalias@aerifal.cx> [2013-05-19 18:08:32 -0400]:
> > On Sun, May 19, 2013 at 12:51:47AM +0200, Szabolcs Nagy wrote:
> > > CHARSET (toutf8)
> > 
> > What is toutf8? (Just curious)
> > 
> 
> it is in libidn
> "toutf8.c --- Convert strings from system locale into UTF-8."

Oh. Speaking of which, we need to add IDN support at some point...

> > > here (i386) constructors are put into .ctors.* sections
> > > which get sorted by the linker
> > 
> > How does this work for dynamic linking? Is priority only respected
> > within a single DSO, and not between multiple DSOs?
> 
> i think ordering is only guaranteed within a single dso
> and this is not clearly documented

Yes, I think that's the only approach that makes sense anyway. And it
makes the whole ctor-priority system even more ugly because it causes
program semantics to change depending on how the program is broken up
into shared libraries, counteracting all the hard work ELF did to make
dynamic linking transparent with respect to program semantics...

Rich

