mailing list of musl libc
 help / color / mirror / code / Atom feed
* string word-at-a-time and atomic.h FAQ on twitter
@ 2016-01-05 16:46 Szabolcs Nagy
  2016-01-05 17:50 ` Rich Felker
  2016-01-12 21:02 ` Alexander Cherepanov
  0 siblings, 2 replies; 16+ messages in thread
From: Szabolcs Nagy @ 2016-01-05 16:46 UTC (permalink / raw)
  To: musl

https://twitter.com/johnregehr/status/684126374966198281

these are old faq items but since we dont have docs about internals
i try to address them based on my understanding:

1) musl strlen oob access:

the os manages memory with page granularity, this is an internal
detail that the user should not rely on in c, but the libc can,
(otherwise malloc/free and the dynamic linker could not be
implemented in c) so oob accesses in word-at-a-time string
algorithms do not cause segfault on the os level.

in theory the compiler is part of the implementation so it can
treat libc code specially, but in practice libc code is normal
freestanding c code. this means that the compiler can treat oob
access arbitrarily following the abstract c semantics if it can
see through the implementation with lto.  however i find lto a
weak excuse to rewrite strlen in asm for all targets since lto
of libc is still not practical and asm implementations
historically had a lot of target specific bugs in other libcs.
i think compiler attributes should be used here on compilers that
might break the code, but there is no attribute for this kind of
oob access yet (although may_alias attribute is missing here too
and should be added like in other string functions).

this takes care of oob access, but the bytes outside the passed
object might change concurrently i.e. strlen might introduce a
data race: again this is a problem on the abstract c language
level that may be solved e.g. by making all accesses to those
bytes relaxed atomic, but user code is not under libc control.
in practice the code works if HASZERO reads the word once so it
does arithmetics with a consistent value (because the memory
model of the underlying machine does not treat such race
undefined and it does not propagate unspecified value bits nor
has trap representations).

we do not try to enforce these behaviours on the c level yet
(only a very narrow set of string functions are affected which
are also very performance critical), but fortunately those who
are worried that the code is not correct can always generate asm
and compile that into the libc. (and then one can verify that
indeed the generated code is completely correct on the asm level.
maybe musl will add generated asm to the repo, but there are other
pending cleanup works related to asm vs c level semantics and
these should be considered together.)


2) musl atomic.h sync primitives

the primitives in atomic.h are carefully designed for musl's
pthread implementation (which seems to me far ahead of other
implementations in terms of correctness and portability).

however they are not documented in the code (only in the git
log) so ppl assume they understand their precise interface
contract by guessing (which is usually wrong because the names
are misleading).

musl does not use 64-bit atomic primitives, a_and_64 and a_or_64
have secific uses in the malloc implementation which determine
their semantics.


3) a_crash

formally a_crash can be anything (only called if user invoked
ub or underlying system broke interface contract).

in practice a_crash should be __builtin_trap (i.e. the most
lightweight way of terminating the process and this matters for
security which is of course not c level semantics), but builtin
usage is mininmized in musl which makes it possible to compile
it with several c compilers with consistent behaviour (e.g. gcc
does not guarantee consistent behaviour for __builtin_trap
across targets and falls back to abort if a target does not
have appropriate target hook defined), keeping the interface
between the compiler and libc minimal is a key design choice
in musl.

at some point this should be cleaned up and all targets should
have proper single instruction crash, but that's low priority
cleanup work so on some targets this is not yet done (there are
other pending atomic.h cleanup works).



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-05 16:46 string word-at-a-time and atomic.h FAQ on twitter Szabolcs Nagy
@ 2016-01-05 17:50 ` Rich Felker
  2016-01-05 23:39   ` Matthew Fernandez
  2016-01-08 21:59   ` Alexander Cherepanov
  2016-01-12 21:02 ` Alexander Cherepanov
  1 sibling, 2 replies; 16+ messages in thread
From: Rich Felker @ 2016-01-05 17:50 UTC (permalink / raw)
  To: musl

On Tue, Jan 05, 2016 at 05:46:41PM +0100, Szabolcs Nagy wrote:
> https://twitter.com/johnregehr/status/684126374966198281
> 
> these are old faq items but since we dont have docs about internals
> i try to address them based on my understanding:

An "internals FAQ" would be a welcome addition to the wiki. I think
your text is mostly ok, but I'm including some thoughts below on
improving/fixing these issues and the possible directions we might
take.

> 1) musl strlen oob access:
> 
> the os manages memory with page granularity, this is an internal
> detail that the user should not rely on in c, but the libc can,
> (otherwise malloc/free and the dynamic linker could not be
> implemented in c) so oob accesses in word-at-a-time string
> algorithms do not cause segfault on the os level.
> 
> in theory the compiler is part of the implementation so it can
> treat libc code specially, but in practice libc code is normal
> freestanding c code. this means that the compiler can treat oob
> access arbitrarily following the abstract c semantics if it can
> see through the implementation with lto.  however i find lto a
> weak excuse to rewrite strlen in asm for all targets since lto
> of libc is still not practical and asm implementations
> historically had a lot of target specific bugs in other libcs.
> i think compiler attributes should be used here on compilers that
> might break the code, but there is no attribute for this kind of
> oob access yet (although may_alias attribute is missing here too
> and should be added like in other string functions).

I agree that we should use compiler attributes and fall back to an
utterly naive implementation if __GNUC__ (or other indication that
such attributes are supported) is not defined. Clearly we should be
using may_alias for the word-at-a-time access, but as you've said
there doesn't seem to be any attribute to safely access past the end
of the object.

In the particular case of strlen, the naive unrolled strlen with no
OOB access is actually optimal on most or all 32-bit archs, better
than what we have now. I suspect the same is true for strchr and other
related functions. So we could just consider trying to drop the OOB
accesses. Do we have a list of affected functions? That might be nice
to include.

> this takes care of oob access, but the bytes outside the passed
> object might change concurrently i.e. strlen might introduce a
> data race: again this is a problem on the abstract c language
> level that may be solved e.g. by making all accesses to those
> bytes relaxed atomic, but user code is not under libc control.
> in practice the code works if HASZERO reads the word once so it
> does arithmetics with a consistent value (because the memory
> model of the underlying machine does not treat such race
> undefined and it does not propagate unspecified value bits nor
> has trap representations).

Indeed, this seems like less of a practical concern. I think we could
make it a non-concern via volatile accesses (the practical concern
with data races like this is transformations that convert one read to
multiple reads and expect to see the same value each time) but that
would preclude most optimization by the compiler.

> we do not try to enforce these behaviours on the c level yet
> (only a very narrow set of string functions are affected which
> are also very performance critical), but fortunately those who
> are worried that the code is not correct can always generate asm
> and compile that into the libc. (and then one can verify that
> indeed the generated code is completely correct on the asm level.
> maybe musl will add generated asm to the repo, but there are other
> pending cleanup works related to asm vs c level semantics and
> these should be considered together.)

I would rather not add generated asm; not only is it ugly and
potentially full of directives we may not want (like cfi and stuff
that may not work on some as versions or alternate assemblers) but
also it can't be optimized for the specific target (e.g. modern x86
with sse instead of base i386, armv7 instead of armv4t, etc.), and it
can't be inlined or otherwise optimized with LTO.

> 2) musl atomic.h sync primitives
> 
> the primitives in atomic.h are carefully designed for musl's
> pthread implementation (which seems to me far ahead of other
> implementations in terms of correctness and portability).
> 
> however they are not documented in the code (only in the git
> log) so ppl assume they understand their precise interface
> contract by guessing (which is usually wrong because the names
> are misleading).
> 
> musl does not use 64-bit atomic primitives, a_and_64 and a_or_64
> have secific uses in the malloc implementation which determine
> their semantics.

I think we could replace them with atomic bitset/bitclear. They're
only used in a couple places, mainly malloc.

> 3) a_crash
> 
> formally a_crash can be anything (only called if user invoked
> ub or underlying system broke interface contract).
> 
> in practice a_crash should be __builtin_trap (i.e. the most
> lightweight way of terminating the process and this matters for
> security which is of course not c level semantics), but builtin
> usage is mininmized in musl which makes it possible to compile
> it with several c compilers with consistent behaviour (e.g. gcc
> does not guarantee consistent behaviour for __builtin_trap
> across targets and falls back to abort if a target does not
> have appropriate target hook defined), keeping the interface
> between the compiler and libc minimal is a key design choice
> in musl.
> 
> at some point this should be cleaned up and all targets should
> have proper single instruction crash, but that's low priority
> cleanup work so on some targets this is not yet done (there are
> other pending atomic.h cleanup works).

Indeed. Actually I'd kind of prefer improving a_crash to block signals
(via inline syscall) before issuing the SIGSEGV/SIGILL to ensure that
the kernel terminates the process rather than allowing a handler to
run.

I forget -- do you have an account on the wiki to add these things? If
not we should get you one, but in the mean time someone else could add
this content.

Rich


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-05 17:50 ` Rich Felker
@ 2016-01-05 23:39   ` Matthew Fernandez
  2016-01-06  2:56     ` Szabolcs Nagy
  2016-01-08 21:59   ` Alexander Cherepanov
  1 sibling, 1 reply; 16+ messages in thread
From: Matthew Fernandez @ 2016-01-05 23:39 UTC (permalink / raw)
  To: musl

On 06/01/16 04:50, Rich Felker wrote:
> On Tue, Jan 05, 2016 at 05:46:41PM +0100, Szabolcs Nagy wrote:
>> ...  however i find lto a
>> weak excuse to rewrite strlen in asm for all targets since lto
>> of libc is still not practical ...

This is an interesting comment; would you mind expanding on this a little? Though we don't do it
regularly, we have used LTO on Musl C in the past. I am not aware of any Musl-specific issues we
encountered, though it's entirely possible our build contained latent bugs.

>> 2) musl atomic.h sync primitives
>>
>> the primitives in atomic.h are carefully designed for musl's
>> pthread implementation (which seems to me far ahead of other
>> implementations in terms of correctness and portability).
>>
>> however they are not documented in the code (only in the git
>> log) so ppl assume they understand their precise interface
>> contract by guessing (which is usually wrong because the names
>> are misleading).
>>
>> musl does not use 64-bit atomic primitives, a_and_64 and a_or_64
>> have secific uses in the malloc implementation which determine
>> their semantics.
>
> I think we could replace them with atomic bitset/bitclear. They're
> only used in a couple places, mainly malloc.

This would certainly make us happier. We've been bitten a couple of times by newcomers (and myself
once) assuming these were suitable 64-bit atomic primitives based on the naming.

________________________________

The information in this e-mail may be confidential and subject to legal professional privilege and/or copyright. National ICT Australia Limited accepts no liability for any damage caused by this email or its attachments.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-05 23:39   ` Matthew Fernandez
@ 2016-01-06  2:56     ` Szabolcs Nagy
  0 siblings, 0 replies; 16+ messages in thread
From: Szabolcs Nagy @ 2016-01-06  2:56 UTC (permalink / raw)
  To: musl

* Matthew Fernandez <matthew.fernandez@nicta.com.au> [2016-01-06 10:39:34 +1100]:
> On 06/01/16 04:50, Rich Felker wrote:
> >On Tue, Jan 05, 2016 at 05:46:41PM +0100, Szabolcs Nagy wrote:
> >>...  however i find lto a
> >>weak excuse to rewrite strlen in asm for all targets since lto
> >>of libc is still not practical ...
> 
> This is an interesting comment; would you mind expanding on this a little? Though we don't do it
> regularly, we have used LTO on Musl C in the past. I am not aware of any Musl-specific issues we
> encountered, though it's entirely possible our build contained latent bugs.

two things:

1) does not give the expected improvements (smaller, faster)
on code that's factored well.

2) wrong code genertion (because of lto implementation bugs,
or because of invalid c code)

i think we saw issues around file scope inline asm,
we speculated about issues around fenv access because
the compilers don't model that correctly and
early dynamic linker code.

it is easy to imagine other nasty cases especially if
gcc extensions are in use: they are often not carefully
specified enough to have robust semantics with lto
(e.g. attribute const for tls was an example where the
semantics is unclear but lto makes it easier to break
code if the user and compiler disagree)

in short: i dont see the benefits given the risks.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-05 17:50 ` Rich Felker
  2016-01-05 23:39   ` Matthew Fernandez
@ 2016-01-08 21:59   ` Alexander Cherepanov
  2016-01-08 22:05     ` Rich Felker
  1 sibling, 1 reply; 16+ messages in thread
From: Alexander Cherepanov @ 2016-01-08 21:59 UTC (permalink / raw)
  To: musl

On 2016-01-05 20:50, Rich Felker wrote:
> So we could just consider trying to drop the OOB
> accesses. Do we have a list of affected functions? That might be nice
> to include.

I think it would be nice to have a full list of intentional UB. For 
example, this:

   http://git.musl-libc.org/cgit/musl/tree/src/stdio/vsnprintf.c#n33

   if (n > (char *)0+SIZE_MAX-s-1) n = (char *)0+SIZE_MAX-s-1;

If I understand the code correctly, fixing it will require changes to 
the FILE structure. Are there such plans?

>> this takes care of oob access, but the bytes outside the passed
>> object might change concurrently i.e. strlen might introduce a
>> data race: again this is a problem on the abstract c language
>> level that may be solved e.g. by making all accesses to those
>> bytes relaxed atomic, but user code is not under libc control.
>> in practice the code works if HASZERO reads the word once so it
>> does arithmetics with a consistent value (because the memory
>> model of the underlying machine does not treat such race
>> undefined and it does not propagate unspecified value bits nor
>> has trap representations).
>
> Indeed, this seems like less of a practical concern.

HASZERO reads the word twice so this should be a problem for unoptimized 
code on big-endian platforms.

-- 
Alexander Cherepanov


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-08 21:59   ` Alexander Cherepanov
@ 2016-01-08 22:05     ` Rich Felker
  2016-01-08 22:39       ` Alexander Cherepanov
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2016-01-08 22:05 UTC (permalink / raw)
  To: musl

On Sat, Jan 09, 2016 at 12:59:55AM +0300, Alexander Cherepanov wrote:
> On 2016-01-05 20:50, Rich Felker wrote:
> >So we could just consider trying to drop the OOB
> >accesses. Do we have a list of affected functions? That might be nice
> >to include.
> 
> I think it would be nice to have a full list of intentional UB. For
> example, this:
> 
>   http://git.musl-libc.org/cgit/musl/tree/src/stdio/vsnprintf.c#n33
> 
>   if (n > (char *)0+SIZE_MAX-s-1) n = (char *)0+SIZE_MAX-s-1;
> 
> If I understand the code correctly, fixing it will require changes
> to the FILE structure. Are there such plans?

Fixing it requires not just changes to the structure, but abandoning
compatibility with buffer reads and writes via the glibc ABI (used by
glibc getc_unlocked and putc_unlocked macros). This is not as bad as
it sounds; if we just abandoned use of the glibc-offset-defined fields
and made them always null pointers, then legacy glibc code would
always see no buffer available and make a function call instead. I'm
not clear on whether this might break code using the gnulib junk that
pokes at glibc FILE internals, though, or whether such code works now,
nor am I clear whether we even care.

> >>this takes care of oob access, but the bytes outside the passed
> >>object might change concurrently i.e. strlen might introduce a
> >>data race: again this is a problem on the abstract c language
> >>level that may be solved e.g. by making all accesses to those
> >>bytes relaxed atomic, but user code is not under libc control.
> >>in practice the code works if HASZERO reads the word once so it
> >>does arithmetics with a consistent value (because the memory
> >>model of the underlying machine does not treat such race
> >>undefined and it does not propagate unspecified value bits nor
> >>has trap representations).
> >
> >Indeed, this seems like less of a practical concern.
> 
> HASZERO reads the word twice so this should be a problem for
> unoptimized code on big-endian platforms.

The number of abstract-machine reads is irrelevant unless we use
volatile here. A good compiler will always reduce it to one read, and
a bad compiler is always free to turn it into multiple reads.

Rich


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-08 22:05     ` Rich Felker
@ 2016-01-08 22:39       ` Alexander Cherepanov
  2016-01-08 22:59         ` Rich Felker
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Cherepanov @ 2016-01-08 22:39 UTC (permalink / raw)
  To: musl

On 2016-01-09 01:05, Rich Felker wrote:
> On Sat, Jan 09, 2016 at 12:59:55AM +0300, Alexander Cherepanov wrote:
>> On 2016-01-05 20:50, Rich Felker wrote:
>>> So we could just consider trying to drop the OOB
>>> accesses. Do we have a list of affected functions? That might be nice
>>> to include.
>>
>> I think it would be nice to have a full list of intentional UB. For
>> example, this:
>>
>>    http://git.musl-libc.org/cgit/musl/tree/src/stdio/vsnprintf.c#n33
>>
>>    if (n > (char *)0+SIZE_MAX-s-1) n = (char *)0+SIZE_MAX-s-1;
>>
>> If I understand the code correctly, fixing it will require changes
>> to the FILE structure. Are there such plans?
>
> Fixing it requires not just changes to the structure, but abandoning
> compatibility with buffer reads and writes via the glibc ABI (used by
> glibc getc_unlocked and putc_unlocked macros). This is not as bad as
> it sounds; if we just abandoned use of the glibc-offset-defined fields
> and made them always null pointers, then legacy glibc code would
> always see no buffer available and make a function call instead. I'm
> not clear on whether this might break code using the gnulib junk that
> pokes at glibc FILE internals, though, or whether such code works now,
> nor am I clear whether we even care.

Ugh.

Documenting UB in it in the mean time seems like non-controversial thing:-)

>>>> this takes care of oob access, but the bytes outside the passed
>>>> object might change concurrently i.e. strlen might introduce a
>>>> data race: again this is a problem on the abstract c language
>>>> level that may be solved e.g. by making all accesses to those
>>>> bytes relaxed atomic, but user code is not under libc control.
>>>> in practice the code works if HASZERO reads the word once so it
>>>> does arithmetics with a consistent value (because the memory
>>>> model of the underlying machine does not treat such race
>>>> undefined and it does not propagate unspecified value bits nor
>>>> has trap representations).
>>>
>>> Indeed, this seems like less of a practical concern.
>>
>> HASZERO reads the word twice so this should be a problem for
>> unoptimized code on big-endian platforms.
>
> The number of abstract-machine reads is irrelevant unless we use
> volatile here. A good compiler will always reduce it to one read, and
> a bad compiler is always free to turn it into multiple reads.

Ok, I'll reformulate: is compiling musl on a big-endian platform with 
optimizations turned off officially supported?

-- 
Alexander Cherepanov


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-08 22:39       ` Alexander Cherepanov
@ 2016-01-08 22:59         ` Rich Felker
  2016-01-09  1:40           ` Szabolcs Nagy
  2016-01-12 12:41           ` Alexander Cherepanov
  0 siblings, 2 replies; 16+ messages in thread
From: Rich Felker @ 2016-01-08 22:59 UTC (permalink / raw)
  To: musl

On Sat, Jan 09, 2016 at 01:39:10AM +0300, Alexander Cherepanov wrote:
> >>>>this takes care of oob access, but the bytes outside the passed
> >>>>object might change concurrently i.e. strlen might introduce a
> >>>>data race: again this is a problem on the abstract c language
> >>>>level that may be solved e.g. by making all accesses to those
> >>>>bytes relaxed atomic, but user code is not under libc control.
> >>>>in practice the code works if HASZERO reads the word once so it
> >>>>does arithmetics with a consistent value (because the memory
> >>>>model of the underlying machine does not treat such race
> >>>>undefined and it does not propagate unspecified value bits nor
> >>>>has trap representations).
> >>>
> >>>Indeed, this seems like less of a practical concern.
> >>
> >>HASZERO reads the word twice so this should be a problem for
> >>unoptimized code on big-endian platforms.
> >
> >The number of abstract-machine reads is irrelevant unless we use
> >volatile here. A good compiler will always reduce it to one read, and
> >a bad compiler is always free to turn it into multiple reads.
> 
> Ok, I'll reformulate: is compiling musl on a big-endian platform
> with optimizations turned off officially supported?

Yes, and I don't see why you expect this case to break due to data
race issues.

Rich


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-08 22:59         ` Rich Felker
@ 2016-01-09  1:40           ` Szabolcs Nagy
  2016-01-12 12:41           ` Alexander Cherepanov
  1 sibling, 0 replies; 16+ messages in thread
From: Szabolcs Nagy @ 2016-01-09  1:40 UTC (permalink / raw)
  To: musl

* Rich Felker <dalias@libc.org> [2016-01-08 17:59:51 -0500]:
> On Sat, Jan 09, 2016 at 01:39:10AM +0300, Alexander Cherepanov wrote:
> > >>>>this takes care of oob access, but the bytes outside the passed
> > >>>>object might change concurrently i.e. strlen might introduce a
> > >>>>data race: again this is a problem on the abstract c language
> > >>>>level that may be solved e.g. by making all accesses to those
> > >>>>bytes relaxed atomic, but user code is not under libc control.
> > >>>>in practice the code works if HASZERO reads the word once so it
> > >>>>does arithmetics with a consistent value (because the memory
> > >>>>model of the underlying machine does not treat such race
> > >>>>undefined and it does not propagate unspecified value bits nor
> > >>>>has trap representations).
> > >>>
> > >>>Indeed, this seems like less of a practical concern.
> > >>
> > >>HASZERO reads the word twice so this should be a problem for
> > >>unoptimized code on big-endian platforms.
> > >
> > >The number of abstract-machine reads is irrelevant unless we use
> > >volatile here. A good compiler will always reduce it to one read, and
> > >a bad compiler is always free to turn it into multiple reads.
> > 
> > Ok, I'll reformulate: is compiling musl on a big-endian platform
> > with optimizations turned off officially supported?
> 
> Yes, and I don't see why you expect this case to break due to data
> race issues.

i didnt think it through, just assumed changing bits
would ruin the computation.

but it seems if the zero byte is unchanged then other
bits can arbitrarily change between the two evaluations
of x in

#define HASZERO(x) ((x)-ONES & ~(x) & HIGHS)

and the predicate is still true

..unless some silly code transformation is applied
like x -> x+x-x


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-08 22:59         ` Rich Felker
  2016-01-09  1:40           ` Szabolcs Nagy
@ 2016-01-12 12:41           ` Alexander Cherepanov
  1 sibling, 0 replies; 16+ messages in thread
From: Alexander Cherepanov @ 2016-01-12 12:41 UTC (permalink / raw)
  To: musl

On 2016-01-09 01:59, Rich Felker wrote:
> On Sat, Jan 09, 2016 at 01:39:10AM +0300, Alexander Cherepanov wrote:
>>>>>> this takes care of oob access, but the bytes outside the passed
>>>>>> object might change concurrently i.e. strlen might introduce a
>>>>>> data race: again this is a problem on the abstract c language
>>>>>> level that may be solved e.g. by making all accesses to those
>>>>>> bytes relaxed atomic, but user code is not under libc control.
>>>>>> in practice the code works if HASZERO reads the word once so it
>>>>>> does arithmetics with a consistent value (because the memory
>>>>>> model of the underlying machine does not treat such race
>>>>>> undefined and it does not propagate unspecified value bits nor
>>>>>> has trap representations).
>>>>>
>>>>> Indeed, this seems like less of a practical concern.
>>>>
>>>> HASZERO reads the word twice so this should be a problem for
>>>> unoptimized code on big-endian platforms.
>>>
>>> The number of abstract-machine reads is irrelevant unless we use
>>> volatile here. A good compiler will always reduce it to one read, and
>>> a bad compiler is always free to turn it into multiple reads.
>>
>> Ok, I'll reformulate: is compiling musl on a big-endian platform
>> with optimizations turned off officially supported?
>
> Yes, and I don't see why you expect this case to break due to data
> race issues.

Right, I was too fast. I checked that outside bytes affect computations 
on inside bytes (on BE platforms) but the effect is not strong enough to 
break the code. Sorry for the noise.

Perhaps a comment is warranted in this and other hairy cases (like qsort)?

As for a list of affected functions, the first approximation:

$ grep -rF HASZERO | grep -v '#define'
src/string/stpcpy.c:		for (; !HASZERO(*ws); *wd++ = *ws++);
src/string/memchr.c:		for (w = (const void *)s; n>=SS && !HASZERO(*w^k); 
w++, n-=SS);
src/string/strlen.c:	for (w = (const void *)s; !HASZERO(*w); w++);
src/string/strchrnul.c:	for (w = (void *)s; !HASZERO(*w) && 
!HASZERO(*w^k); w++);
src/string/memccpy.c:		for (; n>=sizeof(size_t) && !HASZERO(*ws^k);
src/string/strlcpy.c:			for (; n>=sizeof(size_t) && !HASZERO(*ws);
src/string/stpncpy.c:		for (; n>=sizeof(size_t) && !HASZERO(*ws);

HASZERO is not guarded by a size check in strpcpy, strlen and strchrnul.

-- 
Alexander Cherepanov


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-05 16:46 string word-at-a-time and atomic.h FAQ on twitter Szabolcs Nagy
  2016-01-05 17:50 ` Rich Felker
@ 2016-01-12 21:02 ` Alexander Cherepanov
  2016-01-12 21:09   ` Alexander Cherepanov
  1 sibling, 1 reply; 16+ messages in thread
From: Alexander Cherepanov @ 2016-01-12 21:02 UTC (permalink / raw)
  To: musl

On 2016-01-05 19:46, Szabolcs Nagy wrote:
> i think compiler attributes should be used here on compilers that
> might break the code, but there is no attribute for this kind of
> oob access yet (although may_alias attribute is missing here too
> and should be added like in other string functions).

Perhaps the noclone function attribute could be used in the meantime?

https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-g_t_0040code_007bnoclone_007d-function-attribute-3205

-- 
Alexander Cherepanov


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-12 21:02 ` Alexander Cherepanov
@ 2016-01-12 21:09   ` Alexander Cherepanov
  2016-01-12 23:07     ` Szabolcs Nagy
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Cherepanov @ 2016-01-12 21:09 UTC (permalink / raw)
  To: musl

On 2016-01-13 00:02, Alexander Cherepanov wrote:
> On 2016-01-05 19:46, Szabolcs Nagy wrote:
>> i think compiler attributes should be used here on compilers that
>> might break the code, but there is no attribute for this kind of
>> oob access yet (although may_alias attribute is missing here too
>> and should be added like in other string functions).
>
> Perhaps the noclone function attribute could be used in the meantime?
>
> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-g_t_0040code_007bnoclone_007d-function-attribute-3205

Probably together with the noinline attribute...

Another attribute which looks relevant is no_sanitize_address.

https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-g_t_0040code_007bno_005fsanitize_005faddress_007d-function-attribute-3199

-- 
Alexander Cherepanov


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-12 21:09   ` Alexander Cherepanov
@ 2016-01-12 23:07     ` Szabolcs Nagy
  2016-01-13 17:30       ` Szabolcs Nagy
  0 siblings, 1 reply; 16+ messages in thread
From: Szabolcs Nagy @ 2016-01-12 23:07 UTC (permalink / raw)
  To: musl

* Alexander Cherepanov <ch3root@openwall.com> [2016-01-13 00:09:56 +0300]:
> On 2016-01-13 00:02, Alexander Cherepanov wrote:
> >On 2016-01-05 19:46, Szabolcs Nagy wrote:
> >>i think compiler attributes should be used here on compilers that
> >>might break the code, but there is no attribute for this kind of
> >>oob access yet (although may_alias attribute is missing here too
> >>and should be added like in other string functions).
> >
> >Perhaps the noclone function attribute could be used in the meantime?
> >
> >https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-g_t_0040code_007bnoclone_007d-function-attribute-3205
> 
> Probably together with the noinline attribute...
> 
> Another attribute which looks relevant is no_sanitize_address.
> 
> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-g_t_0040code_007bno_005fsanitize_005faddress_007d-function-attribute-3199
> 

i think a no-lto attr should be used, maybe noinline
can achieve that.

no-instrumentation attrs are ugly because new
instrumentations may appear in new compilers and then
your code becomes decorated.
(libc does not work with asan instrumentation anyway
without serious changes and asan runtime implementation
inside the libc)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-12 23:07     ` Szabolcs Nagy
@ 2016-01-13 17:30       ` Szabolcs Nagy
  2016-01-14 12:49         ` Szabolcs Nagy
  2016-01-14 22:51         ` Rich Felker
  0 siblings, 2 replies; 16+ messages in thread
From: Szabolcs Nagy @ 2016-01-13 17:30 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 2528 bytes --]

* Szabolcs Nagy <nsz@port70.net> [2016-01-13 00:07:39 +0100]:
> * Alexander Cherepanov <ch3root@openwall.com> [2016-01-13 00:09:56 +0300]:
> > On 2016-01-13 00:02, Alexander Cherepanov wrote:
> > >On 2016-01-05 19:46, Szabolcs Nagy wrote:
> > >>i think compiler attributes should be used here on compilers that
> > >>might break the code, but there is no attribute for this kind of
> > >>oob access yet (although may_alias attribute is missing here too
> > >>and should be added like in other string functions).
> > >
> > >Perhaps the noclone function attribute could be used in the meantime?
> > >
> > >https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-g_t_0040code_007bnoclone_007d-function-attribute-3205
> > 
> > Probably together with the noinline attribute...
> > 
> > Another attribute which looks relevant is no_sanitize_address.
> > 
> > https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-g_t_0040code_007bno_005fsanitize_005faddress_007d-function-attribute-3199
> > 
> 
> i think a no-lto attr should be used, maybe noinline
> can achieve that.
> 

i tried to do it with -fno-lto

but it seems gcc-6 miscompiles musl with -flto anyway:

lto incorrectly dead code eliminates _dlstart_c.
(the libc entry point, _dlstart, is defined in toplevel
inline asm in ldso/dlstart.c and it jumps to _dlstart_c)

lto breaks symbol binding for environ, _environ, ___environ.
(they should be weak, without that environ in a main binary
has different address than in libc.so)

libc.so built with -flto:
$ readelf --dyn-syms -W libc.so |grep envi
    22: 000000000028eb90     8 OBJECT  GLOBAL DEFAULT   15 __environ
   398: 000000000028eb90     8 OBJECT  GLOBAL PROTECTED   15 ___environ
  1034: 000000000028eb90     8 OBJECT  GLOBAL PROTECTED   15 _environ
  1107: 000000000028eb90     8 OBJECT  GLOBAL DEFAULT   15 environ

libc.so without -flto:
$ readelf --dyn-syms -W libc.so |grep envi
    22: 000000000028d2d8     8 OBJECT  GLOBAL DEFAULT   15 __environ
   398: 000000000028d2d8     8 OBJECT  WEAK   PROTECTED   15 ___environ
  1034: 000000000028d2d8     8 OBJECT  WEAK   PROTECTED   15 _environ
  1107: 000000000028d2d8     8 OBJECT  WEAK   DEFAULT   15 environ


so i tried to -fno-lto to crt/*, dlstart.c and __environ.c
and then libc seemed to build correctly, but during tests
gcc lto1 ICE crashed. (i havent reported the bugs yet)

given these issues i'm not convinced that lto build of
libc is a good idea, but i attached a patch how the
string issues might be worked around.

[-- Attachment #2: lto.diff --]
[-- Type: text/x-diff, Size: 1779 bytes --]

diff --git a/Makefile b/Makefile
index df20f94..3586697 100644
--- a/Makefile
+++ b/Makefile
@@ -113,6 +113,15 @@ NOSSP_SRCS = $(wildcard crt/*.c) \
 	src/ldso/dlstart.c src/ldso/dynlink.c
 $(NOSSP_SRCS:%.c=%.o) $(NOSSP_SRCS:%.c=%.lo): CFLAGS_ALL += $(CFLAGS_NOSSP)
 
+# TODO: update the list when aliasing violations are fixed
+NOLTO_SRCS = $(wildcard crt/*.c) \
+	src/ldso/dlstart.c \
+	src/string/stpcpy.c src/string/strlen.c \
+	src/string/strchrnul.c src/string/memchr.c \
+	src/string/memccpy.c str/string/strlcpy.c \
+	src/string/strncpy.c
+$(NOLTO_SRCS:%.c=%.o) $(NOLTO_SRCS:%.c=%.lo): CFLAGS_ALL += $(CFLAGS_NOLTO)
+
 $(CRT_LIBS:lib/%=crt/%): CFLAGS_ALL += -DCRT
 
 # This incantation ensures that changes to any subarch asm files will
diff --git a/configure b/configure
index ee21771..4672ebb 100755
--- a/configure
+++ b/configure
@@ -113,6 +113,7 @@ CFLAGS_C99FSE=
 CFLAGS_AUTO=
 CFLAGS_MEMOPS=
 CFLAGS_NOSSP=
+CFLAGS_NOLTO=
 CFLAGS_TRY=
 LDFLAGS_AUTO=
 LDFLAGS_TRY=
@@ -344,6 +345,13 @@ tryflag CFLAGS_C99FSE -Wa,--noexecstack
 tryflag CFLAGS_NOSSP -fno-stack-protector
 
 #
+# Check for options to disable LTO, which is needed for executable
+# entry points and functions with aliasing violations.  If not found,
+# this is not an error; we assume the toolchain does not do LTO.
+#
+tryflag CFLAGS_NOLTO -fno-lto
+
+#
 # Check for options that may be needed to prevent the compiler from
 # generating self-referential versions of memcpy,, memmove, memcmp,
 # and memset. Really, we should add a check to determine if this
@@ -660,6 +668,7 @@ CFLAGS_AUTO = $CFLAGS_AUTO
 CFLAGS_C99FSE = $CFLAGS_C99FSE
 CFLAGS_MEMOPS = $CFLAGS_MEMOPS
 CFLAGS_NOSSP = $CFLAGS_NOSSP
+CFLAGS_NOLTO = $CFLAGS_NOLTO
 CPPFLAGS = $CPPFLAGS
 LDFLAGS = $LDFLAGS
 LDFLAGS_AUTO = $LDFLAGS_AUTO

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-13 17:30       ` Szabolcs Nagy
@ 2016-01-14 12:49         ` Szabolcs Nagy
  2016-01-14 22:51         ` Rich Felker
  1 sibling, 0 replies; 16+ messages in thread
From: Szabolcs Nagy @ 2016-01-14 12:49 UTC (permalink / raw)
  To: musl

* Szabolcs Nagy <nsz@port70.net> [2016-01-13 18:30:50 +0100]:
> 
> lto breaks symbol binding for environ, _environ, ___environ.
> (they should be weak, without that environ in a main binary
> has different address than in libc.so)
> 

all weak symbol aliases are affected on gcc-5 and gcc-trunk,
gcc-49 is fine (at least the 4.9.3 i tried).

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69271


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: string word-at-a-time and atomic.h FAQ on twitter
  2016-01-13 17:30       ` Szabolcs Nagy
  2016-01-14 12:49         ` Szabolcs Nagy
@ 2016-01-14 22:51         ` Rich Felker
  1 sibling, 0 replies; 16+ messages in thread
From: Rich Felker @ 2016-01-14 22:51 UTC (permalink / raw)
  To: musl

On Wed, Jan 13, 2016 at 06:30:50PM +0100, Szabolcs Nagy wrote:
> i tried to do it with -fno-lto
> 
> but it seems gcc-6 miscompiles musl with -flto anyway:
> 
> lto incorrectly dead code eliminates _dlstart_c.
> (the libc entry point, _dlstart, is defined in toplevel
> inline asm in ldso/dlstart.c and it jumps to _dlstart_c)

I think the files that receive "crt" treatment (-DCRT?) should be
hard-coded as no-LTO. Even if it weren't for this issue, there are
also potential issues with optimizations that might move code across
relocations and initialization. At the very least there should be a
full asm barrier between such init code and the rest of the program.

> lto breaks symbol binding for environ, _environ, ___environ.
> (they should be weak, without that environ in a main binary
> has different address than in libc.so)
> 
> libc.so built with -flto:
> $ readelf --dyn-syms -W libc.so |grep envi
>     22: 000000000028eb90     8 OBJECT  GLOBAL DEFAULT   15 __environ
>    398: 000000000028eb90     8 OBJECT  GLOBAL PROTECTED   15 ___environ
>   1034: 000000000028eb90     8 OBJECT  GLOBAL PROTECTED   15 _environ
>   1107: 000000000028eb90     8 OBJECT  GLOBAL DEFAULT   15 environ
> 
> libc.so without -flto:
> $ readelf --dyn-syms -W libc.so |grep envi
>     22: 000000000028d2d8     8 OBJECT  GLOBAL DEFAULT   15 __environ
>    398: 000000000028d2d8     8 OBJECT  WEAK   PROTECTED   15 ___environ
>   1034: 000000000028d2d8     8 OBJECT  WEAK   PROTECTED   15 _environ
>   1107: 000000000028d2d8     8 OBJECT  WEAK   DEFAULT   15 environ

This is purely a gcc bug and indicates LTO is not really ready for use
with libc, which means I don't see much sense in trying to support it
yet. Of course that doesn't mean I'm disinterested in fixing actual C
language usage errors (aliasing, etc.) that might break with LTO.
Rather I'm disinterested in LTO-specific hacks for working around bugs
in GCC's LTO. OTOH your patch is nice from the standpoint of being
able to test LTO and find and report the bugs.

Rich


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-01-14 22:51 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-05 16:46 string word-at-a-time and atomic.h FAQ on twitter Szabolcs Nagy
2016-01-05 17:50 ` Rich Felker
2016-01-05 23:39   ` Matthew Fernandez
2016-01-06  2:56     ` Szabolcs Nagy
2016-01-08 21:59   ` Alexander Cherepanov
2016-01-08 22:05     ` Rich Felker
2016-01-08 22:39       ` Alexander Cherepanov
2016-01-08 22:59         ` Rich Felker
2016-01-09  1:40           ` Szabolcs Nagy
2016-01-12 12:41           ` Alexander Cherepanov
2016-01-12 21:02 ` Alexander Cherepanov
2016-01-12 21:09   ` Alexander Cherepanov
2016-01-12 23:07     ` Szabolcs Nagy
2016-01-13 17:30       ` Szabolcs Nagy
2016-01-14 12:49         ` Szabolcs Nagy
2016-01-14 22:51         ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).