mailing list of musl libc
 help / color / mirror / code / Atom feed
* Solving the recursive memcpy/memset/etc. issue
@ 2013-08-01  0:49 Rich Felker
  2013-08-01  6:05 ` Luca Barbato
  2013-08-02  2:52 ` Rich Felker
  0 siblings, 2 replies; 7+ messages in thread
From: Rich Felker @ 2013-08-01  0:49 UTC (permalink / raw)
  To: musl

OK, so now that it's hit us for real, what should we do about GCC
generating code for memcpy, memset, etc. which might contain infinite
recursion? Aside from the ARM issue (which was separate), we know the
option causing this bad code generation, and it can be disabled via
-fno-tree-loop-distribute-patterns. However, if GCC policy is that
they consider the compiler entitled to generate calls to
memcpy/memset/memmove/memcmp whenever it wants, then we're just going
to be playing whack-a-mole.

The only fully viable option I see is replacing the code for these
functions with code that uses volatile objects so as to make
optimization utterly impossible. This will of course make them
incredibly slow, but at least we would have safe, working C code, and
we could add asm for each supported arch.

An alternative might be to test the compiler in configure to determine
if, with the selected CFLAGS, it generates recursive code for these
functions, and if so, defining a macro that causes musl to revert to
the volatile code.

Other ideas? For now, if -fno-tree-loop-distribute-patterns fixes it
(still waiting on confirmation for this) I'm going to commit that to
configure, but it doesn't seem like a viable long-term solution.

My ideal outcome would be a promise from the GCC developers that, in
future GCC versions, -ffreestanding implies disabling any options
which would generate calls to the mem* functions. However that sounds
unlikely.

Rich


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Solving the recursive memcpy/memset/etc. issue
  2013-08-01  0:49 Solving the recursive memcpy/memset/etc. issue Rich Felker
@ 2013-08-01  6:05 ` Luca Barbato
  2013-08-01  6:20   ` Rich Felker
  2013-08-02  2:52 ` Rich Felker
  1 sibling, 1 reply; 7+ messages in thread
From: Luca Barbato @ 2013-08-01  6:05 UTC (permalink / raw)
  To: musl

On 01/08/13 02:49, Rich Felker wrote:
> OK, so now that it's hit us for real, what should we do about GCC
> generating code for memcpy, memset, etc. which might contain infinite
> recursion? Aside from the ARM issue (which was separate), we know the
> option causing this bad code generation, and it can be disabled via
> -fno-tree-loop-distribute-patterns. However, if GCC policy is that
> they consider the compiler entitled to generate calls to
> memcpy/memset/memmove/memcmp whenever it wants, then we're just going
> to be playing whack-a-mole.

Sounds lovely.

> The only fully viable option I see is replacing the code for these
> functions with code that uses volatile objects so as to make
> optimization utterly impossible. This will of course make them
> incredibly slow, but at least we would have safe, working C code, and
> we could add asm for each supported arch.

Not exactly great.

> An alternative might be to test the compiler in configure to determine
> if, with the selected CFLAGS, it generates recursive code for these
> functions, and if so, defining a macro that causes musl to revert to
> the volatile code.

Sounds much better.

> Other ideas? For now, if -fno-tree-loop-distribute-patterns fixes it
> (still waiting on confirmation for this) I'm going to commit that to
> configure, but it doesn't seem like a viable long-term solution.

I'd rather check and error out reporting the compiler is broken. Then
have an explicit configure option to try to workaround it.

> My ideal outcome would be a promise from the GCC developers that, in
> future GCC versions, -ffreestanding implies disabling any options
> which would generate calls to the mem* functions. However that sounds
> unlikely.

They have competition, if clang works better then we could just suggest
to use it and nowadays gcc has no deployment advantage to it anymore.


lu


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Solving the recursive memcpy/memset/etc. issue
  2013-08-01  6:05 ` Luca Barbato
@ 2013-08-01  6:20   ` Rich Felker
  2013-08-01  8:03     ` Luca Barbato
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2013-08-01  6:20 UTC (permalink / raw)
  To: musl

On Thu, Aug 01, 2013 at 08:05:01AM +0200, Luca Barbato wrote:
> > The only fully viable option I see is replacing the code for these
> > functions with code that uses volatile objects so as to make
> > optimization utterly impossible. This will of course make them
> > incredibly slow, but at least we would have safe, working C code, and
> > we could add asm for each supported arch.
> 
> Not exactly great.

Well, we really need to add the arch asm anyway, as ugly as it is.
Right now most archs have memcpy running 2-5x slower than it should. I
could _try_ writing C to handle the unaligned (hard) cases well,
basically mimicing what the proposed asm for arm does, but I don't
think it will be competitive, just "not as slow". And we'd still have
to worry about it getting miscompiled...

> > An alternative might be to test the compiler in configure to determine
> > if, with the selected CFLAGS, it generates recursive code for these
> > functions, and if so, defining a macro that causes musl to revert to
> > the volatile code.
> 
> Sounds much better.

Well, it would be an ugly heuristic like running cc -S -o - on
src/string/memcpy.c, with -Dmemcpy=noname or something, and grepping
the output for memcpy...

> > Other ideas? For now, if -fno-tree-loop-distribute-patterns fixes it
> > (still waiting on confirmation for this) I'm going to commit that to
> > configure, but it doesn't seem like a viable long-term solution.
> 
> I'd rather check and error out reporting the compiler is broken. Then
> have an explicit configure option to try to workaround it.

If it were just a temporary regression, I would agree, but I think the
GCC position is that this is not a bug...

> > My ideal outcome would be a promise from the GCC developers that, in
> > future GCC versions, -ffreestanding implies disabling any options
> > which would generate calls to the mem* functions. However that sounds
> > unlikely.
> 
> They have competition, if clang works better then we could just suggest
> to use it and nowadays gcc has no deployment advantage to it anymore.

I figured someone would say that, and almost put a preemptive note in
my post. clang/LLVM was the first to have this sort of bug of ignoring
-ffreestanding, only much worse, making invalid assumptions about the
result value of malloc inside the malloc implementation... Competition
is unfortunately the source of our woes, not the solution. GCC and
clang/LLVM are facing competition to be the best at compiling
application code, and since compiling the implementation itself is an
unusual, unexciting usage case, nobody's really watching out for how
they break that one in the race to have the fastest application
code...

Rich


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Solving the recursive memcpy/memset/etc. issue
  2013-08-01  6:20   ` Rich Felker
@ 2013-08-01  8:03     ` Luca Barbato
  2013-08-01  8:26       ` Jens Gustedt
  0 siblings, 1 reply; 7+ messages in thread
From: Luca Barbato @ 2013-08-01  8:03 UTC (permalink / raw)
  To: musl

On 01/08/13 08:20, Rich Felker wrote:
> On Thu, Aug 01, 2013 at 08:05:01AM +0200, Luca Barbato wrote:
>>> The only fully viable option I see is replacing the code for
>>> these functions with code that uses volatile objects so as to
>>> make optimization utterly impossible. This will of course make
>>> them incredibly slow, but at least we would have safe, working C
>>> code, and we could add asm for each supported arch.
>> 
>> Not exactly great.
> 
> Well, we really need to add the arch asm anyway, as ugly as it is. 
> Right now most archs have memcpy running 2-5x slower than it should.
> I could _try_ writing C to handle the unaligned (hard) cases well, 
> basically mimicing what the proposed asm for arm does, but I don't 
> think it will be competitive, just "not as slow". And we'd still
> have to worry about it getting miscompiled...

I trust you on that.

>>> An alternative might be to test the compiler in configure to
>>> determine if, with the selected CFLAGS, it generates recursive
>>> code for these functions, and if so, defining a macro that causes
>>> musl to revert to the volatile code.
>> 
>> Sounds much better.
> 
> Well, it would be an ugly heuristic like running cc -S -o - on 
> src/string/memcpy.c, with -Dmemcpy=noname or something, and grepping 
> the output for memcpy...

Indeed.

>>> Other ideas? For now, if -fno-tree-loop-distribute-patterns fixes
>>> it (still waiting on confirmation for this) I'm going to commit
>>> that to configure, but it doesn't seem like a viable long-term
>>> solution.
>> 
>> I'd rather check and error out reporting the compiler is broken.
>> Then have an explicit configure option to try to workaround it.
> 
> If it were just a temporary regression, I would agree, but I think
> the GCC position is that this is not a bug...

Given that even glibc breaks on that...

> I figured someone would say that, and almost put a preemptive note
> in my post. clang/LLVM was the first to have this sort of bug of
> ignoring -ffreestanding, only much worse, making invalid assumptions
> about the result value of malloc inside the malloc implementation...

Understandable, clang is a "young" project. It is like expecting cparser
to be able to compile Libav.

> Competition is unfortunately the source of our woes, not the
> solution. GCC and clang/LLVM are facing competition to be the best at
> compiling application code, and since compiling the implementation
> itself is an unusual, unexciting usage case, nobody's really watching
> out for how they break that one in the race to have the fastest
> application code...

Implementors and distributors might have to say a thing or two about
that. If they manage to break singlehandedly musl, uclibc, glibc and
bionic and refuse to acknowledge the bug then I guess some people might
question their direction.

lu


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Solving the recursive memcpy/memset/etc. issue
  2013-08-01  8:03     ` Luca Barbato
@ 2013-08-01  8:26       ` Jens Gustedt
  2013-08-01 17:48         ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Gustedt @ 2013-08-01  8:26 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1339 bytes --]

Am Donnerstag, den 01.08.2013, 10:03 +0200 schrieb Luca Barbato:
> > Well, it would be an ugly heuristic like running cc -S -o - on 
> > src/string/memcpy.c, with -Dmemcpy=noname or something, and grepping 
> > the output for memcpy...
> 
> Indeed.

there might be the word memcpy in comments or whatever, depends on the
assembler format etc. might be really relatively difficult to set up
reliably.

Other options:

 - compile it with that -Dmemcpy=noname trick to a normal .o and use
   nm to look for an undefined symbol "memcpy".

 - use a different name for the implementation of the function from
   the start, __musl_memcpy or so. Then you could do the check for the
   memcpy symbol on the normally compiled .o and (if everything is ok)
   rename the __musl_memcpy to memcpy with some linker trick.

 - put "#define memcpy __musl_memcpy" at the start of string.h
   if __musl_memcpy would refer to memcpy this would give an link time
   error for an undefined symbol.

Jens

-- 
:: INRIA Nancy Grand Est :: http://www.loria.fr/~gustedt/   ::
:: AlGorille ::::::::::::::: office Nancy : +33 383593090   ::
:: ICube :::::::::::::: office Strasbourg : +33 368854536   ::
:: ::::::::::::::::::::::::::: gsm France : +33 651400183   ::
:: :::::::::::::::::::: gsm international : +49 15737185122 ::



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Solving the recursive memcpy/memset/etc. issue
  2013-08-01  8:26       ` Jens Gustedt
@ 2013-08-01 17:48         ` Rich Felker
  0 siblings, 0 replies; 7+ messages in thread
From: Rich Felker @ 2013-08-01 17:48 UTC (permalink / raw)
  To: musl

On Thu, Aug 01, 2013 at 10:26:55AM +0200, Jens Gustedt wrote:
> Am Donnerstag, den 01.08.2013, 10:03 +0200 schrieb Luca Barbato:
> > > Well, it would be an ugly heuristic like running cc -S -o - on 
> > > src/string/memcpy.c, with -Dmemcpy=noname or something, and grepping 
> > > the output for memcpy...
> > 
> > Indeed.
> 
> there might be the word memcpy in comments or whatever, depends on the
> assembler format etc. might be really relatively difficult to set up
> reliably.

Well as long as they're false-positives for detecting the bug rather
than false-negatives, at worst you'd get a slow memcpy rather than a
crashing libc.

>  - compile it with that -Dmemcpy=noname trick to a normal .o and use
>    nm to look for an undefined symbol "memcpy".

Using nm is undesirable with respect to cross-compiling. It requires
detecting the right version of nm to run, which may be hard; I don't
even have *-linux-musl-nm in my path on my laptop here, just the gcc.
Right now cross-compiling is as simple as setting CC to the cross
compiler, and I don't want to break that.

>  - use a different name for the implementation of the function from
>    the start, __musl_memcpy or so. Then you could do the check for the
>    memcpy symbol on the normally compiled .o and (if everything is ok)
>    rename the __musl_memcpy to memcpy with some linker trick.

This is a lot more ugliness in the build system and I'm not clear on
the benefit over the above methods.

>  - put "#define memcpy __musl_memcpy" at the start of string.h
>    if __musl_memcpy would refer to memcpy this would give an link time
>    error for an undefined symbol.

No linking is done to make libc.a. As for linking applications, the
symbol memcpy must exist. You are always permitted to remove macro
definitions and refer to the underlying function. Even if this were
valid, it would change the ABI in a very ugly hackish way that belongs
in glibc, not musl.

Rich


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Solving the recursive memcpy/memset/etc. issue
  2013-08-01  0:49 Solving the recursive memcpy/memset/etc. issue Rich Felker
  2013-08-01  6:05 ` Luca Barbato
@ 2013-08-02  2:52 ` Rich Felker
  1 sibling, 0 replies; 7+ messages in thread
From: Rich Felker @ 2013-08-02  2:52 UTC (permalink / raw)
  To: musl

On Wed, Jul 31, 2013 at 08:49:40PM -0400, Rich Felker wrote:
> OK, so now that it's hit us for real, what should we do about GCC
> generating code for memcpy, memset, etc. which might contain infinite
> recursion? Aside from the ARM issue (which was separate), we know the
> option causing this bad code generation, and it can be disabled via
> -fno-tree-loop-distribute-patterns. However, if GCC policy is that
> they consider the compiler entitled to generate calls to
> memcpy/memset/memmove/memcmp whenever it wants, then we're just going
> to be playing whack-a-mole.

For now, I solved it with just this option, and also added some more
asm for x86[_64].

> The only fully viable option I see is replacing the code for these
> functions with code that uses volatile objects so as to make
> optimization utterly impossible. This will of course make them
> incredibly slow, but at least we would have safe, working C code, and
> we could add asm for each supported arch.

I'd actually like to experiment with some ideas for volatile that
don't make the C slow. If the compiler really honors volatile, it
seems like you can write C code in such a way where the compiler has
to implement it pretty close to the abstract machine -- but you can be
selective about it, so as to allow the compiler some freedom.

> An alternative might be to test the compiler in configure to determine
> if, with the selected CFLAGS, it generates recursive code for these
> functions, and if so, defining a macro that causes musl to revert to
> the volatile code.

This is not in the workaround I committed, but could be added..

Rich


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-08-02  2:52 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-01  0:49 Solving the recursive memcpy/memset/etc. issue Rich Felker
2013-08-01  6:05 ` Luca Barbato
2013-08-01  6:20   ` Rich Felker
2013-08-01  8:03     ` Luca Barbato
2013-08-01  8:26       ` Jens Gustedt
2013-08-01 17:48         ` Rich Felker
2013-08-02  2:52 ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).