mailing list of musl libc
* cpuset/affinity interfaces and TSX lock elision in musl
@ 2013-05-16 16:37 Daniel Cegiełka
  2013-05-16 20:36 ` Rich Felker
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Cegiełka @ 2013-05-16 16:37 UTC (permalink / raw)
  To: musl

1) Are there any plans to add support for cpuset/affinity interfaces?

2) The upcoming glibc will have support for TSX lock elision.

http://en.wikipedia.org/wiki/Transactional_Synchronization_Extensions

http://lwn.net/Articles/534761/

Is there any prospect that we can support TSX lock elision in musl?

Daniel



* Re: cpuset/affinity interfaces and TSX lock elision in musl
  2013-05-16 16:37 cpuset/affinity interfaces and TSX lock elision in musl Daniel Cegiełka
@ 2013-05-16 20:36 ` Rich Felker
  2013-05-17  4:49   ` Rob Landley
  2013-05-17  7:41   ` Daniel Cegiełka
  0 siblings, 2 replies; 11+ messages in thread
From: Rich Felker @ 2013-05-16 20:36 UTC (permalink / raw)
  To: musl

On Thu, May 16, 2013 at 06:37:01PM +0200, Daniel Cegiełka wrote:
> 1) Are there any plans to add support for cpuset/affinity interfaces?

I sat down to do it one day, and it was so ugly I got sick and put it
off again. Seriously. There's a huge abundance of CPU_*
macros/functions for manipulating abstract bitsets, but all "cpu set"
specific for no good reason.

If anyone wants to volunteer to do these, it would be a big relief to
me. Some caveats:

1. The glibc versions invoke UB by accessing past the end of the
__bits array in the macros that work with arbitrary-size sets. A
correct version would just cast the input pointer to a pointer to
unsigned long.

2. The glibc version has buggy overflow checks that were just fixed in
their git (so don't copy the buggy logic).

3. These macros are sufficiently complex that they probably qualify as
actual code (with copyrightable content) in header files, and I don't
like that. Maybe we should make them external functions instead?

Discussion of these points and other ideas for implementing them is
welcome.
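
As a rough illustration of caveats 1 and 2, here is a minimal sketch of
arbitrary-size set operations that go through a plain pointer to
unsigned long and bounds-check by division rather than multiplication.
All names (sketch_cpu_set and friends) are hypothetical and are not the
glibc or musl interfaces; the size argument is assumed to be a byte
count that is a multiple of sizeof(unsigned long).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical names for illustration; not the real CPU_*_S macros. */
#define SKETCH_LONG_BITS (CHAR_BIT * sizeof(unsigned long))

/* Return 1 if bit `cpu` fits in a set of `size` bytes.
 * Dividing cpu by CHAR_BIT avoids computing CHAR_BIT * size,
 * which could overflow for very large sizes. */
static int sketch_in_range(size_t cpu, size_t size)
{
    return cpu / CHAR_BIT < size;
}

/* Set bit `cpu`; out-of-range requests are silently ignored. */
static void sketch_cpu_set(size_t cpu, size_t size, void *set)
{
    /* Plain pointer to unsigned long: no fixed-size member array
     * is indexed past its end. */
    unsigned long *bits = set;
    assert(size % sizeof(unsigned long) == 0);
    if (sketch_in_range(cpu, size))
        bits[cpu / SKETCH_LONG_BITS] |= 1UL << (cpu % SKETCH_LONG_BITS);
}

/* Clear bit `cpu`; out-of-range requests are silently ignored. */
static void sketch_cpu_clr(size_t cpu, size_t size, void *set)
{
    unsigned long *bits = set;
    assert(size % sizeof(unsigned long) == 0);
    if (sketch_in_range(cpu, size))
        bits[cpu / SKETCH_LONG_BITS] &= ~(1UL << (cpu % SKETCH_LONG_BITS));
}

/* Return 1 if bit `cpu` is set, 0 if clear or out of range. */
static int sketch_cpu_isset(size_t cpu, size_t size, const void *set)
{
    const unsigned long *bits = set;
    if (!sketch_in_range(cpu, size))
        return 0;
    return (int)((bits[cpu / SKETCH_LONG_BITS] >> (cpu % SKETCH_LONG_BITS)) & 1);
}
```

Whether these would live in a header or as external functions (caveat 3)
is left open here; the sketch only shows the indexing and the range
check.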

> 2) The upcoming glibc will have support for TSX lock elision.
> 
> http://en.wikipedia.org/wiki/Transactional_Synchronization_Extensions
> 
> http://lwn.net/Articles/534761/
> 
> Is there any prospect that we can support TSX lock elision in musl?

I was involved in the discussions about lock elision on the glibc
mailing list, and from what I could gather, it's a pain to implement
and whether it brings you any benefit is questionable. Before making
any decision, I think we should wait to see some performance figures.

Rich



* Re: cpuset/affinity interfaces and TSX lock elision in musl
  2013-05-16 20:36 ` Rich Felker
@ 2013-05-17  4:49   ` Rob Landley
  2013-05-17  5:01     ` Rich Felker
  2013-05-17  7:41   ` Daniel Cegiełka
  1 sibling, 1 reply; 11+ messages in thread
From: Rob Landley @ 2013-05-17  4:49 UTC (permalink / raw)
  To: musl; +Cc: musl

On 05/16/2013 03:36:58 PM, Rich Felker wrote:
> On Thu, May 16, 2013 at 06:37:01PM +0200, Daniel Cegiełka wrote:
> > 1) Are there any plans to add support for cpuset/affinity
> > interfaces?
> 
> I sat down to do it one day, and it was so ugly I got sick and put it
> off again. Seriously. There's a huge abundance of CPU_*
> macros/functions for manipulating abstract bitsets, but all "cpu set"
> specific for no good reason.
> 
> If anyone wants to volunteer to do these, it would be a big relief to
> me. Some caveats:

Meh, the data format's trivial. It's just that the documentation is in  
an insane place, namely here:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/include/asm/bitops.h

(And no, it can't change because it would break existing binaries. Last  
I checked we still run binaries from 0.0.1 if you enable the ancient  
stuff. Alan Cox thwacked people who broke that.)

I ripped the glibc stuff out of my taskset implementation last year:

   http://landley.net/hg/toybox/rev/fb546cc2a022

And the new operations boil down to:

   int x = 255 & (mask[j/sizeof(long)] >> (8*(j&(sizeof(long)-1))));

   mask[j/(2*sizeof(long))] |= digit << 4*(j&((2*sizeof(long))-1));

And yes, all the endianness and word size and such work out right if  
you just compile that for the target in question.
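
For readers who want to try the two expressions above in isolation, here
is a small sketch that wraps them in functions. The function names are
made up for this illustration, and the code assumes 8-bit bytes and a
power-of-two sizeof(long), as the original expressions do. The only
change is a cast so the shifted digit is widened to unsigned long before
shifting.

```c
#include <stddef.h>

/* Hypothetical helper: return byte j of the mask, where j = 0 is the
 * least significant byte of mask[0]. Assumes CHAR_BIT == 8. */
static unsigned sketch_get_byte(const unsigned long *mask, size_t j)
{
    return (unsigned)(255 & (mask[j / sizeof(long)]
                             >> (8 * (j & (sizeof(long) - 1)))));
}

/* Hypothetical helper: OR a 4-bit value into hex-digit position j,
 * where j = 0 is the least significant digit of mask[0].
 * Each long holds 2*sizeof(long) hex digits. */
static void sketch_or_hex_digit(unsigned long *mask, size_t j, unsigned digit)
{
    mask[j / (2 * sizeof(long))] |=
        (unsigned long)(digit & 15) << (4 * (j & ((2 * sizeof(long)) - 1)));
}
```

Feeding the hex digits of a mask string in from least significant to
most significant fills the array one long at a time, and the byte
accessor reads it back in the same order.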

Rob


* Re: cpuset/affinity interfaces and TSX lock elision in musl
  2013-05-17  4:49   ` Rob Landley
@ 2013-05-17  5:01     ` Rich Felker
  2013-05-19  4:05       ` Rob Landley
  0 siblings, 1 reply; 11+ messages in thread
From: Rich Felker @ 2013-05-17  5:01 UTC (permalink / raw)
  To: musl

On Thu, May 16, 2013 at 11:49:11PM -0500, Rob Landley wrote:
> On 05/16/2013 03:36:58 PM, Rich Felker wrote:
> >On Thu, May 16, 2013 at 06:37:01PM +0200, Daniel Cegiełka wrote:
> >> 1) Are there any plans to add support for cpuset/affinity
> >> interfaces?
> >
> >I sat down to do it one day, and it was so ugly I got sick and put it
> >off again. Seriously. There's a huge abundance of CPU_*
> >macros/functions for manipulating abstract bitsets, but all "cpu set"
> >specific for no good reason.
> >
> >If anyone wants to volunteer to do these, it would be a big relief to
> >me. Some caveats:
> 
> Meh, the data format's trivial. It's just that the documentation is
> in an insane place, namely here:

It's also the exact same format as fd_set and sigset_t, i.e. the only
natural set implementation. What's frustrating is that we have to have
3+ sets of interfaces that do exactly the same thing...

Rich



* Re: cpuset/affinity interfaces and TSX lock elision in musl
  2013-05-16 20:36 ` Rich Felker
  2013-05-17  4:49   ` Rob Landley
@ 2013-05-17  7:41   ` Daniel Cegiełka
  2013-05-17 11:28     ` Szabolcs Nagy
  2013-05-19  4:12     ` Rob Landley
  1 sibling, 2 replies; 11+ messages in thread
From: Daniel Cegiełka @ 2013-05-17  7:41 UTC (permalink / raw)
  To: musl

Rich, Rob - thanks for the information. This is functionality that
is worth adding to musl sooner or later.

>> 2) The upcoming glibc will have support for TSX lock elision.
>>
>> http://en.wikipedia.org/wiki/Transactional_Synchronization_Extensions
>>
>> http://lwn.net/Articles/534761/
>>
>> Is there any prospect that we can support TSX lock elision in musl?
>
> I was involved in the discussions about lock elision on the glibc
> mailing list, and from what I could gather, it's a pain to implement
> and whether it brings you any benefit is questionable.

There is currently no hardware support, so the tests were done in the
emulator. It's too early to say there is no performance gain.

> Before making
> any decision, I think we should wait to see some performance figures.

musl is described as a libc for embedded systems (Raspberry Pi, small
routers, mobile etc.). Summing up: low-end hardware. I think musl is
the ideal solution for high-end HPC servers etc., so that's why we
should support innovative solutions (like TSX lock elision). We may
also ask manufacturers (such as Intel) for help with optimization
(they really help with glibc and gcc).

Best regards,
Daniel



* Re: cpuset/affinity interfaces and TSX lock elision in musl
  2013-05-17  7:41   ` Daniel Cegiełka
@ 2013-05-17 11:28     ` Szabolcs Nagy
  2013-05-17 17:29       ` Rich Felker
  2013-05-19  4:12     ` Rob Landley
  1 sibling, 1 reply; 11+ messages in thread
From: Szabolcs Nagy @ 2013-05-17 11:28 UTC (permalink / raw)
  To: musl

* Daniel Cegiełka <daniel.cegielka@gmail.com> [2013-05-17 09:41:18 +0200]:
> >> 2) The upcoming glibc will have support for TSX lock elision.
> >>
> >> http://en.wikipedia.org/wiki/Transactional_Synchronization_Extensions
> >>
> >> http://lwn.net/Articles/534761/
> >>
> >> Is there any prospect that we can support TSX lock elision in musl?
> >
> > I was involved in the discussions about lock elision on the glibc
> > mailing list, and from what I could gather, it's a pain to implement
> > and whether it brings you any benefit is questionable.
> 
> There is currently no hardware support, so the tests were done in the
> emulator. It's too early to say there is no performance gain.
> 

it's not the lock performance that's questionable
but the benefits

locks should not be the bottleneck in applications
unless there is too much shared state on hot paths,
which is probably a design bug or a special use-case
for which non-standard synchronization methods may
be better anyway

for the implementation costs check the glibc
discussion where rich pointed out conformance
issues in the original design

http://article.gmane.org/gmane.comp.lib.glibc.alpha/29240



* Re: cpuset/affinity interfaces and TSX lock elision in musl
  2013-05-17 11:28     ` Szabolcs Nagy
@ 2013-05-17 17:29       ` Rich Felker
  2013-05-19  4:40         ` Rob Landley
  0 siblings, 1 reply; 11+ messages in thread
From: Rich Felker @ 2013-05-17 17:29 UTC (permalink / raw)
  To: musl

On Fri, May 17, 2013 at 01:28:02PM +0200, Szabolcs Nagy wrote:
> * Daniel Cegiełka <daniel.cegielka@gmail.com> [2013-05-17 09:41:18 +0200]:
> > >> 2) The upcoming glibc will have support for TSX lock elision.
> > >>
> > >> http://en.wikipedia.org/wiki/Transactional_Synchronization_Extensions
> > >>
> > >> http://lwn.net/Articles/534761/
> > >>
> > >> Is there any prospect that we can support TSX lock elision in musl?
> > >
> > > I was involved in the discussions about lock elision on the glibc
> > > mailing list, and from what I could gather, it's a pain to implement
> > > and whether it brings you any benefit is questionable.
> > 
> > There is currently no hardware support, so the tests were done in the
> > emulator. It's too early to say there is no performance gain.

I agree it's too early. That's why I said I'd like to wait and see
before doing anything. My view is that what glibc is doing is (1) an
experiment to see if it's worthwhile, and (2) a buzzword-compliance
gimmick whereby Linux vendors and Intel can show off that they have a
state-of-the-art new feature (regardless of whether it's useful).

> it's not the lock performance that's questionable
> but the benefits

Yes. An artificial benchmark to spam lock requests would not be that
interesting, and for real-world usage, it's a lot more questionable
whether lock elision would help or hurt. The canonical case where it
would hurt is:

1. Take lock
2. Do expensive computation
3. Output results via syscall
4. Release lock

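In this case, the expensive computation gets performed twice: the
syscall aborts the transaction, so the work is discarded and redone
with the lock actually held. It may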
be possible to avoid all of the costly cases by adaptively turning off
elision for particular locks (or of course by having the application
manually tune it, but that's hideous), with corresponding complexity
costs. Unless the _gains_ in the good cases are sufficiently
beneficial, however, I think that complexity would be misspent.
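
The four-step pattern above looks roughly like this in ordinary pthread
code. This is only a sketch of the shape of the problem: the names
sketch_expensive and sketch_report are invented for illustration, and
nothing here performs elision itself. With hardware elision, the write()
in step 3 would force the transaction to abort, so step 2 would run once
speculatively and once more under the real lock.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical lock protecting the computation and its output. */
static pthread_mutex_t sketch_io_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for an expensive computation: sum of i*i for i < n. */
static unsigned long sketch_expensive(unsigned long n)
{
    unsigned long acc = 0;
    for (unsigned long i = 0; i < n; i++)
        acc += i * i;
    return acc;
}

/* Returns 0 if the whole result line was written to fd, -1 otherwise. */
static int sketch_report(int fd, unsigned long n)
{
    char buf[32];

    pthread_mutex_lock(&sketch_io_lock);                 /* 1. take lock */
    unsigned long r = sketch_expensive(n);               /* 2. expensive computation */
    int len = snprintf(buf, sizeof buf, "%lu\n", r);
    ssize_t w = write(fd, buf, (size_t)len);             /* 3. output via syscall */
    pthread_mutex_unlock(&sketch_io_lock);               /* 4. release lock */

    return w == (ssize_t)len ? 0 : -1;
}
```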

In some sense, perhaps a better place for lock elision would be at the
_compiler_ level. If the compiler could analyze the code and determine
that there is an unconditional path from the lock to the corresponding
unlock with no intervening external calls (think: adding or removing
item from a linked list), it could add a code path that uses lock
elision rather than locking. However this seems to require intricate
cooperation between the compiler and library implementation, which is
unacceptable to me...
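
For contrast, the kind of critical section described above (a short path
from lock to unlock with no external calls in between) might look like
the following list push and pop. This is a sketch with invented names;
it uses a normal mutex and only shows the code shape that would be a
candidate for elision, not an elided implementation.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical intrusive singly linked list node. */
struct sketch_node {
    struct sketch_node *next;
    int value;
};

static pthread_mutex_t sketch_list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct sketch_node *sketch_head;

/* Push: two pointer stores between lock and unlock, no calls. */
static void sketch_push(struct sketch_node *n)
{
    pthread_mutex_lock(&sketch_list_lock);
    n->next = sketch_head;
    sketch_head = n;
    pthread_mutex_unlock(&sketch_list_lock);
}

/* Pop: returns the most recently pushed node, or NULL if empty. */
static struct sketch_node *sketch_pop(void)
{
    pthread_mutex_lock(&sketch_list_lock);
    struct sketch_node *n = sketch_head;
    if (n)
        sketch_head = n->next;
    pthread_mutex_unlock(&sketch_list_lock);
    return n;
}
```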

> locks should not be the bottleneck in applications
> unless there is too much shared state on hot paths,
> which is probably a design bug or a special use-case
> for which non-standard synchronization methods may
> be better anyway

One place where there is unfortunately a huge amount of shared state
is memory management; this is inevitable. Even if we don't use lock
elision for pthread locks, it might be worth considering using it
_internally_ in malloc when it's available. It's hard to say without
any measurements, but this might result in a malloc that beats
ptmalloc, etc. without any thread-local management.

Rich



* Re: cpuset/affinity interfaces and TSX lock elision in musl
  2013-05-17  5:01     ` Rich Felker
@ 2013-05-19  4:05       ` Rob Landley
  0 siblings, 0 replies; 11+ messages in thread
From: Rob Landley @ 2013-05-19  4:05 UTC (permalink / raw)
  To: musl; +Cc: musl

On 05/17/2013 12:01:26 AM, Rich Felker wrote:
> On Thu, May 16, 2013 at 11:49:11PM -0500, Rob Landley wrote:
> > On 05/16/2013 03:36:58 PM, Rich Felker wrote:
> > >On Thu, May 16, 2013 at 06:37:01PM +0200, Daniel Cegiełka wrote:
> > >> 1) Are there any plans to add support for cpuset/affinity
> > >> interfaces?
> > >
> > >I sat down to do it one day, and it was so ugly I got sick and put it
> > >off again. Seriously. There's a huge abundance of CPU_*
> > >macros/functions for manipulating abstract bitsets, but all "cpu set"
> > >specific for no good reason.
> > >
> > >If anyone wants to volunteer to do these, it would be a big relief to
> > >me. Some caveats:
> >
> > Meh, the data format's trivial. It's just that the documentation is
> > in an insane place, namely here:
> 
> It's also the exact same format as fd_set and sigset_t, i.e. the only
> natural set implementation. What's frustrating is that we have to have
> 3+ sets of interfaces that do exactly the same thing...

Inside the kernel it's all the same set interface. It's just glibc that  
decided to add layers.

Rob


* Re: cpuset/affinity interfaces and TSX lock elision in musl
  2013-05-17  7:41   ` Daniel Cegiełka
  2013-05-17 11:28     ` Szabolcs Nagy
@ 2013-05-19  4:12     ` Rob Landley
  1 sibling, 0 replies; 11+ messages in thread
From: Rob Landley @ 2013-05-19  4:12 UTC (permalink / raw)
  To: musl; +Cc: musl

On 05/17/2013 02:41:18 AM, Daniel Cegiełka wrote:
> Rich, Rob - thanks for the information. This is functionality that
> is worth adding to musl sooner or later.
> 
> >> 2) The upcoming glibc will have support for TSX lock elision.
> >>
> >> http://en.wikipedia.org/wiki/Transactional_Synchronization_Extensions
> >>
> >> http://lwn.net/Articles/534761/
> >>
> >> Is there any prospect that we can support TSX lock elision in musl?
> >
> > I was involved in the discussions about lock elision on the glibc
> > mailing list, and from what I could gather, it's a pain to implement
> > and whether it brings you any benefit is questionable.
> 
> There is currently no hardware support, so the tests were done in the
> emulator. It's too early to say there is no performance gain.
> 
> > Before making
> > any decision, I think we should wait to see some performance
> > figures.
> 
> musl is described as a libc for embedded systems (Raspberry Pi, small
> routers, mobile etc.). Summing up: low-end hardware.

Where is it described that way? That wasn't my impression: it was a  
simple generic C library for Linux and Android. We should be able to  
build desktops with it just fine.

> I think musl is
> the ideal solution for high-end HPC servers etc., so that's why we
> should support innovative solutions (like TSX lock elision).

HPC and embedded are closer to each other than either is to the  
desktop. They race to completion, we race to quiescence on underpowered  
hardware; neither has persistent processes but more of a batch  
mentality. We keep power consumption down to extend battery life, they  
keep power consumption down because heat dissipation costs more than  
the hardware...

Right now I'm in month 4 of a 6 month contract at Cray, the  
supercomputer company. (No, not the one SGI bought: it split in two  
when Seymour Cray retired and they gave him a $100 million research lab  
as a retirement present. The half SGI _didn't_ buy expanded back into  
the space and is again supercomputing in the big leagues.)

> We may
> also ask manufacturers (such as Intel) for help with optimization
> (they really help with glibc and gcc).

Worry about target-specific optimization after 1.0. Until then it's  
premature optimization. Right now target-independent optimization seems  
more interesting. (To me, anyway.)

Rob


* Re: cpuset/affinity interfaces and TSX lock elision in musl
  2013-05-17 17:29       ` Rich Felker
@ 2013-05-19  4:40         ` Rob Landley
  2013-05-19 20:51           ` Rich Felker
  0 siblings, 1 reply; 11+ messages in thread
From: Rob Landley @ 2013-05-19  4:40 UTC (permalink / raw)
  To: musl; +Cc: musl

On 05/17/2013 12:29:03 PM, Rich Felker wrote:
> > locks should not be the bottleneck in applications
> > unless there is too much shared state on hot paths,
> > which is probably a design bug or a special use-case
> > for which non-standard synchronization methods may
> > be better anyway
> 
> One place where there is unfortunately a huge amount of shared state
> is memory management; this is inevitable. Even if we don't use lock
> elision for pthread locks, it might be worth considering using it
> _internally_ in malloc when it's available. It's hard to say without
> any measurements, but this might result in a malloc that beats
> ptmalloc, etc. without any thread-local management.

I thought the point of futexes was that in the non-contention case you  
don't enter the kernel at all?

I really don't see how lock elision is supposed to improve upon that.  
If you're optimizing the contended case, something is wrong.
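
To make the futex point concrete, here is a minimal Linux-only sketch of
a three-state futex lock in the style commonly described for futex-based
mutexes. It is not musl's or glibc's implementation; the type and
function names are invented, and it assumes the raw SYS_futex syscall is
available on the target. In the uncontended case both acquire and
release are a single atomic operation and never enter the kernel.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock type.
 * state: 0 = unlocked, 1 = locked, 2 = locked with possible waiters. */
typedef struct {
    atomic_int state;
} sketch_futex_lock;

/* Thin wrapper around the raw futex syscall (Linux-specific). */
static long sketch_futex(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void sketch_futex_acquire(sketch_futex_lock *l)
{
    int c = 0;

    /* Fast path: uncontended, one compare-and-swap, no syscall. */
    if (atomic_compare_exchange_strong(&l->state, &c, 1))
        return;

    /* Slow path: mark the lock contended, then sleep in the kernel
     * until the holder wakes us, and retry. */
    if (c != 2)
        c = atomic_exchange(&l->state, 2);
    while (c != 0) {
        sketch_futex(&l->state, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(&l->state, 2);
    }
}

static void sketch_futex_release(sketch_futex_lock *l)
{
    /* Fast path: if the old state was 1 there are no waiters and no
     * syscall is made. Only state 2 requires a wake-up. */
    if (atomic_exchange(&l->state, 0) == 2)
        sketch_futex(&l->state, FUTEX_WAKE_PRIVATE, 1);
}
```

Even on the fast path, the atomic operation still writes to the cache
line holding the lock word.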

Rob


* Re: cpuset/affinity interfaces and TSX lock elision in musl
  2013-05-19  4:40         ` Rob Landley
@ 2013-05-19 20:51           ` Rich Felker
  0 siblings, 0 replies; 11+ messages in thread
From: Rich Felker @ 2013-05-19 20:51 UTC (permalink / raw)
  To: musl

On Sat, May 18, 2013 at 11:40:32PM -0500, Rob Landley wrote:
> On 05/17/2013 12:29:03 PM, Rich Felker wrote:
> >> locks should not be the bottleneck in applications
> >> unless there is too much shared state on hot paths,
> >> which is probably a design bug or a special use-case
> >> for which non-standard synchronization methods may
> >> be better anyway
> >
> >One place where there is unfortunately a huge amount of shared state
> >is memory management; this is inevitable. Even if we don't use lock
> >elision for pthread locks, it might be worth considering using it
> >_internally_ in malloc when it's available. It's hard to say without
> >any measurements, but this might result in a malloc that beats
> >ptmalloc, etc. without any thread-local management.
> 
> I thought the point of futexes was that in the non-contention case
> you don't enter the kernel at all?
> 
> I really don't see how lock elision is supposed to improve upon
> that. If you're optimizing the contended case, something is wrong.

Yes, that "something" is C++ (and by extension, glib, which might as
well be C++ but worse). But we're not in a position to fix it.

Rich


