Re: [musl] Suggestion for thread safety

From: Lee Shallis <gb2985@gmail.com>
To: musl@lists.openwall.com
Subject: Re: [musl] Suggestion for thread safety
Date: Wed, 2 Mar 2022 01:44:38 +0000
Message-ID: <CAOZ3c1q7m5wgryBYzoE1Y60guxXog-bkrG8qCz0tyxj3xSMENQ@mail.gmail.com>
In-Reply-To: <CAOZ3c1oc5EVdcEJBWnxeFkZ8wL+RBfqo-HUDv_-om2KTk5h4pQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3888 bytes --]

Welp, I think I finally managed to fix my implementation. It wasn't
quite what I had in mind, but it was the only method that seemed to
work without the bulky code pthread_mutex_lock falls back to. It is,
however, slightly slower, so for now I would treat it as a fallback for
systems that don't provide a mutex. The solution I ended up with uses
kill( getpid(), SIGCONT ) and an additional member to identify which
thread managed to get its pid_t in at the time of the claim.

On Mon, 28 Feb 2022 at 16:07, Lee Shallis <gb2985@gmail.com> wrote:
>
> On Mon, 28 Feb 2022 at 15:51, Joakim Sindholt <opensource@zhasha.com> wrote:
> >
> > On Mon, 28 Feb 2022 14:43:36 +0000, Lee Shallis <gb2985@gmail.com> wrote:
> > > Seems the wait just wasn't long enough, at about 4 yields onwards the
> > > results become consistent success, I've attached the file I did the
> > > experiments in, I even tried it under -O3 and no exits were
> > > encountered, so yes my method works, just needs a bit more wait time
> > > for extreme cases
> >
> > Between the lines
> > > if ( !(shared->tid) )
> > and
> > > shared->tid = tid;
> > the kernel might suspend the running thread and allow the other to run,
> > or you might simply get unlucky and have the two threads do the checks
> > close enough to simultaneously that the memory hasn't been synchronized
> > yet. Either way you end up with both threads seeing that shared->tid is
> > zero and both of them writing their tids to it, and thus both enter the
> That's the point of the loop, to check it's the same as what they
> wrote, if it's not then it's either locked to another thread or empty,
> the point in doing the yield after the write is to allow that failure
> to occur, basically I'm using the race condition itself as the point
> of success, rather than expect the CPU to perform an atomic lock that
> could be just as broken as timing based locks, I already have my ideas
> on how to fix the need for many yields to need only 2, I'm about to
> try it now
> > critical section at the same time. And so the lock fails at the very
> > first hurdle: mutual exclusion. No amount of sleeping will make the bug
> > go away, only slightly more difficult to trigger.
>
> No it doesn't, think through the loop properly and you'll see that the
> concept is the best one to go with, implementation just needs a little
> work
>
> > The point of the clock_nanosleep call was to force a reschedule while
> > holding the lock. This also increases the runtime inside the lock which
> > in this case increases the likelihood that the thread trying to take the
> > lock will be waiting for it and end up racing with the thread that
> > currently has it when it unlocks and tries to relock it.
>
> How so? It still takes time for the jump condition to be evaluated and
> the call to LockSiData to start, the other thread will already be in
> the call loop ready to lock it, I designed this function specifically
> around the idea that multiple threads could see an empty tid at the
> same time, that's the reason for the yield call, so that all those
> writes get in before the execution resumes.
>
> > Now that you've inserted lots of sched_yield()s your lock is not only
> > still broken (in more ways than the one we've been trying to get you to
> > understand) but also extremely slow.
> >
> > As a hint for your future education: the first (and far from only) thing
> > you'll need is compare-and-swap, aka. CAS.
> > You can read up on this class of bugs if you'd like. It's called "Time
> > Of Check to Time Of Use" or "TOCTOU" for short.
> >
> > I didn't even need to poke at the code this time as the code you sent
> > breaks just the same on my machine.
> >
> > I hope you'll learn from this.
>
> I hope you'll learn to think through the code before you speak out of
> your ass, the concept is perfect, it's only that my implementation of
> that concept isn't
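
For comparison, the compare-and-swap approach suggested in the quoted
reply can be sketched with C11 atomics. This is only a minimal
illustration, not code from this thread or from musl: the names
cas_lock, cas_lock_take, cas_lock_release and run_demo are made up for
the example, and it assumes a compiler that provides <stdatomic.h>. The
point it shows is that the check ("is the owner 0?") and the write
("make me the owner") are a single indivisible step, so there is no
window between them for another thread to slip into.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for this sketch: 0 means free, any other value
 * is the id of the current owner. */
typedef struct
{
	atomic_int owner;
} cas_lock;

static void cas_lock_take( cas_lock *l, int id )
{
	int expected = 0;
	/* Succeeds only if owner is still 0 at the instant of the write */
	while ( !atomic_compare_exchange_weak_explicit(
		&l->owner, &expected, id,
		memory_order_acquire, memory_order_relaxed ) )
	{
		/* A failed exchange stores the current owner in expected */
		expected = 0;
		sched_yield();
	}
}

static void cas_lock_release( cas_lock *l )
{
	atomic_store_explicit( &l->owner, 0, memory_order_release );
}

enum { DEMO_THREADS = 4, DEMO_ROUNDS = 50000 };

static cas_lock demo_lock;  /* static storage, so owner starts at 0 */
static long demo_counter;   /* plain long, guarded only by demo_lock */

static void *demo_worker( void *arg )
{
	int id = (int)(intptr_t)arg;
	for ( int i = 0; i < DEMO_ROUNDS; i++ )
	{
		cas_lock_take( &demo_lock, id );
		demo_counter++;
		cas_lock_release( &demo_lock );
	}
	return NULL;
}

/* Starts the workers, waits for them, returns the final count */
static long run_demo( void )
{
	pthread_t th[DEMO_THREADS];
	demo_counter = 0;
	for ( int i = 0; i < DEMO_THREADS; i++ )
	{
		int rc = pthread_create( &th[i], NULL, demo_worker,
			(void *)(intptr_t)(i + 1) );
		assert( rc == 0 );
		(void)rc;
	}
	for ( int i = 0; i < DEMO_THREADS; i++ )
		pthread_join( th[i], NULL );
	return demo_counter;
}
```

Calling run_demo() from a main() should return DEMO_THREADS *
DEMO_ROUNDS if no increment was lost. A sketch like this still spins
while waiting, which is why real mutex implementations pair the atomic
operation with a way to sleep (futexes on Linux).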

[-- Attachment #2: lock.c --]
[-- Type: text/x-csrc, Size: 4497 bytes --]

#define _GNU_SOURCE
#include <limits.h>
#include <stdbool.h>
#include <unistd.h>
#include <errno.h>
#include <linux/types.h>
#include <time.h>
#include <sys/resource.h>
#include <sched.h>
#include <setjmp.h>
#include <signal.h>
#include <pthread.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

//#define PRINT_LOCKS
//#define PRINT_ATTEMPTS
/* Seconds */
#define TIMED_TEST 0
/* Loops, not used if TIMED_TEST != 0 */
#define TRIES_TODO CLOCKS_PER_SEC

typedef unsigned int uint;
typedef unsigned long int ulong;
typedef struct _LOCK
{
	uint num;
	void *ud;
	struct timespec ts;
	volatile pid_t tid;
	volatile pid_t trying;
} LOCK;

volatile LOCK *_shared = NULL;

void lock_handler( int signal )
{
	/* We don't want the pointer we're working with to change midway
	 * through so we take a copy then work with that */
	volatile LOCK *shared = _shared;
	(void)signal;

	if ( !(shared->tid) )
		shared->tid = shared->trying;
#ifdef PRINT_ATTEMPTS
	flockfile( stdout );
	printf( "Thread %lu attempted lock\n", (ulong)(shared->trying) );
	funlockfile( stdout );
#endif
}

int LockSiData( LOCK *shared )
{
	int const sig = SIGCONT;
	pid_t tid = gettid(), was;
	struct sigaction this = {NULL}, prev = {NULL};

	/* The signal handler may run as soon as it is installed, so make
	 * sure _shared is non-NULL before calling sigaction() */
	_shared = shared;
	this.sa_handler = lock_handler;

	sigaction( sig, &this, &prev );

	for ( was = shared->tid; was != tid; was = shared->tid )
	{
		if ( !was )
		{
			shared->trying = tid;
			_shared = shared;
			kill( getpid(), sig );
		}
	}

	sigaction( sig, &prev, &this );

	clock_gettime( CLOCK_PROCESS_CPUTIME_ID, &(shared->ts) );
	shared->num++;

#ifdef PRINT_LOCKS
	flockfile( stdout );
	printf( "Thread %lu took lock\n", (ulong)tid );
	funlockfile( stdout );
#endif
	return 0;
}

int FreeSiData( LOCK *shared )
{
	pid_t tid = gettid();
	if ( shared->tid != tid )
		return 0;
	shared->num--;
	if ( shared->num )
		return 0;
#ifdef PRINT_LOCKS
	flockfile( stdout );
	printf( "Thread %lu released lock\n", (ulong)tid );
	funlockfile( stdout );
#endif
	shared->tid = (pid_t)0;
	return 0;
}

LOCK tlock = {0};
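/* Initialised explicitly rather than relying on zero-initialisation */
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;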

typedef int (*lock_cb)( void *ud );
typedef struct _TEST
{
	volatile uint quit;
	volatile uint data;
	void *ud;
	char *name;
	lock_cb lock;
	lock_cb free;
} TEST;

void* Abort( TEST *test, uint got, uint expected, clock_t start )
{
	ulong ticks = (ulong)(clock() - start);
	test->free( test->ud );
	flockfile( stdout );
	printf
	(
		"Thread %lu (lock%s) ended at %lu ticks, "
		"got = %u, expected %u\n",
		(ulong)gettid(), test->name, ticks, got, expected
	);
	funlockfile( stdout );
	exit(1);
	/* Prevents going further than expected */
	return test;
}

void* thread( void *ud )
{
	TEST *test = ud;
	uint got, expected;
	pid_t tid = gettid();
	clock_t start = clock(), end = start + (CLOCKS_PER_SEC * TIMED_TEST);
	struct timespec ts = {0};
	ts.tv_nsec = 1;
	(void)ud;

	flockfile( stdout );
	printf( "Thread %lu (lock%s)\n", (ulong)tid, test->name );
	funlockfile( stdout );

#if TIMED_TEST
	while ( end > clock() )
#else
	while ( test->quit < TRIES_TODO )
#endif
	{
		test->lock( test->ud );

		expected = 0;
		got = (test->data)++;
		if (got != expected)
			return Abort( test, got, expected, start );

		clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, 0);

		expected = 1;
		got = (test->data)--;
		if (got != expected)
			return Abort( test, got, expected, start );

		test->quit++;
		test->free( test->ud );
	}

	end = clock();
	flockfile( stdout );
	printf
	(
		"lock%s (%lu) took %5lu clock ticks\n",
		test->name, (ulong)tid, (ulong)(end - start)
	);
	funlockfile( stdout );
	return ud;
}

int main()
{
	pthread_t pt;
	int i;

	TEST *test;
	TEST tests[2] = {{0}};

	setbuf(stdout,NULL);

	test = tests;
	test->ud = &tlock;
	test->name = "sidata";
	test->lock = (lock_cb)LockSiData;
	test->free = (lock_cb)FreeSiData;

	test = tests + 1;
	test->ud = &mutex;
	test->name = "mutex";
	test->lock = (lock_cb)pthread_mutex_lock;
	test->free = (lock_cb)pthread_mutex_unlock;

	for (i = 0; i < 2; i++)
	{
		if ((errno = pthread_create(&pt, 0, thread, tests)) != 0 )
		{
			flockfile( stdout );
			printf("pthread_create failed: %m\n");
			funlockfile( stdout );
			return 1;
		}

		if ((errno = pthread_create(&pt, 0, thread, tests + 1)) != 0 )
		{
			flockfile( stdout );
			printf("pthread_create failed: %m\n");
			funlockfile( stdout );
			return 1;
		}
	}

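	/* Ends only the main thread so the workers keep running; nothing
	 * after pthread_exit() is reached, so the mutex is not destroyed */
	pthread_exit(0);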
}
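
A side note on the test harness above: it casts pthread_mutex_lock and
LockSiData to lock_cb, and calling a function through a pointer of an
incompatible type is undefined behaviour in ISO C, even though it tends
to work on common ABIs. A hedged alternative is to give each lock a
small adapter with exactly the lock_cb signature. The adapter names
below (mutex_lock_cb, mutex_free_cb, demo_roundtrip) are hypothetical
and not part of lock.c.

```c
#include <assert.h>
#include <pthread.h>

/* Same shape as the typedef in lock.c */
typedef int (*lock_cb)( void *ud );

/* Adapters whose type matches lock_cb exactly; the void * converts
 * to pthread_mutex_t * implicitly, so no function cast is needed */
static int mutex_lock_cb( void *ud )
{
	return pthread_mutex_lock( ud );
}

static int mutex_free_cb( void *ud )
{
	return pthread_mutex_unlock( ud );
}

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Locks and unlocks once through the callbacks; returns 0 on
 * success or the first error code encountered */
static int demo_roundtrip( void )
{
	lock_cb lock = mutex_lock_cb;
	lock_cb unlock = mutex_free_cb;
	int rc = lock( &demo_mutex );
	if ( rc != 0 )
		return rc;
	return unlock( &demo_mutex );
}
```

In main() the assignments would then read test->lock = mutex_lock_cb;
with no cast, and the same pattern would apply to LockSiData and
FreeSiData.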

Thread overview: 16+ messages
2022-02-21 11:36 Lee Shallis
2022-02-21 17:42 ` Markus Wichmann
2022-02-23  0:30   ` Lee Shallis
2022-02-23 18:57     ` Markus Wichmann
2022-02-23 20:06       ` Rich Felker
2022-02-26  9:56       ` Lee Shallis
2022-02-26 11:38         ` Joakim Sindholt
2022-02-27 23:32           ` Lee Shallis
2022-02-28  0:15             ` Rich Felker
2022-02-28  8:48             ` Joakim Sindholt
2022-02-28 14:43               ` Lee Shallis
2022-02-28 15:19                 ` Rich Felker
2022-02-28 15:50                 ` Joakim Sindholt
2022-02-28 16:07                   ` Lee Shallis
2022-03-02  1:44                     ` Lee Shallis [this message]
2022-02-23  1:19 ` Rich Felker
