Welp, I think I finally managed to fix my implementation. It wasn't quite
what I had in mind, but it was the only method that seemed to work without
the bulky code that pthread_mutex_lock falls back on. It is, however,
slightly slower, so for now I would treat it as a fallback for systems that
don't provide a mutex. The solution I ended up with utilises
kill( getpid(), SIGCONT ) and an additional member to identify which thread
managed to get its pid_t in at the time of the claim; there's a rough
sketch of the claim idea at the bottom of this mail.

On Mon, 28 Feb 2022 at 16:07, Lee Shallis wrote:
>
> On Mon, 28 Feb 2022 at 15:51, Joakim Sindholt wrote:
> >
> > On Mon, 28 Feb 2022 14:43:36 +0000, Lee Shallis wrote:
> > > Seems the wait just wasn't long enough, at about 4 yields onwards the
> > > results become consistent success, I've attached the file I did the
> > > experiments in, I even tried it under -O3 and no exits were
> > > encountered, so yes my method works, just needs a bit more wait time
> > > for extreme cases
> >
> > Between the lines
> > > if ( !(shared->tid) )
> > and
> > > shared->tid = tid;
> > the kernel might suspend the running thread and allow the other to run,
> > or you might simply get unlucky and have the two threads do the checks
> > close enough to simultaneously that the memory hasn't been synchronized
> > yet. Either way you end up with both threads seeing that shared->tid is
> > zero and both of them writing their tids to it, and thus both enter the
> That's the point of the loop, to check it's the same as what they
> wrote, if it's not then it's either locked to another thread or empty,
> the point in doing the yield after the write is to allow that failure
> to occur, basically I'm using the race condition itself as the point
> of success, rather than expect the CPU to perform an atomic lock that
> could be just as broken as timing based locks, I already have my ideas
> on how to fix the need for many yields to need only 2, I'm about to
> try it now
> > critical section at the same time. And so the lock fails at the very
> > first hurdle: mutual exclusion. No amount of sleeping will make the bug
> > go away, only slightly more difficult to trigger.
>
> No it doesn't, think through the loop properly and you'll see that the
> concept is the best one to go with, implementation just needs a little
> work
> >
> > The point of the clock_nanosleep call was to force a reschedule while
> > holding the lock. This also increases the runtime inside the lock which
> > in this case increases the likelihood that the thread trying to take the
> > lock will be waiting for it and end up racing with the thread that
> > currently has it when it unlocks and tries to relock it.
>
> How so? It still takes time for the jump condition to be evaluated and
> the call to LockSiData to start, the other thread will already be in
> the call loop ready to lock it, I designed this function specifically
> around the idea that multiple threads could see an empty tid at the
> same time, that's the reason for the yield call, so that all those
> writes get in before the execution resumes.
> > Now that you've inserted lots of sched_yield()s your lock is not only
> > still broken (in more ways than the one we've been trying to get you to
> > understand) but also extremely slow.
> >
> > As a hint for your future education: the first (and far from only) thing
> > you'll need is compare-and-swap, aka. CAS.
> > You can read up on this class of bugs if you'd like. It's called "Time
> > Of Check to Time Of Use" or "TOCTOU" for short.
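
For reference, a lock built on the compare-and-swap mentioned above would
look roughly like the sketch below. This is only an illustration using C11
atomics; the SiData layout, the id argument and the UnlockSiData name are
assumptions made for the example, not the contents of the attached file.

    #include <stdatomic.h>
    #include <sched.h>

    typedef struct SiData
    {
        atomic_int tid; /* 0 = unclaimed, otherwise the claiming thread's id */
    } SiData;

    void LockSiData( SiData *shared, int tid )
    {
        int expected = 0;
        /* The CAS writes tid only if the slot is still 0, so the check and
         * the claim happen as one indivisible step, which closes the window
         * between "if ( !(shared->tid) )" and "shared->tid = tid;". */
        while ( !atomic_compare_exchange_weak( &shared->tid, &expected, tid ) )
        {
            expected = 0;  /* the failed CAS stored the current owner here */
            sched_yield(); /* let the holder run instead of spinning hot */
        }
    }

    void UnlockSiData( SiData *shared )
    {
        atomic_store( &shared->tid, 0 );
    }

The point is that no two threads can both see the slot as free, because the
test and the write are a single atomic operation rather than two separate
steps.
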
> >
> > I didn't even need to poke at the code this time as the code you sent
> > breaks just the same on my machine.
> >
> > I hope you'll learn from this.
> I hope you'll learn to think through the code before you speak out of
> your ass, the concept is perfect, it's only that my implementation of
> that concept isn't
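
For the fallback case described at the top of this mail, where a claim
member records which thread got its id in, the sketch above keeps the
claim itself down to one atomic step. Paired with that sketch and built
with -pthread, a quick two-thread check might look like this; again it is
only an illustration, the loop count and names are made up and it is not
the attached experiment:

    #include <pthread.h>
    #include <stdio.h>
    #include <stdint.h>

    static SiData shared  = { 0 }; /* the claim slot, 0 = unclaimed */
    static long   counter = 0;     /* only touched while holding the lock */

    static void *worker( void *arg )
    {
        int tid = (int)(intptr_t)arg; /* any nonzero id will do */
        for ( int i = 0; i < 1000000; ++i )
        {
            LockSiData( &shared, tid );
            ++counter;
            UnlockSiData( &shared );
        }
        return NULL;
    }

    int main( void )
    {
        pthread_t a, b;
        pthread_create( &a, NULL, worker, (void *)(intptr_t)1 );
        pthread_create( &b, NULL, worker, (void *)(intptr_t)2 );
        pthread_join( a, NULL );
        pthread_join( b, NULL );
        /* with mutual exclusion intact this prints exactly 2000000 */
        printf( "counter = %ld\n", counter );
        return 0;
    }

If the lock ever let both threads in at once, increments would get lost and
the final count would come out below 2000000, which is the mutual exclusion
failure Joakim describes above.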