On 2022-09-03 07:48, Rich Felker wrote:
> On Sat, Sep 03, 2022 at 03:19:40AM +0300, Alexey Izbyshev wrote:
>> sem_open() attempts to avoid overflow on the future ref count 
>> increment
>> by computing the sum of all ref counts in semtab and checking that 
>> it's
>> not INT_MAX when semtab is locked for slot reservation.  This approach
>> suffers from a TOCTTOU: by the time semtab is re-locked after opening 
>> a
>> semaphore, the sum could have already been increased by a concurrent
>> sem_open(), so it will overflow on the increment. An individual ref
>> count can be overflowed as well if the call happened to open a 
>> duplicate
>> of the only other semaphore.
>> 
>> Moreover, since the overflow avoidance check admits a negative (i.e.
>> overflowed) sum, after this state is reached, an individual ref count
>> can be incremented until it reaches 1 again, and then sem_close() will
>> incorrectly free the semaphore, promoting what could be just a
>> signed-overflow UB to a use-after-free.
>> 
>> Fix the overflow check by accounting for the maximum possible number 
>> of
>> concurrent sem_open() calls that have already reserved a slot, but
>> haven't incremented the ref count yet.
>> ---
>> Alternatively, we could use the number of currently reserved slots
>> instead of SEM_NSEMS_MAX, preserving the ability of individual ref
>> counts to reach INT_MAX in non-concurrent scenarios. I'm not sure
>> whether it matters, so I'm sending a smaller of the two fixes.
>> 
>> Alexey
>> ---
>>  src/thread/sem_open.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/src/thread/sem_open.c b/src/thread/sem_open.c
>> index 0ad29de9..611a3f64 100644
>> --- a/src/thread/sem_open.c
>> +++ b/src/thread/sem_open.c
>> @@ -61,7 +61,7 @@ sem_t *sem_open(const char *name, int flags, ...)
>>  		if (!semtab[i].sem && slot < 0) slot = i;
>>  	}
>>  	/* Avoid possibility of overflow later */
>> -	if (cnt == INT_MAX || slot < 0) {
>> +	if (cnt > INT_MAX - SEM_NSEMS_MAX || slot < 0) {
>>  		errno = EMFILE;
>>  		UNLOCK(lock);
>>  		return SEM_FAILED;
>> --
>> 2.37.2
> 
> Thanks! This is probably acceptable at least relative to the current
> behavior, but thinking about it, the current behavior (this whole
> logic) doesn't really make sense. If flags are such that a new
> semaphore can't be created, the claim that there's no way to handle
> failure after opening is false; the operation is side-effect free and
> we can just back out. The only case where we can't back out is when
> we're creating a new semaphore, and in that case, it will never be
> incrementing the refcnt on an existing slot; it will always be a new
> slot. And even in this case, I think we could back out if we needed
> to, since the file is created initially with a temp name.

Indeed, my patch tried to fix the issue while staying within the wider 
logic of the current implementation. Since it makes the newly created 
semaphore visible via link() *before* re-locking semtab, the following 
scenario could happen:

* thread 1 calls sem_open("/sem", O_CREAT) and proceeds until after 
link() succeeds
* other threads successfully call sem_open("/sem", 0) INT_MAX times
* thread 1 re-locks semtab, discovers another slot with the i_no it 
created, but can't increment the refcnt

So in this case even though thread 1 did create a new semaphore, it 
still has to use an existing slot and hence can't back out (if it does, 
it'd look like semaphore creation didn't succeed, but opening the 
semaphore did).

If link() and the refcnt increment occur under the same lock, this issue 
can be avoided, as in the attached draft patch.

There is still a separate spurious failure case when after opening 
SEM_NSEMS_MAX semaphores an application can't open even an already 
opened one, but I don't think it's possible to completely fix in case 
O_CREAT is specified because we can't know whether we need to allocate a 
new slot or not before link() returns.

Thanks,
Alexey