Development discussion of WireGuard
From: Vlastimil Babka <vbabka@suse.cz>
To: paulmck@kernel.org
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>,
	"Uladzislau Rezki (Sony)" <urezki@gmail.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Julia Lawall <Julia.Lawall@inria.fr>,
	linux-block@vger.kernel.org, kernel-janitors@vger.kernel.org,
	bridge@lists.linux.dev, linux-trace-kernel@vger.kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	kvm@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	"Naveen N. Rao" <naveen.n.rao@linux.ibm.com>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	Nicholas Piggin <npiggin@gmail.com>,
	netdev@vger.kernel.org, wireguard@lists.zx2c4.com,
	linux-kernel@vger.kernel.org, ecryptfs@vger.kernel.org,
	Neil Brown <neilb@suse.de>, Olga Kornievskaia <kolga@netapp.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
	linux-nfs@vger.kernel.org, linux-can@vger.kernel.org,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
	kasan-dev <kasan-dev@googlegroups.com>
Subject: Re: [PATCH 00/14] replace call_rcu by kfree_rcu for simple kmem_cache_free callback
Date: Mon, 17 Jun 2024 23:34:04 +0200
Message-ID: <e7cbca4d-9b34-46f8-961a-9f8ddc92be21@suse.cz>
In-Reply-To: <1755282b-e3f5-4d18-9eab-fc6a29ca5886@paulmck-laptop>

On 6/17/24 8:54 PM, Paul E. McKenney wrote:
> On Mon, Jun 17, 2024 at 07:23:36PM +0200, Vlastimil Babka wrote:
>> On 6/17/24 6:12 PM, Paul E. McKenney wrote:
>>> On Mon, Jun 17, 2024 at 05:10:50PM +0200, Vlastimil Babka wrote:
>>>> On 6/13/24 2:22 PM, Jason A. Donenfeld wrote:
>>>>> On Wed, Jun 12, 2024 at 08:38:02PM -0700, Paul E. McKenney wrote:
>>>>>> o	Make the current kmem_cache_destroy() asynchronously wait for
>>>>>> 	all memory to be returned, then complete the destruction.
>>>>>> 	(This gets rid of a valuable debugging technique because
>>>>>> 	in normal use, it is a bug to attempt to destroy a kmem_cache
>>>>>> 	that has objects still allocated.)
>>>>
>>>> This seems like the best option to me. As Jason already said, the debugging
>>>> technique is not significantly affected if the warning just occurs
>>>> asynchronously later. The module can already be unloaded at that point; the
>>>> leak is never checked programmatically to control further execution anyway,
>>>> it's just a splat in dmesg.
>>>
>>> Works for me!
>>
>> Great. So this is what a prototype could look like, hopefully? The kunit test
>> does generate the splat for me, which should be because the rcu_barrier() in
>> the implementation (marked to be replaced with the real thing) really is
>> insufficient. Note the test itself passes, as this kind of error isn't wired
>> up properly.
> 
> ;-) ;-) ;-)

Yeah yeah, I just used the kunit module as a convenient way to add the code
that checks whether the splat appears :)
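
For illustration, a minimal sketch of what such a test could look like
(made-up struct and test names, not the actual code from the prototype):
allocate from a fresh cache, free the object with kfree_rcu(), and call
kmem_cache_destroy() before a grace period can have elapsed:

#include <kunit/test.h>
#include <linux/slab.h>
#include <linux/rcupdate.h>

struct delayed_obj {
	unsigned long payload;
	struct rcu_head rcu;
};

static void test_destroy_with_pending_kfree_rcu(struct kunit *test)
{
	struct kmem_cache *s;
	struct delayed_obj *p;

	s = kmem_cache_create("TestSlub_kfree_rcu", sizeof(*p), 0, 0, NULL);
	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, s);

	p = kmem_cache_alloc(s, GFP_KERNEL);
	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, p);

	/* the actual freeing is deferred past a grace period */
	kfree_rcu(p, rcu);

	/*
	 * With the prototype below this no longer warns right away:
	 * shutdown_cache() fails, the destroy work gets scheduled, and the
	 * splat (if any) comes from the worker later.
	 */
	kmem_cache_destroy(s);
}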

> Some might want confirmation that their cleanup efforts succeeded,
> but if so, I will let them make that known.

The kunit test might be the only thing to want that, but I don't see how
it could wrap and inspect the result of the async handling and suppress
the splats for intentionally triggered errors, as many of the other
tests do.
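
(For context, the pattern I mean is roughly the following, paraphrased
from lib/slub_kunit.c / mm/slub.c from memory, so details may be off:
intentional errors bump a counter through a named kunit resource instead
of printing a report, and the test asserts on the counter afterwards.
That only works while the test is still running, not for a WARN fired
later from a workqueue.)

static struct kunit_resource resource;
static int slab_errors;

static int test_init(struct kunit *test)
{
	slab_errors = 0;
	/*
	 * mm/slub.c's error paths look up the "slab_errors" resource and
	 * increment the counter instead of printing, when a kunit test
	 * is current.
	 */
	kunit_add_named_resource(test, NULL, NULL, &resource,
				 "slab_errors", &slab_errors);
	return 0;
}

static void example_case(struct kunit *test)
{
	/* ... intentionally trigger a slab error here ... */
	KUNIT_EXPECT_EQ(test, 1, slab_errors);
}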

>> Another thing to resolve is the marked comment about kasan_shutdown() with
>> potential kfree_rcu()'s in flight.
> 
> Could that simply move to the worker function?  (Hey, had to ask!)

I think I had a reason why not, but I guess it could move. It would just
mean that if any objects are quarantined, we'll go for the async freeing
even though those could be flushed immediately. Guess that's not too bad.
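
Something like this, I suppose, i.e. the workfn quoted further below with
the quarantine flush moved into it (untested, and assuming
kasan_cache_shutdown() is the hook the marked comment refers to):

static void kmem_cache_kfree_rcu_destroy_workfn(struct work_struct *work)
{
	struct kmem_cache *s;
	int err;
	bool rcu_set;

	s = container_of(work, struct kmem_cache, async_destroy_work);

	// XXX use the real kmem_cache_free_barrier() or similar thing here
	rcu_barrier();

	/*
	 * Moved here from kmem_cache_destroy(): once the real barrier is in
	 * place, any kfree_rcu()'d objects have been freed by this point, so
	 * flushing the quarantine can't race with them. Downside as noted
	 * above: a cache whose only remaining objects were quarantined now
	 * always takes the async path.
	 */
	kasan_cache_shutdown(s);

	cpus_read_lock();
	mutex_lock(&slab_mutex);

	rcu_set = s->flags & SLAB_TYPESAFE_BY_RCU;

	err = shutdown_cache(s, true);
	WARN(err, "kmem_cache_destroy %s: Slab cache still has objects",
	     s->name);

	mutex_unlock(&slab_mutex);
	cpus_read_unlock();
	if (!err && !rcu_set)
		kmem_cache_release(s);
}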

>> Also, you need CONFIG_SLUB_DEBUG enabled, otherwise node_nr_slabs() is a no-op
>> and the check might fail to notice the pending slabs. This will need to change.
> 
> Agreed.
> 
> Looks generally good.  A few questions below, to be taken with a
> grain of salt.

Thanks!

>> +static void kmem_cache_kfree_rcu_destroy_workfn(struct work_struct *work)
>> +{
>> +	struct kmem_cache *s;
>> +	int err = -EBUSY;
>> +	bool rcu_set;
>> +
>> +	s = container_of(work, struct kmem_cache, async_destroy_work);
>> +
>> +	// XXX use the real kmem_cache_free_barrier() or similar thing here
>> +	rcu_barrier();

Note here's the barrier.

>> +	cpus_read_lock();
>> +	mutex_lock(&slab_mutex);
>> +
>> +	rcu_set = s->flags & SLAB_TYPESAFE_BY_RCU;
>> +
>> +	err = shutdown_cache(s, true);
> 
> This is currently the only call to shutdown_cache()?  So there is to be
> a way for the caller to have some influence over the value of that bool?

Not the only caller; there's still the initial attempt in
kmem_cache_destroy() itself, below.

> 
>> +	WARN(err, "kmem_cache_destroy %s: Slab cache still has objects",
>> +	     s->name);
> 
> Don't we want to have some sort of delay here?  Or is this the
> 21-second delay and/or kfree_rcu_barrier() mentioned before?

Yes, this is after the barrier. The first, immediate attempt to shut down
doesn't warn.
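
Maybe a timeline makes this clearer (a sketch, assuming the prototype as
quoted here):

/*
 * kmem_cache_destroy(s)
 *   shutdown_cache(s, false)     -> fails (objects still pending), no WARN
 *   schedule_work(&s->async_destroy_work)
 *
 * ... grace period(s) elapse, kfree_rcu() callbacks free the objects ...
 *
 * kmem_cache_kfree_rcu_destroy_workfn()
 *   rcu_barrier()                -> stand-in for the real kfree_rcu barrier
 *   shutdown_cache(s, true)      -> WARNs only if objects *still* remain
 */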

>> +	mutex_unlock(&slab_mutex);
>> +	cpus_read_unlock();
>> +	if (!err && !rcu_set)
>> +		kmem_cache_release(s);
>> +}
>> +
>>  void kmem_cache_destroy(struct kmem_cache *s)
>>  {
>>  	int err = -EBUSY;
>> @@ -494,9 +527,9 @@ void kmem_cache_destroy(struct kmem_cache *s)
>>  	if (s->refcount)
>>  		goto out_unlock;
>>  
>> -	err = shutdown_cache(s);
>> -	WARN(err, "%s %s: Slab cache still has objects when called from %pS",
>> -	     __func__, s->name, (void *)_RET_IP_);
>> +	err = shutdown_cache(s, false);
>> +	if (err)
>> +		schedule_work(&s->async_destroy_work);

And here's the initial attempt, which used to warn but now doesn't;
instead it schedules the async one.

>>  out_unlock:
>>  	mutex_unlock(&slab_mutex);
>>  	cpus_read_unlock();
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 1617d8014ecd..4d435b3d2b5f 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -5342,7 +5342,8 @@ static void list_slab_objects(struct kmem_cache *s, struct slab *slab,
>>   * This is called from __kmem_cache_shutdown(). We must take list_lock
>>   * because sysfs file might still access partial list after the shutdowning.
>>   */
>> -static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
>> +static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n,
>> +			 bool warn_inuse)
>>  {
>>  	LIST_HEAD(discard);
>>  	struct slab *slab, *h;
>> @@ -5353,7 +5354,7 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
>>  		if (!slab->inuse) {
>>  			remove_partial(n, slab);
>>  			list_add(&slab->slab_list, &discard);
>> -		} else {
>> +		} else if (warn_inuse) {
>>  			list_slab_objects(s, slab,
>>  			  "Objects remaining in %s on __kmem_cache_shutdown()");
>>  		}
>> @@ -5378,7 +5379,7 @@ bool __kmem_cache_empty(struct kmem_cache *s)
>>  /*
>>   * Release all resources used by a slab cache.
>>   */
>> -int __kmem_cache_shutdown(struct kmem_cache *s)
>> +int __kmem_cache_shutdown(struct kmem_cache *s, bool warn_inuse)
>>  {
>>  	int node;
>>  	struct kmem_cache_node *n;
>> @@ -5386,7 +5387,7 @@ int __kmem_cache_shutdown(struct kmem_cache *s)
>>  	flush_all_cpus_locked(s);
>>  	/* Attempt to free all objects */
>>  	for_each_kmem_cache_node(s, node, n) {
>> -		free_partial(s, n);
>> +		free_partial(s, n, warn_inuse);
>>  		if (n->nr_partial || node_nr_slabs(n))
>>  			return 1;
>>  	}
>>

Thread overview: 68+ messages
2024-06-09  8:27 Julia Lawall
2024-06-09  8:27 ` [PATCH 01/14] wireguard: allowedips: " Julia Lawall
2024-06-09 14:32   ` Jason A. Donenfeld
2024-06-09 14:36     ` Julia Lawall
2024-06-10 20:38     ` Vlastimil Babka
2024-06-10 20:59       ` Jason A. Donenfeld
2024-06-12 21:33 ` [PATCH 00/14] " Jakub Kicinski
2024-06-12 22:37   ` Paul E. McKenney
2024-06-12 22:46     ` Jakub Kicinski
2024-06-12 22:52     ` Jens Axboe
2024-06-12 23:04       ` Paul E. McKenney
2024-06-12 23:31     ` Jason A. Donenfeld
2024-06-13  0:31       ` Jason A. Donenfeld
2024-06-13  3:38         ` Paul E. McKenney
2024-06-13 12:22           ` Jason A. Donenfeld
2024-06-13 12:46             ` Paul E. McKenney
2024-06-13 14:11               ` Jason A. Donenfeld
2024-06-13 15:12                 ` Paul E. McKenney
2024-06-17 15:10             ` Vlastimil Babka
2024-06-17 16:12               ` Paul E. McKenney
2024-06-17 17:23                 ` Vlastimil Babka
2024-06-17 18:42                   ` Uladzislau Rezki
2024-06-17 21:08                     ` Vlastimil Babka
2024-06-18  9:31                       ` Uladzislau Rezki
2024-06-18 16:48                         ` Paul E. McKenney
2024-06-18 17:21                           ` Vlastimil Babka
2024-06-18 17:53                             ` Paul E. McKenney
2024-06-19  9:28                               ` Vlastimil Babka
2024-06-19 16:46                                 ` Paul E. McKenney
2024-06-21  9:32                                 ` Uladzislau Rezki
2024-07-15 20:39                                   ` Vlastimil Babka
2024-07-24 13:53                                     ` Paul E. McKenney
2024-07-24 14:40                                       ` Vlastimil Babka
2024-10-08 16:41                                       ` Vlastimil Babka
2024-10-08 20:02                                         ` Paul E. McKenney
2024-10-09 17:08                                           ` Julia Lawall
2024-10-09 21:02                                             ` Paul E. McKenney
2024-06-19  9:51                           ` Uladzislau Rezki
2024-06-19  9:56                             ` Vlastimil Babka
2024-06-19 11:22                               ` Uladzislau Rezki
2024-06-17 18:54                   ` Paul E. McKenney
2024-06-17 21:34                     ` Vlastimil Babka [this message]
2024-06-13 14:17           ` Jakub Kicinski
2024-06-13 14:53             ` Paul E. McKenney
2024-06-13 11:58     ` Jason A. Donenfeld
2024-06-13 12:47       ` Paul E. McKenney
2024-06-13 13:06         ` Uladzislau Rezki
2024-06-13 15:06           ` Paul E. McKenney
2024-06-13 17:38             ` Uladzislau Rezki
2024-06-13 17:45               ` Paul E. McKenney
2024-06-13 17:58                 ` Uladzislau Rezki
2024-06-13 18:13                   ` Paul E. McKenney
2024-06-14 12:35                     ` Uladzislau Rezki
2024-06-14 14:17                       ` Paul E. McKenney
2024-06-14 14:50                         ` Uladzislau Rezki
2024-06-14 19:33                       ` Jason A. Donenfeld
2024-06-17 13:50                         ` Uladzislau Rezki
2024-06-17 14:56                           ` Jason A. Donenfeld
2024-06-17 16:30                             ` Uladzislau Rezki
2024-06-17 16:33                               ` Jason A. Donenfeld
2024-06-17 16:38                                 ` Vlastimil Babka
2024-06-17 17:04                                   ` Jason A. Donenfeld
2024-06-17 21:19                                     ` Vlastimil Babka
2024-06-17 16:42                                 ` Uladzislau Rezki
2024-06-17 16:57                                   ` Jason A. Donenfeld
2024-06-17 17:19                                     ` Uladzislau Rezki
2024-06-17 14:37                         ` Vlastimil Babka
2024-10-08 16:36 ` Vlastimil Babka
