From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <7d7ee33e0ce036134194a640c6b2cc81@gmx.de> Date: Wed, 29 Feb 2012 15:14:57 +0100 From: cinap_lenrek@gmx.de To: 9fans@9fans.net In-Reply-To: <1439e4e1393f1910df845c17ed9eb0bc@hamnavoe.com> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Subject: Re: [9fans] duppage Topicbox-Message-UUID: 65c1a3dc-ead7-11e9-9d60-3106f5b1d025 no, this is different. the XXX - there's a bug is about the new page that duppage() makes. it explains a race where the new page is on the freelist and the image cache. whats important is that when someone (lookpage) locks the page, its refcount and image/daddr is consistent wich should be the case as far as i can see. if you are paranoid, you can just check in newpage() auxpage() and duppage() what the refcount of the page is you just grabbed from the freelist is. if its anything other than 0, then we hit that XXX bug. (done that, never triggered) at the point when duppage unlocks(np), the following things are possible: lookpage() could succeed locking np, finding its ref to be 0 and incrementing its ref and unchaining it from the freelist. (all while holding palloc lock so it wont interfere with newpage() or auxpage()) if newpage(), auxpage() or duppage() got the lock first, the page will be uncached, or point to a differnt image/daddr (duppage). (with some luck, it might even point to the same image/daddr, then it just doesnt matter :)) or someone like newpage() or auxpage() succeeding locking it, incrementing its ref and removing it from the image cache. the cachepage() is done by duppage() while holding lock on np so theres no race. or duppage() on another process succeeds with locking np, removing it from the image cache, and filling it with contents of the to be duped page and putting it back in the image cache. all this looks safe to me, but i might be missing something of course... anyway, thats about all i know about the XXX comment :) back to the duppage bug i was talking about... the bug that i see is that when duppage() is called, p has to have a ref of 1. the whole duppage approach will not work if p is shared with other segments already. in the normal case of duppage() when we have p locked, and p->ref is 1, and we are calling uncachepage(p); before unlocking p we are safe. because even if someone like lookpage finds the page in the image cache, it has to lock it first. when it succeeds locking p, then lookpage() will find p->image != i because duppage called uncachepage(p) before unlocking it. so it wont get shared ever again. the problem we have is that we temporarily unlock p to acquire palloc lock, wich opens a chance for someone to take a ref on p, but duppage doesnt recheck after reqcquiering the p lock. if this happens, duppage() has to return and let fixfault() make a copy for its segment. -- cinap