From: Yaron Minsky
Reply-To: yminsky@cs.cornell.edu
To: Caml Mailing List
Subject: Re: Odd behavior with GC and Mutex.lock (resolved)
Date: Fri, 2 Sep 2005 17:40:08 -0400
Message-ID: <891bd33905090214401b046fcb@mail.gmail.com>
In-Reply-To: <891bd339050831052665d380b2@mail.gmail.com>
References: <891bd339050831052665d380b2@mail.gmail.com>

For anyone who is interested, we've resolved this issue. We were
mistaken to think this was a GC issue at all. The long delays simply
resulted from some odd decisions by the Linux scheduler. We resolved
them by changing SCHED_RR scheduling for the threads in question.

Yaron

On 8/31/05, Yaron Minsky wrote:
> I've been running into some very odd behavior with the GC and
> Mutex.lock. We have an application where a bunch of threads are all
> pushing updates through a synchronized queue, so all of the threads are
> contending for a global lock. Under some rare circumstances, we've seen
> very odd GC delays coming up, and we're at a bit of a loss to explain
> them.
>
> The basic situation is that when the system comes under unusually high
> GC load, we will sometimes see one of the threads take a long time (on
> the order of half a second) when it tries to acquire the lock. We
> instrumented the acquisition of the lock as follows:
>
>     let pre_stat = Gc.quick_stat () in
>     Mutex.lock mutex;
>     let post_stat = Gc.quick_stat () in
>
> The thread that gets delays shows hundreds or thousands of minor
> collections and 5-10 major collections between the calls to
> Gc.quick_stat (). What's also weird is that it's always the same thread
> that gets blocked, and other threads are not blocked during this period.
> They seem to be able to acquire and release the lock without problems.
> And the thread that does get delayed isn't even the one causing the
> unusual load (although the delayed thread does do a big chunk of
> allocation that is immediately unreachable right before it gets blocked).
>
> I can't come up with a reasonable explanation for how this might happen,
> and I'm hoping someone with a deeper understanding of the GC than I
> might be able to help.
>
> Yaron
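
For reference, the instrumentation quoted above can be wrapped in a small helper that reports how many collections completed while some operation ran. This is only a sketch: the names `gc_delta` and `with_gc_delta` and the allocation workload are illustrative and do not come from the original application. It uses only `Gc.quick_stat` from the standard library.

```ocaml
(* Sketch: count the collections that complete while a thunk runs.
   [gc_delta] and [with_gc_delta] are hypothetical names used for
   illustration only. *)

type gc_delta = {
  minor : int;  (* minor collections completed during the call *)
  major : int;  (* major collection cycles completed during the call *)
}

let with_gc_delta (f : unit -> 'a) : 'a * gc_delta =
  let pre = Gc.quick_stat () in
  let result = f () in
  let post = Gc.quick_stat () in
  ( result,
    { minor = post.Gc.minor_collections - pre.Gc.minor_collections;
      major = post.Gc.major_collections - pre.Gc.major_collections } )

let () =
  (* Stand-in workload (an assumption, not the original code): allocate
     many short-lived lists so that some minor collections occur. *)
  let work () =
    for _i = 1 to 100_000 do
      ignore (Sys.opaque_identity (List.init 100 (fun i -> i)))
    done
  in
  let (), d = with_gc_delta work in
  Printf.printf "minor=%d major=%d\n" d.minor d.major
```

In the setting described above, the thunk would be `fun () -> Mutex.lock mutex`. As far as I know, the counters returned by `Gc.quick_stat` are kept for the runtime as a whole rather than per thread, so a thread that is simply waiting on the lock will also see the collections triggered by other threads in the meantime. That would be consistent with the delay coming from the scheduler rather than the GC. Recording wall-clock time around the call as well (for example with `Unix.gettimeofday`) may help tell the two apart.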