From: Damien Doligez
Date: Wed, 19 Sep 2001 18:40:39 +0200 (MET DST)
Message-Id: <200109191640.SAA0000032430@beaune.inria.fr>
To: caml-list@inria.fr, maggesi@math.unifi.it
Subject: Re: [Caml-list] mark_slice, sweep_slice, oldify

>From: Marco Maggesi
>With the aid of gprof I noticed that my program spends most of the time
>of the computation in three functions `mark_slice', `sweep_slice' and
>`oldify' (see below the output of gprof obtained after the computation
>of some products of polynomials).
>What does this exactly mean? This is the generational garbage
>collector, right? May I low the cost of GC?

As Xavier said, "oldify" is the minor collector and "mark_slice" and
"sweep_slice" are the major collector. Spending 17% of your time in
"oldify" is not shockingly bad. Spending 45% in the major collector
is a lot higher than normal.

>I do not pretend a precise answer. Simply tell me what I have to look
>or study to understand which operation stress the GC more and which not.

It is not really operations that stress the GC, but the behavior of
your programs with respect to data structures, so it is always a bit
hard to understand what is going on exactly.

The default GC parameters are a compromise between speed and memory
use, and there is no way to make them optimal for all programs. If
you want more speed (in exchange for more memory "wasted"), there are
mainly two parameters that you can change: minor_heap_size and
space_overhead.

Increasing minor_heap_size will reduce the time spent by both the
minor GC and the major GC. The default is 32768 words, and depending
on your machine and the behaviour of your program, it can be
reasonable to increase it to 250000, or even to a few million. It is
often (but not always) better to keep it small enough to fit in the
cache of your machine.

Increasing space_overhead will reduce the time spent in the major GC.
The default is 42 percent. If your program uses 100 megabytes of
data structures, then the GC will (on average) use an additional
42 megabytes of memory (you can think of it as a buffer for
allocations; the larger the buffer, the less work the GC has to do).
It can be reasonable to increase space_overhead to 100 or 200, again
depending on the amount of memory in your machine and the behaviour
of your program.

It may also be interesting to increase major_heap_increment if the
program uses a lot of memory (which seems to be true in your case).
By increasing it, you reduce the number of times that add_to_heap is
called. The default value is 62k words.

To have a better idea of what's going on, you need to use the Gc
module of the standard library. You should print the GC statistics
at the end of your program. The most important figure is the ratio
of promoted_words to minor_words. This should be as small as
possible. If it is more than 10%, you'll spend too much time in the
GC (both minor and major). Increasing minor_heap_size often helps in
this case.

Please send me mail if the above hints are not enough to solve your
problem.

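
As a rough sketch of the hints above, assuming the standard library's
Gc module: the program below raises minor_heap_size and space_overhead
with Gc.set, runs some work, and then prints the ratio of
promoted_words to minor_words. The helper promotion_ratio and the
function workload are hypothetical names introduced for illustration,
and the parameter values are only examples, not tuned recommendations.
Default values also differ between OCaml releases, so check them with
Gc.get on your own installation.

```ocaml
(* Sketch: trade memory for speed, then report how much of what is
   allocated in the minor heap gets promoted to the major heap.
   The values below are illustrative assumptions, not tuned settings. *)

(* Hypothetical helper: ratio of promoted words to minor words. *)
let promotion_ratio (s : Gc.stat) : float =
  if s.Gc.minor_words > 0.0
  then s.Gc.promoted_words /. s.Gc.minor_words
  else 0.0

(* Hypothetical workload standing in for the real computation:
   builds a long list so that some data survives minor collections. *)
let workload () =
  let rec build n acc = if n = 0 then acc else build (n - 1) (n :: acc) in
  List.length (build 1_000_000 [])

let () =
  (* Change only the two parameters; keep every other setting as is. *)
  Gc.set { (Gc.get ()) with
           Gc.minor_heap_size = 262_144;  (* in words *)
           Gc.space_overhead = 200 };     (* in percent *)
  let n = workload () in
  (* quick_stat returns the allocation counters without walking the heap. *)
  let s = Gc.quick_stat () in
  Printf.printf "result:         %d\n" n;
  Printf.printf "minor_words:    %.0f\n" s.Gc.minor_words;
  Printf.printf "promoted_words: %.0f\n" s.Gc.promoted_words;
  Printf.printf "promoted/minor: %.1f%%\n" (100.0 *. promotion_ratio s)
```

If the printed ratio stays above roughly 10%, a larger minor_heap_size
is the first thing to try. The same two parameters can also be set
without recompiling through the OCAMLRUNPARAM environment variable
(options s and o).
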
-- Damien