caml-list - the Caml user's mailing list
* More re GC hanging
@ 2010-08-15  5:57 Paul Steckler
  2010-08-15  7:03 ` [Caml-list] " Basile Starynkevitch
  2010-08-18 12:22 ` Goswin von Brederlow
  0 siblings, 2 replies; 19+ messages in thread
From: Paul Steckler @ 2010-08-15  5:57 UTC (permalink / raw)
  To: caml-list

I haven't yet come up with a solution to the GC hanging problem I
mentioned the other day.

But here's something that looks funny.  I changed the default minor
heap size, the major heap increment, and the allocation policy.  I also
threw in a `Gc.major_slice 0' in the code.  After turning on the Gc
verbose option, I see:

 New heap increment size: 1000k bytes
 New allocation policy: 1
 New minor heap size: 500k bytes
 <>Starting new major GC cycle
 allocated_words = 9404
 extra_heap_resources = 49000u
 amount of work to do = 249956u
 ordered work = 0 words
 computed work = 44081 words
 Marking 44081 words
 Subphase = 10
 !<>Sweeping 9223372036854775807 words
 Starting new major GC cycle
 Marking 9223372036854775807 words
 Subphase = 10
 Sweeping 9223372036854775807 words

Those are some big mark and sweep numbers at the end!

I'm using the x64 version of Fedora 12.  Maybe the 64-bit garbage
collector has some integer overflow problems?

I wasn't seeing those huge numbers in other experiments where the GC
hangs, so maybe that's not the underlying problem.
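
For readers who want to try something similar, here is a minimal sketch of
settings along these lines, using only the standard Gc module. The word
counts are assumptions, picked so that a 64-bit system would report
"500k bytes" and "1000k bytes"; the verbose mask 0x63 is likewise a guess
at which message classes produced the log above, and tune_gc is a
hypothetical name.

```ocaml
(* Sketch only: the numeric values are assumptions, not the settings
   actually used in the program discussed in this thread. *)
let tune_gc () =
  Gc.set
    { (Gc.get ()) with
      Gc.minor_heap_size = 64_000;       (* words; 500k bytes at 8 bytes/word *)
      Gc.major_heap_increment = 128_000; (* words; 1000k bytes at 8 bytes/word *)
      Gc.allocation_policy = 1;          (* first-fit *)
      Gc.verbose = 0x63 }                (* message classes 0x01, 0x02, 0x20, 0x40 *)

let () =
  tune_gc ();
  (* Passing 0 lets the GC compute the slice size itself. *)
  ignore (Gc.major_slice 0)
```

The same parameters can usually be set without recompiling through the
OCAMLRUNPARAM environment variable, which is convenient when bisecting
which setting matters.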

-- Paul



* Re: [Caml-list] More re GC hanging
  2010-08-15  5:57 More re GC hanging Paul Steckler
@ 2010-08-15  7:03 ` Basile Starynkevitch
  2010-08-15  8:34   ` Paul Steckler
                     ` (2 more replies)
  2010-08-18 12:22 ` Goswin von Brederlow
  1 sibling, 3 replies; 19+ messages in thread
From: Basile Starynkevitch @ 2010-08-15  7:03 UTC (permalink / raw)
  To: Paul Steckler; +Cc: caml-list

On Sun, 2010-08-15 at 15:57 +1000, Paul Steckler wrote:
> I haven't yet come up with a solution to the GC hanging problem I
> mentioned the other day.
> 
> But here's something that looks funny. [..]

> After turning on the Gc verbose option, I see:

[...]
>  !<>Sweeping 9223372036854775807 words
>  Starting new major GC cycle
>  Marking 9223372036854775807 words
>  Subphase = 10
>  Sweeping 9223372036854775807 words
> 
> Those are some big mark and sweep numbers at the end!

I guess this is related to the fact that recent Linux kernels have
turned on the virtual address space randomization feature, which is
designed to improve system security. You could disable it with
  echo 0 > /proc/sys/kernel/randomize_va_space
but first learn more about it.

I believe recent OCaml versions (did you try 3.12?) have GC improvements
for that.

Cheers.
-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***



* Re: [Caml-list] More re GC hanging
  2010-08-15  7:03 ` [Caml-list] " Basile Starynkevitch
@ 2010-08-15  8:34   ` Paul Steckler
  2010-08-15  8:40   ` Paul Steckler
  2010-08-20 15:21   ` Richard Jones
  2 siblings, 0 replies; 19+ messages in thread
From: Paul Steckler @ 2010-08-15  8:34 UTC (permalink / raw)
  To: basile; +Cc: caml-list

On Sun, Aug 15, 2010 at 5:03 PM, Basile Starynkevitch
<basile@starynkevitch.net> wrote:
> I guess this is related to the fact that recent Linux kernel have turned
> on the randomize virtual address space feature -designed to improve
> system security. You could disable it by
>  echo 0 > /proc/sys/kernel/randomize_va_space
> but first learn more about it.

Can't do that, even as root, permission denied.

> I believe recent Ocaml versions (did you try 3.12?) have GC improvements
> for that.

I installed 3.12, same hanging issue.

-- Paul



* Re: [Caml-list] More re GC hanging
  2010-08-15  7:03 ` [Caml-list] " Basile Starynkevitch
  2010-08-15  8:34   ` Paul Steckler
@ 2010-08-15  8:40   ` Paul Steckler
  2010-08-15  9:16     ` Basile Starynkevitch
  2010-08-15 10:19     ` rixed
  2010-08-20 15:21   ` Richard Jones
  2 siblings, 2 replies; 19+ messages in thread
From: Paul Steckler @ 2010-08-15  8:40 UTC (permalink / raw)
  To: basile; +Cc: caml-list

> I guess this is related to the fact that recent Linux kernel have turned
> on the randomize virtual address space feature -designed to improve
> system security. You could disable it by
>  echo 0 > /proc/sys/kernel/randomize_va_space
> but first learn more about it.

For some reason, I was able to edit that file using emacs, even when
echo wouldn't work.

But the hanging problem persists.

-- Paul



* Re: [Caml-list] More re GC hanging
  2010-08-15  8:40   ` Paul Steckler
@ 2010-08-15  9:16     ` Basile Starynkevitch
       [not found]       ` <AANLkTi=-8ZcxSy08fsh8wkTvscGLYzBsqN07OmOi5o6d@mail.gmail.com>
  2010-08-15 10:45       ` [Caml-list] " Adrien
  2010-08-15 10:19     ` rixed
  1 sibling, 2 replies; 19+ messages in thread
From: Basile Starynkevitch @ 2010-08-15  9:16 UTC (permalink / raw)
  To: Paul Steckler; +Cc: caml-list

On Sun, 2010-08-15 at 18:40 +1000, Paul Steckler wrote:
> > I guess this is related to the fact that recent Linux kernel have turned
> > on the randomize virtual address space feature -designed to improve
> > system security. You could disable it by
> >  echo 0 > /proc/sys/kernel/randomize_va_space
> > but first learn more about it.
> 
> For some reason, I was able to edit that file using emacs, even when
> echo wouldn't work.

To check that it did work as expected (which I doubt) do
   cat /proc/sys/kernel/randomize_va_space
it should give 0
> 
> But the hanging problem persists.

Are you sure that you don't have badly coded C routines that you call
from your OCaml code? Don't forget the correct use of CAMLparam and
CAMLlocal; carefully reread
http://caml.inria.fr/pub/docs/manual-ocaml/manual032.html and perhaps
other material about precise garbage collectors.

Are you sure you don't have a memory leak in your OCaml code? This could
happen when a reference still points to a "big" value you don't need
anymore.
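
To illustrate the kind of leak meant here, a minimal sketch follows. The
names cache, load and release are hypothetical and the 16 MB buffer is an
arbitrary size; the point is only that a long-lived reference keeps its
contents reachable until it is overwritten.

```ocaml
(* Hypothetical example: a long-lived reference keeps a large buffer
   reachable, so the GC can never reclaim it. *)
let cache : Bytes.t option ref = ref None

(* Store a 16 MB buffer in the global reference. *)
let load () = cache := Some (Bytes.create (16 * 1024 * 1024))

(* Clearing the reference makes the buffer unreachable again. *)
let release () = cache := None

let () =
  load ();
  (* ... work that needs the buffer ... *)
  release ();
  Gc.full_major ()   (* optional: ask for the buffer to be reclaimed now *)
```

Comparing Gc.stat () before and after release is one way to check whether
a suspected reference is really what keeps the memory alive.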

Cheers. 
-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***



* Fwd: [Caml-list] More re GC hanging
       [not found]       ` <AANLkTi=-8ZcxSy08fsh8wkTvscGLYzBsqN07OmOi5o6d@mail.gmail.com>
@ 2010-08-15  9:55         ` Paul Steckler
  2010-08-15 10:41           ` Fwd: " Sylvain Le Gall
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Steckler @ 2010-08-15  9:55 UTC (permalink / raw)
  To: caml-list

On Sun, Aug 15, 2010 at 7:16 PM, Basile Starynkevitch
<basile@starynkevitch.net> wrote:
> To check that it did work as expected (which I doubt) do
>   cat /proc/sys/kernel/randomize_va_space
> it should give 0

It did work as expected.

> Are you sure that you don't have badly coded C routines that you call
> from your Ocaml code (don't forget correct use of CAMLparam & CAMLlocal,
> read again carefully
> http://caml.inria.fr/pub/docs/manual-ocaml/manual032.html and perhaps
> other material about precise garbage collectors).

I'm not calling any C code directly.  I am using the ocaml-ssl library,
which has some simple calls into the OpenSSL library.

> Are you sure you don't have a memory leak in your Ocaml code? This could
> happen when a reference value refers to a "big" value you don't need
> anymore.

Possible.  I'll try printing out some memory statistics to see how much memory
is being consumed.
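
A minimal sketch of such a report, assuming nothing beyond the standard Gc
module; report is a hypothetical helper name and the array allocation only
stands in for real work.

```ocaml
(* Print a few counters from the standard Gc module. Gc.quick_stat is
   cheap because it does not walk the heap; Gc.stat would also fill in
   live_words and fragments, at the cost of a heap traversal. *)
let report label =
  let s = Gc.quick_stat () in
  Printf.printf
    "%s: heap=%d words, top_heap=%d words, minor=%d, major=%d, compactions=%d\n%!"
    label s.Gc.heap_words s.Gc.top_heap_words
    s.Gc.minor_collections s.Gc.major_collections s.Gc.compactions

let () =
  report "before";
  (* Stand-in workload: allocate 100_000 small strings. *)
  let data = Array.init 100_000 (fun i -> string_of_int i) in
  report "after allocating";
  ignore (Array.length data)
```

Calling report at regular points in the main loop shows whether the heap
keeps growing or stays flat while the program appears to hang.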

-- Paul



* Re: [Caml-list] More re GC hanging
  2010-08-15  8:40   ` Paul Steckler
  2010-08-15  9:16     ` Basile Starynkevitch
@ 2010-08-15 10:19     ` rixed
  1 sibling, 0 replies; 19+ messages in thread
From: rixed @ 2010-08-15 10:19 UTC (permalink / raw)
  Cc: caml-list


> For some reason, I was able to edit that file using emacs, even when
> echo wouldn't work.

Maybe you wrote "sudo echo 0 > file" or something similar, which performs the echo as root but the redirection as a normal user?


* Re: Fwd: More re GC hanging
  2010-08-15  9:55         ` Fwd: " Paul Steckler
@ 2010-08-15 10:41           ` Sylvain Le Gall
  2010-08-15 11:32             ` [Caml-list] " Vincent Hanquez
  0 siblings, 1 reply; 19+ messages in thread
From: Sylvain Le Gall @ 2010-08-15 10:41 UTC (permalink / raw)
  To: caml-list

On 15-08-2010, Paul Steckler <steck@stecksoft.com> wrote:
> On Sun, Aug 15, 2010 at 7:16 PM, Basile Starynkevitch
><basile@starynkevitch.net> wrote:
>
>> Are you sure that you don't have badly coded C routines that you call
>> from your Ocaml code (don't forget correct use of CAMLparam & CAMLlocal,
>> read again carefully
>> http://caml.inria.fr/pub/docs/manual-ocaml/manual032.html and perhaps
>> other material about precise garbage collectors).
>
> I'm not calling any C code directly.  I am using the ocaml-ssl
> library, which has
> some simple calls into the OpenSSL library.
>

Maybe this is unrelated, but your mention of ocaml-ssl and of your
application hanging reminds me of:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=591891

ocaml-ssl and ocaml-dbus are involved, so maybe the guilty party is
ocaml-ssl -- this is just a guess, I'm not sure about anything.

Regards,
Sylvain Le Gall



* Re: [Caml-list] More re GC hanging
  2010-08-15  9:16     ` Basile Starynkevitch
       [not found]       ` <AANLkTi=-8ZcxSy08fsh8wkTvscGLYzBsqN07OmOi5o6d@mail.gmail.com>
@ 2010-08-15 10:45       ` Adrien
  2010-08-15 11:17         ` David Allsopp
                           ` (2 more replies)
  1 sibling, 3 replies; 19+ messages in thread
From: Adrien @ 2010-08-15 10:45 UTC (permalink / raw)
  To: basile; +Cc: Paul Steckler, caml-list

Hi,

I recently had similar output from the GC (huge count of words) which
I noticed after my program started to exit with an out-of-memory
error. It doesn't seem to be happening anymore but I'm not sure I
"fixed" it. There are three things I thought of to get rid of it.
(btw, I'm on 64bit linux)

First, remove all non-tail-rec functions: no more List.map, @ or
List.concat. All lists were pretty short (definitely fewer than 1000
elements) but maybe the number of calls generated garbage or something
like that: I couldn't get much information about the problem so I tried
what I could think of, and being tail-rec couldn't be a bad thing anyway.
The idea was to create fewer values since I first suspected a memory
fragmentation issue (iirc I had thousands of fragments), so I also
memoized some functions.

Then, as Basile mentioned, call something like Gc.compact ()
regularly. The overhead is actually pretty small, especially if run
regularly.

Finally, C bindings. I created a few while not having access to the
internet and they are quite dirty. I highly doubt they play perfectly
well with the garbage collector: they seem ok but probably aren't
perfect. That's definitely something to check, even if the bindings
were written by someone else because working nicely with the GC can be
quite hard.
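
The first two workarounds above can be sketched as follows. This is only an
illustration under assumptions: map_tr, append_tr and run_with_compaction
are hypothetical names, and the compaction interval is arbitrary.

```ocaml
(* Tail-recursive stand-ins for List.map and (@): constant stack
   usage at the cost of one extra list traversal. *)
let map_tr f xs = List.rev (List.rev_map f xs)
let append_tr xs ys = List.rev_append (List.rev xs) ys

(* Hypothetical helper: run [step] on 1..n and compact the heap
   every [every] iterations. *)
let run_with_compaction ~every ~step n =
  for i = 1 to n do
    step i;
    if i mod every = 0 then Gc.compact ()
  done

let () =
  let doubled = map_tr (fun x -> 2 * x) [1; 2; 3] in
  let all = append_tr doubled [8; 10] in
  run_with_compaction ~every:2 ~step:(fun _ -> ignore (List.length all)) 4
```

Both list helpers build an intermediate reversed list, so they trade a
little extra allocation for not growing the stack.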

Now, the problem seems to be gone but I can't say for sure. One really
annoying thing was that adding a line like 'print_endline "pouet";'
would make the out-of-memory problem go away! Same when getting stats
from the GC.


As for the problem with randomize_va_space on 64bit, I thought it had
been fixed in 3.11 so I haven't looked at it (and in the absence of
internet access, I was unable to get details for that problem). I just
tried about a hundred runs with VA-space randomization on and without
Gc.compact calls, and they ran without problems. Fortunately everything
is tracked in git so I can get the older, non-working code if needed.

Also, on my computer, I have the following behaviour:
    11:44 ~ % sudo echo 0 > /proc/sys/kernel/randomize_va_space
    zsh: permission denied: /proc/sys/kernel/randomize_va_space
    root@jarjar:~# echo 0 > /proc/sys/kernel/randomize_va_space
    root@jarjar:~#
I can't use sudo to write to most files in /proc or /sys: I have to
log in as root ('su -' does the job just fine).


Hope this helps.

 ---

Adrien Nader



* RE: [Caml-list] More re GC hanging
  2010-08-15 10:45       ` [Caml-list] " Adrien
@ 2010-08-15 11:17         ` David Allsopp
  2010-08-15 12:12         ` Basile Starynkevitch
  2010-09-01 15:24         ` Damien Doligez
  2 siblings, 0 replies; 19+ messages in thread
From: David Allsopp @ 2010-08-15 11:17 UTC (permalink / raw)
  To: caml-list

Adrien wrote:
> Hi,

<snip>

> Also, on my computer, I have the following behaviour:
>     11:44 ~ % sudo echo 0 > /proc/sys/kernel/randomize_va_space
>     zsh: permission denied: /proc/sys/kernel/randomize_va_space
>     root@jarjar:~# echo 0 > /proc/sys/kernel/randomize_va_space
>     root@jarjar:~#
> I can't use sudo to write to most files in /proc or /sys: I have to log in
> as root ('su -' does the job just fine).

The redirection (> /proc/sys...) is not part of the sudo invocation, it's a separate instruction to the *shell* to redirect output of the previous part of the command to a file and so it runs with *your* uid. There are two ways to achieve what you're after - one verbose one described in the sudo man page (essentially you pass the whole command line to sudo quoted) or the easier way:

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

You can add > /dev/null if tee's outputting of the 0 to stdout is for some reason annoying.


David



* Re: [Caml-list] Re: Fwd: More re GC hanging
  2010-08-15 10:41           ` Fwd: " Sylvain Le Gall
@ 2010-08-15 11:32             ` Vincent Hanquez
  0 siblings, 0 replies; 19+ messages in thread
From: Vincent Hanquez @ 2010-08-15 11:32 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list

On Sun, Aug 15, 2010 at 10:41:44AM +0000, Sylvain Le Gall wrote:
> Maybe it has nothing todo, but you talked about ocaml-ssl possibly and
> your application hanging, it reminds me:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=591891
> 
> ocaml-ssl and ocaml-dbus are involved, so maybe the guilty party is
> ocaml-ssl -- this is just a guess, not sure about anything.

As I said to the bug reporter, ocaml-dbus is not thread friendly at all: it
never drops the OCaml lock through caml_enter_blocking_section /
caml_leave_blocking_section, so if any call blocks, it blocks the whole
application using it.

This thread doesn't seem to involve ocaml-dbus, though, so there could
be bugs in the ocaml-ssl bindings too.

-- 
Vincent



* Re: [Caml-list] More re GC hanging
  2010-08-15 10:45       ` [Caml-list] " Adrien
  2010-08-15 11:17         ` David Allsopp
@ 2010-08-15 12:12         ` Basile Starynkevitch
  2010-09-01 15:24         ` Damien Doligez
  2 siblings, 0 replies; 19+ messages in thread
From: Basile Starynkevitch @ 2010-08-15 12:12 UTC (permalink / raw)
  To: Adrien; +Cc: Paul Steckler, caml-list

On Sun, 2010-08-15 at 12:45 +0200, Adrien wrote:

> 
> Finally, C bindings. I created a few while not having access to the
> internet and they are quite dirty. I highly doubt they play perfectly
> well with the garbage collector: they seem ok but probably aren't
> perfect. That's definitely something to check, even if the bindings
> were written by someone else because working nicely with the GC can be
> quite hard.

Then I suggest you post here a representative (rather complex) example
of your C binding glue code.

> 
> Also, on my computer, I have the following behaviour:
>     11:44 ~ % sudo echo 0 > /proc/sys/kernel/randomize_va_space
>     zsh: permission denied: /proc/sys/kernel/randomize_va_space

This is normal. The sudo applies to the echo command, not to the
redirection. You want to redirect as root, not as the user, so 

sudo sh -c 'echo 0 > /proc/sys/kernel/randomize_va_space'

should work.

Cheers.
-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***



* Re: [Caml-list] More re GC hanging
  2010-08-15  5:57 More re GC hanging Paul Steckler
  2010-08-15  7:03 ` [Caml-list] " Basile Starynkevitch
@ 2010-08-18 12:22 ` Goswin von Brederlow
  1 sibling, 0 replies; 19+ messages in thread
From: Goswin von Brederlow @ 2010-08-18 12:22 UTC (permalink / raw)
  To: Paul Steckler; +Cc: caml-list

Paul Steckler <steck@stecksoft.com> writes:

> I haven't yet come up with a solution to the GC hanging problem I
> mentioned the other day.
>
> But here's something that looks funny.  I changed the default minor
> heap size, the major
> heap increment, the allocation policy.  I also threw in a
> `Gc.major_slice 0' in the code.
> After turning on the Gc verbose option, I see:
>
>  New heap increment size: 1000k bytes
>  New allocation policy: 1
>  New minor heap size: 500k bytes
>  <>Starting new major GC cycle
>  allocated_words = 9404
>  extra_heap_resources = 49000u
>  amount of work to do = 249956u
>  ordered work = 0 words
>  computed work = 44081 words
>  Marking 44081 words
>  Subphase = 10
>  !<>Sweeping 9223372036854775807 words
>  Starting new major GC cycle
>  Marking 9223372036854775807 words
>  Subphase = 10
>  Sweeping 9223372036854775807 words
>
> Those are some big mark and sweep numbers at the end!
>
> I'm using the x64 version of Fedora 12.  Maybe the 64-bit garbage
> collector has some integer
> overflow problems?
>
> I wasn't seeing those huge numbers in other experiments where the Gc
> hangs, so maybe that's
> not the underlying problem.
>
> -- Paul

I wondered about that as well. I think this is some uninitialized value
in the GC statistics.
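
One observation that may help narrow this down: the figure in the log is
exactly 2^63 - 1, the largest signed 64-bit integer. The short check below
confirms that with the standard Int64 module. That it matches this limit
exactly hints at a deliberate "no limit" value rather than random garbage,
though that reading is an assumption and nothing in this thread confirms
it.

```ocaml
(* The count printed in the log, copied as a string. *)
let logged = "9223372036854775807"

(* True when the logged count is exactly Int64.max_int, i.e. 2^63 - 1. *)
let is_int64_max = Int64.to_string Int64.max_int = logged

let () =
  Printf.printf "logged count = Int64.max_int: %b\n" is_int64_max
```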

MfG
        Goswin



* Re: [Caml-list] More re GC hanging
  2010-08-15  7:03 ` [Caml-list] " Basile Starynkevitch
  2010-08-15  8:34   ` Paul Steckler
  2010-08-15  8:40   ` Paul Steckler
@ 2010-08-20 15:21   ` Richard Jones
  2010-08-20 23:30     ` Eray Ozkural
  2 siblings, 1 reply; 19+ messages in thread
From: Richard Jones @ 2010-08-20 15:21 UTC (permalink / raw)
  To: Basile Starynkevitch; +Cc: Paul Steckler, caml-list


On Sun, Aug 15, 2010 at 09:03:41AM +0200, Basile Starynkevitch wrote:
> On Sun, 2010-08-15 at 15:57 +1000, Paul Steckler wrote:
> > I haven't yet come up with a solution to the GC hanging problem I
> > mentioned the other day.
> > 
> > But here's something that looks funny. [..]
> 
> > After turning on the Gc verbose option, I see:
> 
> [...]
> >  !<>Sweeping 9223372036854775807 words
> >  Starting new major GC cycle
> >  Marking 9223372036854775807 words
> >  Subphase = 10
> >  Sweeping 9223372036854775807 words
> > 
> > Those are some big mark and sweep numbers at the end!
> 
> I guess this is related to the fact that recent Linux kernel have turned
> on the randomize virtual address space feature -designed to improve
> system security. You could disable it by 
>   echo 0 > /proc/sys/kernel/randomize_va_space
> but first learn more about it.
> 
> I believe recent Ocaml versions (did you try 3.12?) have GC improvements
> for that.

Would that be:
https://bugzilla.redhat.com/show_bug.cgi?id=445545
(fixed in OCaml 3.11)?

Rich.

-- 
Richard Jones
Red Hat



* Re: [Caml-list] More re GC hanging
  2010-08-20 15:21   ` Richard Jones
@ 2010-08-20 23:30     ` Eray Ozkural
  2010-08-22 13:02       ` Adrien
  0 siblings, 1 reply; 19+ messages in thread
From: Eray Ozkural @ 2010-08-20 23:30 UTC (permalink / raw)
  To: Richard Jones; +Cc: Basile Starynkevitch, Paul Steckler, caml-list


On Fri, Aug 20, 2010 at 6:21 PM, Richard Jones <rich@annexia.org> wrote:

>
> On Sun, Aug 15, 2010 at 09:03:41AM +0200, Basile Starynkevitch wrote:
> > On Sun, 2010-08-15 at 15:57 +1000, Paul Steckler wrote:
> > > I haven't yet come up with a solution to the GC hanging problem I
> > > mentioned the other day.
> > >
>
> > I believe recent Ocaml versions (did you try 3.12?) have GC improvements
> > for that.
>
> Would that be:
> https://bugzilla.redhat.com/show_bug.cgi?id=445545
> (fixed in OCaml 3.11)?
>
> Rich.
>


On many systems old versions of OCaml are shipped. In Debian stable there is
a pretty old version (3.10.x?), which is quite frustrating IMHO. At any rate,
bug reporters must always indicate which OCaml version they're using.

Cheers,


-- 
Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy


* Re: [Caml-list] More re GC hanging
  2010-08-20 23:30     ` Eray Ozkural
@ 2010-08-22 13:02       ` Adrien
  2010-08-25 15:17         ` Goswin von Brederlow
  0 siblings, 1 reply; 19+ messages in thread
From: Adrien @ 2010-08-22 13:02 UTC (permalink / raw)
  To: Eray Ozkural
  Cc: Richard Jones, Basile Starynkevitch, Paul Steckler, caml-list

David and Basile, you are absolutely right about the redirection
issue. It's also pretty obvious, actually. I guess I need to pay more
attention.

Back to the original problem, I thought I had somehow gotten rid of it
but it still happens on someone else's computer. Calling 'Gc.compact'
regularly seems to work around the problem but calling 'Array.make'
might actually have the same effect: it might not fix the problem,
only prevent it from being triggered. I'll try to reproduce it this
week.

Also, I found out that I had a pretty ugly error in my C bindings but
it looks like it had no bad impact.
Basically, I had 'external ml_f : *string* -> string array' but the C
side read 'value ml_f()': the C function took *no* argument while
OCaml was passing one (I wasn't actually using the argument). Has
anything been developed to guard against that? Anything to warn about
errors in bindings?

Finally, I don't think it has to do with the bug on 64bit systems with
ASLR, at least not directly: I'm using ocaml 3.11.2 and tried with
ASLR disabled. But I need to make a reproducer: the very high word
count did not always show up (although the out-of-memory error always
did).

 ---

Adrien Nader



* Re: [Caml-list] More re GC hanging
  2010-08-22 13:02       ` Adrien
@ 2010-08-25 15:17         ` Goswin von Brederlow
  0 siblings, 0 replies; 19+ messages in thread
From: Goswin von Brederlow @ 2010-08-25 15:17 UTC (permalink / raw)
  To: Adrien
  Cc: Eray Ozkural, caml-list, Basile Starynkevitch, Paul Steckler,
	Richard Jones

Adrien <camaradetux@gmail.com> writes:

> Also, I found out that I had a pretty ugly error in my C bindings but
> it looks like it had no bad impact.
> Basically, I had 'external ml_f : *string* -> string array' but the C
> side read 'value ml_f()': the C function took *no* argument while
> ocaml was passing one (I wasn't actually using the argument). Has
> anything been developped against that? Anything to warn about errors
> in bindings?

That actually causes no problem on any architecture, afaik. The parameter
will be placed in a register or on the stack and never accessed. The GC
might move the value around, making the register/stack value invalid, but
you never access it anyway.

I don't think there is anything that will verify that the external
declaration and C side have the same number of arguments.

MfG
        Goswin



* Re: [Caml-list] More re GC hanging
  2010-08-15 10:45       ` [Caml-list] " Adrien
  2010-08-15 11:17         ` David Allsopp
  2010-08-15 12:12         ` Basile Starynkevitch
@ 2010-09-01 15:24         ` Damien Doligez
  2010-09-01 16:44           ` Adrien
  2 siblings, 1 reply; 19+ messages in thread
From: Damien Doligez @ 2010-09-01 15:24 UTC (permalink / raw)
  To: caml-list caml-list


On 2010-08-15, at 12:45, Adrien wrote:

> First, remove all non-tail-rec functions: no more List.map, @ or
> List.concat. All lists were pretty short (definitely less than 1000
> elements) but maybe the amount of calls generated garbage or something
> like that: I couldn't get much infos about the problem so I tried what
> I could think of and being tail-rec couldn't be a bad thing anyway.
> The idea was to create less values since I first suspected a memory
> fragmentation issue (iirc I had thousands of fragments), so I also
> memoized some functions.

That has nothing to do with the GC getting huge counts.  Also, if you
have fragmentation problems, you can try changing the allocation
policy with this statement:

   Gc.set {(Gc.get ()) with Gc.allocation_policy = 1};;

I'm still waiting for feedback on that alternate allocation policy :-)

> Then, as Basile mentionned, call something like Gc.compact ()
> regularly. The overhead is actually pretty small, especially if ran
> regularly.

That's good for tracking down problems, but I wouldn't recommend it
for production code.

> Finally, C bindings. I created a few while not having access to the
> internet and they are quite dirty. I highly doubt they play perfectly
> well with the garbage collector: they seem ok but probably aren't
> perfect. That's definitely something to check, even if the bindings
> were written by someone else because working nicely with the GC can be
> quite hard.
> 
> Now, the problem seems to be gone but I can't say for sure. One really
> annoying thing was that adding a line like 'print_endline "pouet";'
> would make the out-of-memory problem go away! Same when getting stats
> from the GC.


That almost certainly indicates a problem with your C bindings: some
pointer gets garbled and the GC may or may not stumble upon it.

-- Damien



* Re: [Caml-list] More re GC hanging
  2010-09-01 15:24         ` Damien Doligez
@ 2010-09-01 16:44           ` Adrien
  0 siblings, 0 replies; 19+ messages in thread
From: Adrien @ 2010-09-01 16:44 UTC (permalink / raw)
  To: Damien Doligez; +Cc: caml-list caml-list

On 01/09/2010, Damien Doligez <damien.doligez@inria.fr> wrote:
>
> On 2010-08-15, at 12:45, Adrien wrote:
>
>> First, remove all non-tail-rec functions: no more List.map, @ or
>> List.concat. All lists were pretty short (definitely less than 1000
>> elements) but maybe the amount of calls generated garbage or something
>> like that: I couldn't get much infos about the problem so I tried what
>> I could think of and being tail-rec couldn't be a bad thing anyway.
>> The idea was to create less values since I first suspected a memory
>> fragmentation issue (iirc I had thousands of fragments), so I also
>> memoized some functions.
>
> That has nothing to do with the GC getting huge counts.

I know but I first had crashes which didn't show the huge counts and
did what I had planned to do for some time.
Also, I was actually generating lots of garbage (well, maybe not 10^20 ;-)).

> Also, if you
> have fragmentation problems, you can try changing the allocation
> policy with this statement:
>
>    Gc.set {(Gc.get ()) with Gc.allocation_policy = 1};;
>
> I'm still waiting for feedback on that alternate allocation policy :-)

I had tried that, it didn't change anything.

>> Then, as Basile mentionned, call something like Gc.compact ()
>> regularly. The overhead is actually pretty small, especially if ran
>> regularly.
>
> That's good for tracking down problems, but I wouldn't recommend it
> for production code.
>
>> Finally, C bindings. I created a few while not having access to the
>> internet and they are quite dirty. I highly doubt they play perfectly
>> well with the garbage collector: they seem ok but probably aren't
>> perfect. That's definitely something to check, even if the bindings
>> were written by someone else because working nicely with the GC can be
>> quite hard.
>>
>> Now, the problem seems to be gone but I can't say for sure. One really
>> annoying thing was that adding a line like 'print_endline "pouet";'
>> would make the out-of-memory problem go away! Same when getting stats
>> from the GC.
>
>
> That almost certainly indicates a problem with your C bindings: some
> pointer gets garbled and the GC may or may not stumble upon it.

That's also what I think: calling Gc.compact () doesn't solve the
problem, it only changes the planet alignment and the phase of the
moon.

Sorry for the late reaction, I was pretty short on time during the
past ten days but it's going to be better now. :-)
I took a quick look at the C stubs and noticed a few variables of type
'value' were not introduced with CAMLlocalX(), in particular in the
creation of a list.
I don't know if that's enough to fix the problem since it wasn't
happening anymore on my computer and I'm now waiting for someone to be
able to test.

-- 

Adrien Nader

