caml-list - the Caml user's mailing list
From: Brian Hurt <bhurt@janestcapital.com>
To: John Caml <camljohn42@gmail.com>
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] large hash tables
Date: Fri, 22 Feb 2008 09:19:34 -0500
Message-ID: <47BED9F6.4050700@janestcapital.com>
In-Reply-To: <33d2b3f70802211445q7781d296ka7dd94114b8033b1@mail.gmail.com>

John Caml wrote:

>The equivalent C++ program uses 874 MB of memory in total. Each of the
>1 million records is stored in a vector using 1 single-precision float
>and 1 int. Indeed, my machine is AMD64 so Ocaml int's are presumably 8
>bytes.
>  
>
C ints on AMD64 are still 4 bytes; longs are 8 bytes (on Linux and
other LP64 systems).  You can verify this by compiling a quick program:
#include <stdio.h>
int main(void) {
    printf("Ints are %lu bytes long.\n", (unsigned long) sizeof(int));
    return 0;
}
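
For the OCaml side of the comparison, a similar check can be sketched
as below. It assumes a native OCaml int occupies one machine word, with
one bit reserved as a tag, so on a 64-bit system an int takes 8 bytes
and carries 63 bits of value. The name bytes_per_int is only
illustrative.

```ocaml
(* Sys.word_size is the machine word size in bits (64 on AMD64). *)
let bytes_per_int = Sys.word_size / 8

let () =
  (* One bit of each word is a tag and one is the sign,
     so max_int is 2^(word_size - 2) - 1. *)
  Printf.printf "An OCaml int occupies %d bytes; max_int = %d\n"
    bytes_per_int max_int
```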

>I've rewritten my Ocaml program again, this time using Bigarray. Its
>memory usage is now the same as under C++, so that's good news.
>However, my program is quite ugly now, and it's actually more than
>twice as long as my C++ program. Any suggestions for simplifying this
>program? The way I initialize the "movieMajor" Array seems especially
>wonky, but I couldn't figure out a better way.
>
>  
>
It's generally a good idea to step back and think about what problem 
you're trying to solve.

Where OCaml generally wins on memory utilization is in using immutable 
data structures and sharing data instead of copying it.  Many of the 
decisions OCaml made about how to represent values suddenly make a lot 
of sense once you think in terms of data sharing.  And in lots of 
complicated "real" code, the memory saved by sharing is huge compared 
to the memory lost to that representation overhead.
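
A minimal sketch of what sharing means here, using a plain immutable
list (the names base, extended and copy are only illustrative):
prepending an element allocates a single new cell and reuses the
existing list as its tail, whereas an explicit copy allocates a second
list.

```ocaml
(* An immutable list: nothing can modify it after construction. *)
let base = [2; 3; 4]

(* Prepending allocates one new cell whose tail points at [base]. *)
let extended = 1 :: base

let () =
  (* (==) is physical equality: true only for the same heap object. *)
  assert (List.tl extended == base);
  (* An explicit copy is structurally equal but a distinct object. *)
  let copy = List.map (fun x -> x) base in
  assert (copy = base);
  assert (copy != base);
  print_endline "tail is shared, copy is not"
```

Because the list cannot be mutated, handing out the same tail to many
owners is safe, which is why no defensive copy is needed.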

Brian

