caml-list - the Caml user's mailing list
* 32- and 64-bit performance
@ 2005-03-30  2:40 Jon Harrop
  2005-03-30  7:46 ` [Caml-list] " Alex Baretta
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Jon Harrop @ 2005-03-30  2:40 UTC (permalink / raw)
  To: caml-list


I just bought a new Athlon 64 laptop and installed 32- and 64-bit Debian.
Here are some timings, showing the performance change when moving from 32-
to 64-bit using ocamlopt (3.08.2) and g++ (3.4.4). "Ratio" is the 32-bit
time divided by the 64-bit time, so a ratio above 1 means 64-bit is faster:

Sieve primes up to 10^8 (bit-twiddling/array limited):

32-bit OCaml: 7.102s
64-bit OCaml: 5.697s
Ratio: 1.25

32-bit C++: 19.145s
64-bit C++: 13.433s
Ratio: 1.43

100th-nearest neighbours from a 10k-atom model of amorphous silicon
(de/allocation limited):

32-bit OCaml: 28.407s
64-bit OCaml: 35.538s
Ratio: 0.80

32-bit C++: 14.035s
64-bit C++: 12.392s
Ratio: 1.13

Generate, bubble sort and accumulate an array of 10^4 double-precision
random floating-point numbers:

32-bit OCaml: 1.185s
64-bit OCaml: 0.785s
Ratio: 1.51

32-bit C++: 1.471s
64-bit C++: 0.957s
Ratio: 1.54

without bounds checking:

32-bit OCaml: 0.992s
64-bit OCaml: 0.591s
Ratio: 1.68

32-bit C++: 1.249s
64-bit C++: 0.705s
Ratio: 1.77

2048^2 mandelbrot (float-arithmetic limited):

32-bit OCaml: 2.946s
64-bit OCaml: 1.704s
Ratio: 1.73

32-bit C++: 1.479s
64-bit C++: 1.161s
Ratio: 1.27

1024 FFTs and iFFTs (float-arithmetic limited):

32-bit OCaml: 31.491s
64-bit OCaml: 9.260s
Ratio: 3.40

32-bit C++: 8.441s
64-bit C++: 8.562s
Ratio: 0.99

Accumulate a Lorentzian over the integer triples (i, j, k) that
satisfy i^2 + j^2 + k^2 < 400 (float-arithmetic limited):

32-bit OCaml: 16.329s
64-bit OCaml: 9.459s
Ratio: 1.73

32-bit C++: 8.002s
64-bit C++: 5.933s
Ratio: 1.35

So ocamlopt does seem to generate significantly better code in 64-bit mode in 
these examples, particularly when they are floating-point intensive. Also, only 
one OCaml test (the nearest-neighbours one) is slower in 64-bit, due to its 
heavy use of trees.
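
As a small worked example of how the ratios above are derived, here is a
minimal OCaml sketch. It assumes the ratio is simply the 32-bit time divided
by the 64-bit time (which matches the posted figures), and the short
benchmark names are shorthand labels chosen for illustration, not anything
from the benchmark programs themselves.

```ocaml
(* Ratio = 32-bit time / 64-bit time, so a value above 1 means the
   64-bit build is faster. Timings are the OCaml figures quoted above. *)
let ratio ~t32 ~t64 = t32 /. t64

(* (illustrative benchmark label, 32-bit seconds, 64-bit seconds) *)
let ocaml_timings = [
  "sieve",       7.102,  5.697;
  "nth",        28.407, 35.538;
  "bubble",      1.185,  0.785;
  "mandelbrot",  2.946,  1.704;
  "fft",        31.491,  9.260;
  "lorentzian", 16.329,  9.459;
]

(* Labels of the benchmarks that got slower when moving to 64-bit. *)
let slower_in_64bit timings =
  List.filter (fun (_, t32, t64) -> ratio ~t32 ~t64 < 1.0) timings
  |> List.map (fun (name, _, _) -> name)

let () =
  List.iter
    (fun (name, t32, t64) ->
      Printf.printf "%-10s %.2f\n" name (ratio ~t32 ~t64))
    ocaml_timings
```

Applying slower_in_64bit to the OCaml timings leaves only the
nearest-neighbours benchmark.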

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists



* Re: [Caml-list] 32- and 64-bit performance
  2005-03-30  2:40 32- and 64-bit performance Jon Harrop
@ 2005-03-30  7:46 ` Alex Baretta
  2005-03-30  8:00   ` Ville-Pertti Keinonen
                     ` (2 more replies)
  2005-03-30 13:46 ` Eijiro Sumii
  2005-03-31 15:05 ` Stefan Monnier
  2 siblings, 3 replies; 17+ messages in thread
From: Alex Baretta @ 2005-03-30  7:46 UTC (permalink / raw)
  To: Jon Harrop; +Cc: Ocaml

Jon Harrop wrote:
> I just bought a new Athlon 64 laptop and installed 32- and 64-bit Debian.
> Here are some timings, showing the performance change when moving from 32-
> to 64-bit using ocamlopt (3.08.2) and g++ (3.4.4):
...
> 
> So ocamlopt does seem to generate significantly better code in these examples, 
> particularly when they are floating point intensive. Also, only one test is 
> slower in 64-bit, due to its heavy use of trees.
> 

Why do you suppose there is *any* benchmark faster in 32-bit mode than 
in 64-bit mode on the Athlon64? Since the AMD64 architecture is 
generally better than IA32, if only for the additional registers, I 
would expect all benchmarks to run as fast or faster when compiled to 
the AMD64 instruction set.

Alex

-- 
*********************************************************************
http://www.barettadeit.com/
Baretta DE&IT
A division of Baretta SRL

tel. +39 02 370 111 55
fax. +39 02 370 111 54

Our technology:

The Application System/Xcaml (AS/Xcaml)
<http://www.asxcaml.org/>

The FreerP Project
<http://www.freerp.org/>



* Re: [Caml-list] 32- and 64-bit performance
  2005-03-30  7:46 ` [Caml-list] " Alex Baretta
@ 2005-03-30  8:00   ` Ville-Pertti Keinonen
  2005-03-30  8:41     ` Alex Baretta
  2005-03-30  8:10   ` Robert Roessler
  2005-03-30  8:11   ` Alexander S. Usov
  2 siblings, 1 reply; 17+ messages in thread
From: Ville-Pertti Keinonen @ 2005-03-30  8:00 UTC (permalink / raw)
  To: Alex Baretta; +Cc: Jon Harrop, Ocaml

On Wed, 2005-03-30 at 09:46 +0200, Alex Baretta wrote:

> Why do you suppose there is *any* benchmark faster in 32-bit mode than 
> in 64-bit mode on the Athlon64? Since the AMD64 architecture is 
> generally better than IA32, if only for the additional registers, I 
> would expect all benchmarks to run as fast or faster when compiled to 
> the AMD64 instruction set.

64-bit data structures (due to bigger pointers, alignment, and in OCaml
bigger default integers) are bigger, so things that are constrained by
memory bandwidth are obviously going to be faster as 32-bit.

On other architectures where the 32-bit and 64-bit modes are otherwise
identical, 32-bit is generally faster.
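
To put rough numbers on the size difference, here is a minimal OCaml sketch.
It assumes the standard OCaml heap layout (one header word plus one word per
field) and uses hypothetical helper names; it estimates footprints from the
word size and does not measure a running program.

```ocaml
(* An OCaml heap block is one header word plus one word per field, so
   its size in bytes scales with the machine word size. *)
let block_bytes ~word_bits ~fields = (1 + fields) * (word_bits / 8)

(* A list of n elements is n cons cells with two fields each
   (head and tail); immediate ints need no extra storage. *)
let int_list_bytes ~word_bits n = n * block_bytes ~word_bits ~fields:2

let () =
  let n = 1_000_000 in
  Printf.printf "native word size here: %d bits\n" Sys.word_size;
  List.iter
    (fun word_bits ->
      Printf.printf "%d-bit: %d bytes for an int list of %d elements\n"
        word_bits (int_list_bytes ~word_bits n) n)
    [32; 64]
```

Under these assumptions the same list takes twice as many bytes on a 64-bit
system, so twice as much data has to move through the caches and memory bus.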




* Re: [Caml-list] 32- and 64-bit performance
  2005-03-30  7:46 ` [Caml-list] " Alex Baretta
  2005-03-30  8:00   ` Ville-Pertti Keinonen
@ 2005-03-30  8:10   ` Robert Roessler
  2005-03-30  8:11   ` Alexander S. Usov
  2 siblings, 0 replies; 17+ messages in thread
From: Robert Roessler @ 2005-03-30  8:10 UTC (permalink / raw)
  To: Ocaml

Alex Baretta wrote:

> Jon Harrop wrote:
> 
>> I just bought a new Athlon 64 laptop and installed 32- and 64-bit Debian.
>> Here are some timings, showing the performance change when moving from 
>> 32-
>> to 64-bit using ocamlopt (3.08.2) and g++ (3.4.4):
> 
> ...
> 
>>
>> So ocamlopt does seem to generate significantly better code in these 
>> examples, particularly when they are floating point intensive. Also, 
>> only one test is slower in 64-bit, due to its heavy use of trees.
>>
> 
> Why do you suppose there is *any* benchmark faster in 32-bit mode than 
> in 64-bit mode on the Athlon64? Since the AMD64 architecture is 
> generally better than IA32, if only for the additional registers, I 
> would expect all benchmarks to run as fast or faster when compiled to 
> the AMD64 instruction set.

I believe Jon was alluding to the issues that would be encountered on 
any structure that is pointer-heavy: every pointer is now 8 bytes 
long, so you just get *killed* on the [extra] memory accesses, and your 
L1/L2 caches hold "less".  This effect will be more pronounced for 
tasks that are mostly pointer-shuffling...

Robert Roessler
roessler@rftp.com
http://www.rftp.com



* Re: [Caml-list] 32- and 64-bit performance
  2005-03-30  7:46 ` [Caml-list] " Alex Baretta
  2005-03-30  8:00   ` Ville-Pertti Keinonen
  2005-03-30  8:10   ` Robert Roessler
@ 2005-03-30  8:11   ` Alexander S. Usov
  2 siblings, 0 replies; 17+ messages in thread
From: Alexander S. Usov @ 2005-03-30  8:11 UTC (permalink / raw)
  To: caml-list

On Wednesday 30 March 2005 09:46, Alex Baretta wrote:
> Jon Harrop wrote:
> > So ocamlopt does seem to generate significantly better code in these
> > examples, particularly when they are floating point intensive. Also, only
> > one test is slower in 64-bit, due to its heavy use of trees.
>
> Why do you suppose there is *any* benchmark faster in 32-bit mode than
> in 64-bit mode on the Athlon64? Since the AMD64 architecture is
> generally better than IA32, if only for the additional registers, I
> would expect all benchmarks to run as fast or faster when compiled to
> the AMD64 instruction set.

One of the simplest reasons: somewhat increased memory use.
And memory access is incredibly expensive nowadays.

-- 
Best regards,
  Alexander.



* Re: [Caml-list] 32- and 64-bit performance
  2005-03-30  8:00   ` Ville-Pertti Keinonen
@ 2005-03-30  8:41     ` Alex Baretta
  2005-03-30  9:01       ` Ville-Pertti Keinonen
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Baretta @ 2005-03-30  8:41 UTC (permalink / raw)
  To: Ville-Pertti Keinonen; +Cc: Jon Harrop, Ocaml

Ville-Pertti Keinonen wrote:
> On Wed, 2005-03-30 at 09:46 +0200, Alex Baretta wrote:
> 
> 64-bit data structures (due to bigger pointers, alignment, and in OCaml
> bigger default integers) are bigger, so things that are constrained by
> memory bandwidth are obviously going to be faster as 32-bit.
> 
> On other architectures where the 32-bit and 64-bit modes are otherwise
> identical, 32-bit is generally faster.

Ah, obviously! But this seems to imply that a 32-bit machine/compiler 
pair would generally be faster on symbolic processing algorithms, 
which generally require a good deal of memory allocation and deallocation. 
Since this is the kind of code which seems to be most idiomatic in 
OCaml, I wonder how well or how badly 64 bits will actually affect all 
our software.

Alex




* Re: [Caml-list] 32- and 64-bit performance
  2005-03-30  8:41     ` Alex Baretta
@ 2005-03-30  9:01       ` Ville-Pertti Keinonen
  2005-03-30 12:53         ` Jon Harrop
  0 siblings, 1 reply; 17+ messages in thread
From: Ville-Pertti Keinonen @ 2005-03-30  9:01 UTC (permalink / raw)
  To: Alex Baretta; +Cc: Jon Harrop, Ocaml

On Wed, 2005-03-30 at 10:41 +0200, Alex Baretta wrote:

> Ah, obviously! But this seems to imply that a 32-bit machine/compiler 
> pair would generally be faster on symbolic processing algorithms, 
> which generally require a good deal of memory allocation and deallocation. 
> Since this is the kind of code which seems to be most idiomatic in 
> OCaml, I wonder how well or how badly 64 bits will actually affect all 
> our software.

As long as the choice is between i386 and amd64, 64-bit is probably the
way to go; in Jon Harrop's benchmarks, i386 is seldom a win.  Back when
I got my first amd64 machine, I ran some benchmarks that were less
computationally intensive, in which the differences were generally
something like 10-20%.  Which was faster seemed fairly random.

Note that it isn't memory allocation and deallocation that is slower (on
amd64, memory allocation is probably faster, since the allocation
pointer is kept in a register), but programs that use fairly large
amounts of memory.  32-bit vs. 64-bit might be the difference between
everything fitting in L2 or not...




* Re: [Caml-list] 32- and 64-bit performance
  2005-03-30  9:01       ` Ville-Pertti Keinonen
@ 2005-03-30 12:53         ` Jon Harrop
  2005-03-30 14:34           ` Ville-Pertti Keinonen
  0 siblings, 1 reply; 17+ messages in thread
From: Jon Harrop @ 2005-03-30 12:53 UTC (permalink / raw)
  To: caml-list

On Wednesday 30 March 2005 10:01, Ville-Pertti Keinonen wrote:
> Back when I got my first amd64 machine...

=:-p

> Note that it isn't memory allocation and deallocation that is slower (on
> amd64, memory allocation is probably faster, since the allocation
> pointer is kept in a register), but programs that use fairly large
> amounts of memory.  32-bit vs. 64-bit might be the difference between
> everything fitting in L2 or not...

Yes. I was thinking that the GC would be slower due to worse cache use in 
64-bit. I used the phrase "de/allocation" as I was applying it to both C++ 
and OCaml.

This raises the question of exactly which OCaml types incur 64-bit quantities 
in the run-time. My guess:

  int (d'oh)
  constant (polymorphic) variant constructor?
  non-constant (polymorphic) variant constructor
  records (except those with all-float fields)
  tuples
  arrays

But isn't there quite a low limit on the number of constant variant type 
constructors allowed? So maybe they're squeezed into something a little 
smaller...




* Re: [Caml-list] 32- and 64-bit performance
  2005-03-30  2:40 32- and 64-bit performance Jon Harrop
  2005-03-30  7:46 ` [Caml-list] " Alex Baretta
@ 2005-03-30 13:46 ` Eijiro Sumii
  2005-03-31 13:42   ` Jon Harrop
  2005-03-31 15:05 ` Stefan Monnier
  2 siblings, 1 reply; 17+ messages in thread
From: Eijiro Sumii @ 2005-03-30 13:46 UTC (permalink / raw)
  To: caml-list; +Cc: sumii

From: "Jon Harrop" <jon@ffconsultancy.com>
> I just bought a new Athlon 64 laptop and installed 32- and 64-bit Debian.
> Here are some timings, showing the performance change when moving from 32-
> to 64-bit using ocamlopt (3.08.2) and g++ (3.4.4):

Would you mind publishing the source code of these benchmark programs?
I want to try them in my environments too.

	Eijiro



* Re: [Caml-list] 32- and 64-bit performance
  2005-03-30 12:53         ` Jon Harrop
@ 2005-03-30 14:34           ` Ville-Pertti Keinonen
  0 siblings, 0 replies; 17+ messages in thread
From: Ville-Pertti Keinonen @ 2005-03-30 14:34 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

On Wed, 2005-03-30 at 13:53 +0100, Jon Harrop wrote:

> This raises the question of exactly which OCaml types incur 64-bit quantities 
> in the run-time. My guess:

Basically everything except strings and Bigarrays.

>   records (except those with all-float fields)

They are 64-bit, but for records and arrays of floats, the
representation is essentially the same as on 32-bit systems (except that
the header word is bigger, of course), so you don't really lose much,
plus you avoid the loss in performance due to an average of half of all
floats being misaligned (which is pretty significant).

> But isn't there quite a low limit on the number of constant variant type 
> constructors allowed? So maybe they're squeezed into something a little 
> smaller...

I think the limitation is simply due to non-constant variant
constructors, which store the tag in the header word.  Constant
constructors are simply ints.
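
These representation claims can be checked from within OCaml using the Obj
module. The following is a minimal sketch: the types shape and vec and the
helper names are hypothetical, introduced only for inspection, and the
float-array figure assumes the default configuration in which float arrays
are stored unboxed.

```ocaml
(* Hypothetical types, used only to inspect run-time representations. *)
type shape = Point | Circle of float | Square of float
type vec = { x : float; y : float }

let word_bytes = Sys.word_size / 8

(* Constant constructors are immediate values, not heap blocks. *)
let is_immediate v = Obj.is_int (Obj.repr v)

(* Payload size in bytes of a heap block, excluding the header word. *)
let payload_bytes v = Obj.size (Obj.repr v) * word_bytes

let () =
  Printf.printf "Point is immediate: %b\n" (is_immediate Point);
  (* Non-constant constructors are blocks; the tag lives in the header. *)
  Printf.printf "Circle tag: %d, Square tag: %d\n"
    (Obj.tag (Obj.repr (Circle 1.))) (Obj.tag (Obj.repr (Square 2.)));
  (* Floats stay 8 bytes each regardless of the word size. *)
  Printf.printf "float array of 4: %d payload bytes\n"
    (payload_bytes [| 1.; 2.; 3.; 4. |]);
  Printf.printf "all-float record: %d payload bytes\n"
    (payload_bytes { x = 1.; y = 2. })
```

Obj is an unsafe, implementation-specific interface, so this is only suitable
for experiments like this one, not for production code.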




* Re: [Caml-list] 32- and 64-bit performance
  2005-03-30 13:46 ` Eijiro Sumii
@ 2005-03-31 13:42   ` Jon Harrop
  0 siblings, 0 replies; 17+ messages in thread
From: Jon Harrop @ 2005-03-31 13:42 UTC (permalink / raw)
  To: caml-list

On Wednesday 30 March 2005 14:46, Eijiro Sumii wrote:
> From: "Jon Harrop" <jon@ffconsultancy.com>
> > I just bought a new Athlon 64 laptop and installed 32- and 64-bit Debian.
> > Here are some timings, showing the performance change when moving from
> > 32- to 64-bit using ocamlopt (3.08.2) and g++ (3.4.4):
>
> Would you mind publishing the source code of these benchmark programs?
> I want to try them in my environments too.

Sure:


SIEVE

open Bigarray

let nsieve n =
  let a =  Array1.create int8_unsigned c_layout ((n lsr 3) + 2) in
  Array1.fill a 0xFF;
  let rec clear i j =
    if j <= n then (
      let ic = j lsr 3 in
      let bit = a.{ic} land lnot(1 lsl (j land 0x7)) in
      if a.{ic} <> bit then a.{ic} <- bit;
      clear i (j + i)
    ) in
  let count = ref 0 in
  for i = 2 to n do
    if a.{i lsr 3} land (1 lsl (i land 0x7)) > 0 then begin
      incr count;
      if i*i <= n then clear i (2*i)
    end
  done;
  !count

let () =
  let n =
    try int_of_string Sys.argv.(1)
    with _ -> Printf.printf "usage: %s <n>\n" Sys.argv.(0); exit 1 in
  Printf.printf "Primes up to %8i%8i\n" n (nsieve n)


#include <algorithm>  // std::fill
#include <cstdlib>    // atoi
#include <iostream>
#include <vector>

int sieve(int n) {
  std::vector<bool> t(n+1);

  int count = 0;

  std::fill(t.begin(), t.end(), true);

  for (int i=2; i<=n; ++i)
    if (t.at(i)) {
      ++count;
      for (int j=i*2; j<=n; j+=i)
        t.at(j) = false;
    }

  return count;
}

int main(int argc, char *argv[]) {
  if (argc != 2) {
    std::cerr << "Usage: sieve <n>\n";
    return 1;
  }
  int n = atoi(argv[1]);
  std::cout << "Primes up to " << n << ": " << sieve(n) << "\n";
  return 0;
}



NTH

This example is taken from my book. The OCaml source is available from the 
on-line "Complete Examples" chapter. The C++ is too long to post here. I've 
submitted it to the shootout but I'll put it up on the web ASAP. I've 
actually done some more detailed performance measurements on this as it is 
one of the few non-micro benchmarks. :-)



BUBBLE

open Array

let fold_left (f : float -> float -> float) (x : float) (a : float array) =
  let r = ref x in
  for i = 0 to length a - 1 do
    r := f !r (unsafe_get a i)
  done;
  !r

let _ = match Sys.argv with
    [| _; n |] ->
      let n = int_of_string n in
      let a = make n 0. in
      for i=0 to n-1 do
        a.(i) <- log (Random.float 1.)
      done;
      let sorted = ref false in
      while (not !sorted) do
        sorted := true;
        for i=0 to n-2 do
          if a.(i) > a.(i+1) then
            let t = a.(i) in
            a.(i) <- a.(i+1);
            a.(i+1) <- t;
            sorted := false
        done;
      done;
      Printf.printf "%f\n" (fold_left ( +. ) 0. a)
  | _ -> output_string stderr "Usage: ./sort <n>\n"


#include <cstdlib>
#include <cmath>
#include <iostream>
#include <vector>
#include <numeric>

typedef std::vector<double> C;

double frand() {
  return double(rand()) / RAND_MAX;
}

int main(int argc, char *argv[]) {
  if (argc != 2) {
    std::cerr << "Usage: ./sort <n>\n";
    return 1;
  }
  int n=atoi(argv[1]);
  C a(n);
  for (C::iterator it=a.begin(); it != a.end(); ++it)
    *it = log(frand());
  bool sorted=false;
  while (!sorted) {
    sorted = true;
    for (int i=0; i<n-1; ++i)
      if (a.at(i) > a.at(i+1)) {
        double t=a.at(i);
        a.at(i) = a.at(i+1);
        a.at(i+1) = t;
        sorted = false;
      }
  }
  double sum = std::accumulate(a.begin(), a.end(), 0.);
  std::cout << sum << std::endl;
  return 0;
}


MANDELBROT

Ruthlessly stolen from the shootout.



FFTs

Evilly plagiarised from Isaac Trotts.

  http://caml.inria.fr/pub/ml-archives/caml-list/2003/03/f47b42a88cf17b8de52cc70d409883d1.en.html



LORENTZIAN

let _ = match Sys.argv with
    [| _; r |] ->
      let r = int_of_string r in
      let accu = ref 0. in
      let radius x y z =
        let x = float x and y = float y and z = float z in
        x *. x +. y *. y +. z *. z in
      let r2 = float (r*r) in
      for x = -r to r do
        for y = -r to r do
          for z = -r to r do
            let r2' = radius x y z in
            if r2' < r2 then accu := !accu +. 1. /. (1. +. r2')
          done
        done
      done;
      Printf.printf "%f\n" !accu
  | _ -> output_string stderr "Usage: ./series <n>\n"


#include <iostream>
#include <cmath>
#include <cstdlib>   // atoi

double radius(int i, int j, int k) {
  double x(i), y(j), z(k);
  return x*x + y*y + z*z;
}

int main(int argc, char *argv[]) {
  if (argc != 2) {
    std::cerr << "Usage: ./series <n>\n";
    return 1;
  }
  int r=atoi(argv[1]);
  double r2(r*r);
  double accu=0.;
  for (int x=-r; x<=r; ++x)
    for (int y=-r; y<=r; ++y)
      for (int z=-r; z<=r; ++z) {
        double r2p = radius(x, y, z);
        if (r2p < r2) accu += 1. / (1. + r2p);
      }
  std::cout << accu << std::endl;
  return 0;
}




* Re: 32- and 64-bit performance
  2005-03-30  2:40 32- and 64-bit performance Jon Harrop
  2005-03-30  7:46 ` [Caml-list] " Alex Baretta
  2005-03-30 13:46 ` Eijiro Sumii
@ 2005-03-31 15:05 ` Stefan Monnier
  2005-03-31 18:40   ` [Caml-list] " Jon Harrop
  2 siblings, 1 reply; 17+ messages in thread
From: Stefan Monnier @ 2005-03-31 15:05 UTC (permalink / raw)
  To: caml-list

> I just bought a new Athlon 64 laptop and installed 32- and 64-bit Debian.
> Here are some timings, showing the performance change when moving from 32-
> to 64-bit using ocamlopt (3.08.2) and g++ (3.4.4):

Is there a GNU/Linux (and OCaml) mode supporting the half-way case where you
use the new amd64 registers but still stay with 32bit pointers?
That should fix the one case where the amd64 mode is slower because of the
extra memory use.


        Stefan



* Re: [Caml-list] Re: 32- and 64-bit performance
  2005-03-31 15:05 ` Stefan Monnier
@ 2005-03-31 18:40   ` Jon Harrop
  2005-03-31 22:41     ` Richard Jones
  2005-04-02 20:23     ` Stefan Monnier
  0 siblings, 2 replies; 17+ messages in thread
From: Jon Harrop @ 2005-03-31 18:40 UTC (permalink / raw)
  To: caml-list

On Thursday 31 March 2005 16:05, Stefan Monnier wrote:
> > I just bought a new Athlon 64 laptop and installed 32- and 64-bit Debian.
> > Here are some timings, showing the performance change when moving from
> > 32- to 64-bit using ocamlopt (3.08.2) and g++ (3.4.4):
>
> Is there a GNU/Linux (and OCaml) mode supporting the half-way case where
> you use the new amd64 registers but still stay with 32bit pointers?
> That should fix the one case where the amd64 mode is slower because of the
> extra memory use.

I do not believe there is such a mode, no. I'm not entirely sure how this 
could work but I'm quite happy to take a performance hit in a few programs. 
Particularly because arrays are so much more useful now. :-)




* Re: [Caml-list] Re: 32- and 64-bit performance
  2005-03-31 18:40   ` [Caml-list] " Jon Harrop
@ 2005-03-31 22:41     ` Richard Jones
  2005-04-02 20:23     ` Stefan Monnier
  1 sibling, 0 replies; 17+ messages in thread
From: Richard Jones @ 2005-03-31 22:41 UTC (permalink / raw)
  To: Jon Harrop; +Cc: caml-list

On Thu, Mar 31, 2005 at 07:40:18PM +0100, Jon Harrop wrote:
> Particularly because arrays are so much more useful now. :-)

Don't forget Strings!
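
The remarks about arrays and strings presumably refer to OCaml's length
limits, which depend on the word size: on a 32-bit system ordinary arrays are
limited to about four million elements and strings to about 16MB, while on a
64-bit system the limits are far beyond any practical memory size. A minimal
sketch to see the limits on a given machine (fits_in_array is a hypothetical
helper added for illustration):

```ocaml
(* Both limits come from the number of header bits that hold the block
   size, so they grow with the machine word size. *)
let fits_in_array n = n <= Sys.max_array_length

let () =
  Printf.printf "word size:         %d bits\n" Sys.word_size;
  Printf.printf "max array length:  %d elements\n" Sys.max_array_length;
  Printf.printf "max string length: %d bytes\n" Sys.max_string_length;
  Printf.printf "10^8-element array possible: %b\n" (fits_in_array 100_000_000)
```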

Rich.

-- 
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com



* Re: 32- and 64-bit performance
  2005-03-31 18:40   ` [Caml-list] " Jon Harrop
  2005-03-31 22:41     ` Richard Jones
@ 2005-04-02 20:23     ` Stefan Monnier
  2005-04-02 20:50       ` [Caml-list] " David Brown
  1 sibling, 1 reply; 17+ messages in thread
From: Stefan Monnier @ 2005-04-02 20:23 UTC (permalink / raw)
  To: caml-list

>> Is there a GNU/Linux (and OCaml) mode supporting the half-way case where
>> you use the new amd64 registers but still stay with 32bit pointers?
>> That should fix the one case where the amd64 mode is slower because of the
>> extra memory use.
> I do not believe there is such a mode, no.  I'm not entirely sure how this 
> could work

Under Alpha they called it "taso".  Mostly used for compatibility with apps
that were assuming the size of ptrs is the same as the size of int: it just
made the C library use only the bottom 4GB of the address space such that
the compiler could use only 32bit to store pointer values.
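
The idea can be modelled in a few lines of OCaml. This is a conceptual
sketch only: addresses are plain int64 values rather than real pointers, and
the function names are hypothetical. It shows that any address in the bottom
4GB survives being stored in a 32-bit slot and zero-extended again, while
anything above does not fit.

```ocaml
(* Conceptual model only: addresses are int64 values, not real pointers. *)

(* An address can be stored in 32 bits when its upper 32 bits are zero,
   i.e. it lies in the bottom 4GB of the address space. *)
let fits_in_32_bits (addr : int64) =
  Int64.shift_right_logical addr 32 = 0L

(* Keep only the low 32 bits of an address that fits. *)
let compress (addr : int64) : int32 option =
  if fits_in_32_bits addr then Some (Int64.to_int32 addr) else None

(* Zero-extend a stored 32-bit value back to a full 64-bit address. *)
let expand (p : int32) : int64 =
  Int64.logand (Int64.of_int32 p) 0xFFFF_FFFFL

let () =
  let show addr =
    match compress addr with
    | Some p -> Printf.printf "0x%Lx -> 32-bit slot -> 0x%Lx\n" addr (expand p)
    | None -> Printf.printf "0x%Lx does not fit in 32 bits\n" addr
  in
  show 0xDEAD_BEEFL;
  show 0x1_0000_0000L
```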


        Stefan



* Re: [Caml-list] Re: 32- and 64-bit performance
  2005-04-02 20:23     ` Stefan Monnier
@ 2005-04-02 20:50       ` David Brown
  2005-04-03 10:01         ` Ville-Pertti Keinonen
  0 siblings, 1 reply; 17+ messages in thread
From: David Brown @ 2005-04-02 20:50 UTC (permalink / raw)
  To: Stefan Monnier, caml-list

On Sat, 02 Apr 2005 15:23:21 -0500, Stefan Monnier  
<monnier@iro.umontreal.ca> wrote:

> Under Alpha they called it "taso".  Mostly used for compatibility with  
> apps
> that were assuming the size of ptrs is the same as the size of int: it  
> just
> made the C library use only the bottom 4GB of the address space such that
> the compiler could use only 32bit to store pointer values.

The default mode in gcc for amd64 is almost this (-mcmodel=small).  It  
assumes pointers live in the lower 2GB of address space, but 'sizeof (void  
*)' is still 8.  I'm not sure why the small model doesn't use 32-bit  
pointers.

The -mcmodel=medium model does 64-bit operations on pointers, but the code  
must still live in the lower 2GB of address space.

The -mcmodel=large isn't yet implemented.

Dave



* Re: [Caml-list] Re: 32- and 64-bit performance
  2005-04-02 20:50       ` [Caml-list] " David Brown
@ 2005-04-03 10:01         ` Ville-Pertti Keinonen
  0 siblings, 0 replies; 17+ messages in thread
From: Ville-Pertti Keinonen @ 2005-04-03 10:01 UTC (permalink / raw)
  To: David Brown; +Cc: Stefan Monnier, caml-list

David Brown wrote:

> On Sat, 02 Apr 2005 15:23:21 -0500, Stefan Monnier  
> <monnier@iro.umontreal.ca> wrote:
>
>> Under Alpha they called it "taso".  Mostly used for compatibility 
>> with  apps
>> that were assuming the size of ptrs is the same as the size of int: 
>> it  just
>> made the C library use only the bottom 4GB of the address space such 
>> that
>> the compiler could use only 32bit to store pointer values.
>
>
> The default mode in gcc for amd64 is almost this (-mcmodel=small).  
> It  assumes pointers live in the lower 2GB of address space, but 
> 'sizeof (void  *)' is still 8.  I'm not sure why the small model 
> doesn't use 32-bit  pointers.

I think you've misunderstood the gcc documentation: -mcmodel=small is 
not really comparable to the -xtaso compiler option on the Alpha at all.

-mcmodel=small handles pointers as fully 64-bit, with the exception of 
loading constant pointers to symbols in the text and data segments.  
Programs compiled using this mode are entirely capable of addressing the 
entire address space.  Memory allocations, shared libraries, memory 
mapped files and stack can and do live outside the 2GB range and 
pointers to them work just fine.

The reason for this mode is to avoid having to store and load full 
64-bit constant pointers in the binary.  It really only affects 
relocations for particular types of address loads.  You still get full 
64-bit relocations e.g. if you initialize a global pointer to a constant 
symbol.


