* [Caml-list] c is 4 times faster than ocaml?
@ 2004-08-04 2:39 effbiae
2004-08-04 4:59 ` John Prevost
0 siblings, 1 reply; 11+ messages in thread
From: effbiae @ 2004-08-04 2:39 UTC
To: caml-list
hello,
my first post to the list. not intended to be inflammatory or to generate
ill feeling in any way. :)
i am evaluating languages for implementing a fast dbms. i would like to
use a 'higher level' language without resorting to portable assembler.
ocaml looks really nice, and it drew my attention in doug's language
shootout:
http://www.bagley.org/~doug/shootout/craps.shtml
and i have noticed that it is used to win programming contests -- indeed
the language for a discriminating hacker!
it was with great hope that i started on my first benchmark -- testing
what all fast dbmses use: mmap. after a bit of searching, i found that
Bigarray was the way to go (short of writing my own C extension). the
benchmark sources in c and ocaml are appended, along with the Makefile.
in summary, on my Mandrake 10 PIII 500 system, i get these timings:
$ time -p ./cbs 26 (* the C version *)
real 1.06
user 0.54
sys 0.51
$ time -p ./ocbs 26 (* the O'Caml version *)
real 2.95
user 2.39
sys 0.51
the real time can vary a bit due to different states of cache, but user
and sys remain fairly constant. the real time is not significant for my
purposes because the dbms will not be IO bound for most of its queries.
so there you have it! i would really like to be able to optimise the
ocaml benchmark to be within 10% of C. i have read a post by John Prevost
"mmap for O'Caml" in which he implies he wrote mmap primitives but not
using the O'Caml-C interface. what does he mean? i assume Bigarray is
written in the fastest possible way -- or is there a faster way?
also note that i'll need msync, so i will need to extend O'Caml in some
way regardless (unless there's some library out there for mmap that i
haven't discovered).
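
[Editor's sketch] The mmap-plus-msync pattern mentioned above can be shown in a few lines of plain C. This is a minimal sketch, not the benchmark itself: the function name fill_and_sync and its signature are hypothetical, and it assumes a POSIX system where the file can be created at the given path.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Map n bytes of `path` read/write, fill them, count the odd bytes, and
   flush the mapping with msync. Returns the odd count, or -1 on error.
   The name and signature are hypothetical, for illustration only. */
long fill_and_sync(const char *path, size_t n)
{
    if (n == 0) return -1;                 /* mmap rejects a zero length */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1) return -1;

    /* Grow the file to n bytes by writing one zero byte at offset n-1,
       so that every mapped page is backed by the file. */
    if (lseek(fd, (off_t)(n - 1), SEEK_SET) == (off_t)-1 ||
        write(fd, "", 1) != 1) {
        close(fd);
        return -1;
    }

    unsigned char *ar = mmap(NULL, n, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ar == MAP_FAILED) { close(fd); return -1; }

    for (size_t i = 0; i < n; ++i) ar[i] = (unsigned char)i;

    long odds = 0;
    for (size_t i = 0; i < n; ++i)
        if (ar[i] & 1) ++odds;

    /* MS_SYNC blocks until the dirty pages have been written back. */
    int rc = msync(ar, n, MS_SYNC);
    munmap(ar, n);
    close(fd);
    return rc == 0 ? odds : -1;
}
```

Calling fill_and_sync on a scratch file with n = 4096 should report that half of the bytes are odd, since the stored values cycle through 0..255.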
any help greatly appreciated,
jack.
$ cat Makefile
oc:
	ocamlopt -unsafe -inline 2 bigarray.cmxa unix.cmxa -o ocbs bs.ml
c:
	gcc -O3 -o cbs bs.c
$ cat bs.ml
let f x y z = x + y + z;;
let g x = function y -> function z -> f x y z;;
let h x = let k=1 in function y -> f x y k;;
let mapit = let k=(-1) in function ty -> function fd ->
Bigarray.Array1.map_file fd ty Bigarray.c_layout true k;;
let maprwbs=mapit Bigarray.int8_unsigned;;
if Array.length Sys.argv = 2 then begin
let p=int_of_string Sys.argv.(1)
and fn=Sys.argv.(0) ^ ".bs" in
let fd=Unix.openfile fn [Unix.O_RDWR;Unix.O_CREAT;Unix.O_TRUNC] 0o640
and n=1 lsl p in
let _=Unix.lseek fd (n-1) Unix.SEEK_SET
and _=Unix.write fd "\000" 0 1
and _=assert (Unix.lseek fd 0 Unix.SEEK_END == n)
and ar=mapit Bigarray.int8_unsigned fd in
let _=for i=0 to n-1 do ar.{i} <- i done
and odds=ref 0 in for i=0 to n-1 do
if ar.{i} land 1 = 1 then odds:=!odds+1 done
end else begin
print_endline "Usage: bs <power-of-2>" end;;
$ cat bs.c
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define CHKact(x,act) do \
if(!(x)){fprintf(stderr,"!CHK (%s:%d)\n",__FILE__,__LINE__);act;} while(0)
#define CHK(x) CHKact(x,return -1)
#define CHKp(x) CHKact(x,perror(0);return -1)
main(int argc,char**argv)
{if(argc==2)
{char fn[1024];CHK(sprintf(fn,"%s.bs",argv[0]));int p=atoi(argv[1]);
int fd;CHKp(-1!=(fd=open(fn,O_RDWR|O_CREAT|O_TRUNC,S_IRUSR|S_IWUSR)));
int n=1<<p;lseek(fd,n-1,SEEK_SET);int zero=0;CHKp(write(fd,&zero,1)==1);
CHKp(lseek(fd,0,SEEK_END)==n);unsigned char*ar;
CHKp(-1!=(int)(ar=mmap(0,n,PROT_READ|PROT_WRITE,MAP_FILE|MAP_SHARED,fd,0)));
int i;for(i=0;i<n;++i)ar[i]=i;
int odds=0;for(i=0;i<n;++i)if(ar[i]&1)odds++;
CHKp(!munmap(ar,n));
}else fprintf(stderr, "Usage: %s <power-of-2>\n",argv[0]);
}
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
* Re: [Caml-list] c is 4 times faster than ocaml?
2004-08-04 2:39 [Caml-list] c is 4 times faster than ocaml? effbiae
@ 2004-08-04 4:59 ` John Prevost
2004-08-04 5:05 ` John Prevost
2004-08-04 5:24 ` effbiae
0 siblings, 2 replies; 11+ messages in thread
From: John Prevost @ 2004-08-04 4:59 UTC
To: effbiae; +Cc: caml-list
Just from a first look, I'd say that there are two likely reasons that
this artificial (and incredibly hard to read) benchmark program
performs poorly:
First, even with -unsafe, bounds checking is performed on Bigarray types.
Second, working with single byte values is likely painful here--O'Caml
always works with word-aligned values, so it's going to lose bigtime.
gcc, on the other hand, knows that the crazy intel instruction set can
handle non-word-aligned values. Here's the main setting loop from
gcc:
.L84:
movb %al, (%eax,%ecx)
incl %eax
cmpl %edx, %eax
jl .L84
And here it is from O'Caml:
.L109:
movl %esi, %ecx ;; copy the tagged loop index into %ecx
sarl $1, %ecx ;; shift off the tag bit
movl 20(%eax), %edx ;; get the array's length into %edx
cmpl %ecx, %edx ;; compare the two
jbe .L111 ;; if the index is too high, punt
movl 4(%eax), %edi ;; load the array's data pointer into %edi
movl %esi, %edx ;; load the loop value into %edx
sarl $1, %edx ;; shift off the tag bit
movb %dl, (%edi, %ecx) ;; store %edx's low byte at data + index
movl %esi, %ecx ;; keep the current (tagged) index for the loop test
addl $2, %esi ;; add 1 to the index (2 in tagged form)
cmpl %ebx, %ecx ;; compare to target
jne .L109 ;; not equal? loop
That jbe .L111 is what happens if a bounds check fails, by the way!
Anyway, you can see that the bounds check takes a bunch of
instructions. The main loop is also a bit more expensive. One thing
going on is those "sarl" instructions, which are shifting out the tag
bit on the right end of O'Caml integers. If you were working on
integers instead, I think it might be less painful. Especially if you
could use int32s held in registers to index into things.
Anyway, the main two things slowing stuff down here are the bounds
check and the fact that O'Caml needs to do so much work turning caml
integers into c integers.
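
[Editor's sketch] The two costs named above, untagging and the bounds check, can be imitated in plain C. This is only an illustration of the work per element: tag_int and untag_int are hypothetical stand-ins for the runtime's conversion macros, and the loop is a hand-written imitation of the assembly above, not compiler output.

```c
#include <stddef.h>

/* OCaml stores an int n in a machine word as 2n+1; the low bit is the tag.
   These helper names are hypothetical stand-ins for the runtime's macros. */
long tag_int(long n)   { return (long)(((unsigned long)n << 1) | 1UL); }
/* Relies on arithmetic right shift of signed values (what gcc does);
   this is the "sarl $1" in the listing. */
long untag_int(long v) { return v >> 1; }

/* Per-element work of the OCaml loop: untag the index, compare it with the
   array length, untag the value, store its low byte. Assumes n >= 0.
   Returns 0 on success, -1 if an index falls outside the array. */
int store_loop_tagged(unsigned char *data, size_t len, long n)
{
    /* The counter stays tagged, so "i + 1" is "tagged + 2". */
    for (long ti = tag_int(0); ti != tag_int(n); ti += 2) {
        size_t idx = (size_t)untag_int(ti);          /* first sarl $1 */
        if (idx >= len) return -1;                   /* cmpl / jbe */
        data[idx] = (unsigned char)untag_int(ti);    /* second sarl $1, movb */
    }
    return 0;
}

/* The C version: one store, one increment, one compare per element. */
void store_loop_raw(unsigned char *data, long n)
{
    for (long i = 0; i < n; ++i) data[i] = (unsigned char)i;
}
```

Both loops leave the same bytes in the array; the difference is only in how many instructions each element costs.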
(Just as a note, I accidentally tweaked your file to make the loops
not know the type of their arguments at one point while looking for
this loop--you *always* want exact types known at a deep level for
this kind of thing, as that made ocamlopt use C calls to access the
array.)
Oh--and ignore my old ramblings on mmap stuff. That code was bad
then, and is worse now. :)
As for your project, I suspect we could provide better suggestions on
how to optimize if we were looking at real code. My suspicion is that
you might want to write one or two low-level routines in C, rather
than using Bigarrays for this task. (Just assuming, though--from the
sound of it you're going to have larger structured data in the mmap'd
areas.)
Good luck!
* Re: [Caml-list] c is 4 times faster than ocaml?
2004-08-04 4:59 ` John Prevost
@ 2004-08-04 5:05 ` John Prevost
2004-08-04 5:24 ` effbiae
1 sibling, 0 replies; 11+ messages in thread
From: John Prevost @ 2004-08-04 5:05 UTC
To: caml-list; +Cc: caml-list
Oh, one last parting thought:
Part of why gcc is winning here is that it's actively working with the
fact that the loop variable, the index, and the value to be inserted
are the same value. O'Caml is doing extra work because it's not
linking them up (it could at the very least avoid shifting a few
registers around and avoid an extra sarl instruction if it did spot
that.)
But this is the trouble with artificial benchmarks: no real code is
simply going to be copying the loop value into the array. It's going
to be fetching the value from somewhere else, probably by doing
pointer arithmetic on the loop value and the source address, then it
will do pointer arithmetic on the loop value and the destination
address. Then it will set the result. A smart C coder will do the
arithmetic ahead of time, which means incrementing two values instead
of one each time through the loop, but wins overall. Anyway, the
short is: artificial benchmarks are bad.
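
[Editor's sketch] The "do the arithmetic ahead of time" remark can be illustrated with two versions of a byte copy. The function names are hypothetical, and a modern optimizing compiler will often turn the first form into the second by itself, so this shows the idea rather than a guaranteed speedup.

```c
#include <stddef.h>

/* Indexed form: each iteration conceptually computes src + i and dst + i. */
void copy_indexed(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; ++i) dst[i] = src[i];
}

/* Hoisted form: the addresses are computed once, then two pointers are
   incremented each time round instead of one index. */
void copy_pointers(unsigned char *dst, const unsigned char *src, size_t n)
{
    const unsigned char *end = src + n;
    while (src != end) *dst++ = *src++;
}
```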
* Re: [Caml-list] c is 4 times faster than ocaml?
2004-08-04 4:59 ` John Prevost
2004-08-04 5:05 ` John Prevost
@ 2004-08-04 5:24 ` effbiae
2004-08-04 7:28 ` John Prevost
1 sibling, 1 reply; 11+ messages in thread
From: effbiae @ 2004-08-04 5:24 UTC
To: John Prevost; +Cc: caml-list
oooh - a gmail account :)
> this artificial (and incredibly hard to read) benchmark program
was the C hard to read or the O'Caml? Any style tips for my caml?
> First, even with -unsafe, bounds checking is performed on BigArray
> types.
if i write a c extension that mmaps and msyncs then will the vector
element assignment become a call rather than a movb (or movl)? that is,
is Bigarray a 'special' c extension that ocaml knows how to optimize and
access just like C or is it a c extension that i can model my C extension
code on?
> If you were working on integers instead, I think it might be less
> painful. Especially if you could use int32s held in registers
> to index into things.
can i specify that an int32 is held in a register or does the compiler do
this?
* Re: [Caml-list] c is 4 times faster than ocaml?
2004-08-04 5:24 ` effbiae
@ 2004-08-04 7:28 ` John Prevost
2004-08-04 8:18 ` [Caml-list] " Jack Andrews
0 siblings, 1 reply; 11+ messages in thread
From: John Prevost @ 2004-08-04 7:28 UTC
To: caml-list
On Wed, 4 Aug 2004 15:24:28 +1000 (EST), effbiae@ivorykite.com
<effbiae@ivorykite.com> wrote:
> was the C hard to read or the O'Caml? Any style tips for my caml?
Mmm. They were both pretty blinding. For simple Caml style, read
some code that's around. There's bad style and good style and
inbetween style, but it all pretty much works.
> if i write a c extension that mmaps and msyncs then will the vector
> element assignment become a call rather than a movb (or movl)? that is,
> is Bigarray a 'special' c extension that ocaml knows how to optimize and
> access just like C or is it a c extension that i can model my C extension
> code on?
The basic idea is that you would take something that you might
otherwise do as a long sequence of calls and turn it into a single
call. For example, if you're interested in blitting strings (which
are essentially byte arrays) into a Bigarray containing bytes, you
might write a C function that checks the bounds one time, converts the
O'Caml integers to native C integers one time, and then just does the
fastest memory copy it can. This will turn into a function call, but
since the main idea is mainly just to amortize the necessary overhead
across a larger amount of data, it should be preferable.
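
[Editor's sketch] The amortizing pattern described above, written as plain C. This is a sketch under assumptions: blit_bytes is a hypothetical name, and in a real stub the arguments would arrive as OCaml values and be converted once at the top before a routine like this is called.

```c
#include <stddef.h>
#include <string.h>

/* Copy len bytes from src[src_off..] to dst[dst_off..].
   The bounds are checked once for the whole range rather than per element.
   Returns 0 on success, -1 if either range falls outside its buffer. */
int blit_bytes(const unsigned char *src, size_t src_len, size_t src_off,
               unsigned char *dst, size_t dst_len, size_t dst_off, size_t len)
{
    /* Written as subtractions so the checks cannot overflow. */
    if (src_off > src_len || len > src_len - src_off) return -1;
    if (dst_off > dst_len || len > dst_len - dst_off) return -1;

    /* memmove rather than memcpy, in case the two regions overlap. */
    memmove(dst + dst_off, src + src_off, len);
    return 0;
}
```

The cost of the call and of the checks is paid once per blit, so it shrinks relative to the work as len grows.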
> can i specify that an int32 is held in a register or does the compiler do
> this?
I would expect (and I may be mistaken) that if you have an int32 that
is scoped to just a given function or loop, you can expect it to go
into a register (if there are enough registers to go around.) Or, for
example, when you have a single expression (Int32.add 5l (Int32.mul x
3l)) it's not going to be allocating a box for all of those constants,
nor for an intermediate result. When in doubt, try it and take a look
at the assembly file from ocamlopt -S to get a feel for how things
work. Note that I would generally recommend that you only go to these
lengths when you know it's going to be an issue, and only after you
have actual evidence that the system is indeed not fast enough.
Your chosen testcase has more necessary overhead than most, mainly
because it's interacting heavily with a datastructure *meant* to
interoperate with C. On the whole, ocamlopt produces binaries that
are very fast. Just remember that it does best when you write things
in the most natural way for this language, and that learning what's
natural in O'Caml will take a little exposure.
John.
* [Caml-list] Re: c is 4 times faster than ocaml?
2004-08-04 7:28 ` John Prevost
@ 2004-08-04 8:18 ` Jack Andrews
2004-08-04 10:06 ` Mikhail Fedotov
0 siblings, 1 reply; 11+ messages in thread
From: Jack Andrews @ 2004-08-04 8:18 UTC
To: John Prevost; +Cc: caml-list
John Prevost said:
>> was the C hard to read or the O'Caml? Any style tips for my caml?
>
> Mmm. They were both pretty blinding.
my c style is inspired by arthur whitney of kx.com. he is a genius. his
language, k, is superquick. it's an APL dialect. he's written kdb in k,
and it goes like the clappers. the most impressive thing is that k comes
in at <100Kb and kdb <50Kb. he's a genius.
> The basic idea is that you would take something that you might
> otherwise do as a long sequence of calls and turn it into a single
> call.
yeah, i'm familiar with the pattern. basically, i want to write my dbms
core in ocaml -- my only other option at the moment is c. i have to say
that looking at the -S output i am given great hope that ocaml has got
what it takes. i thought i'd never find a functional language that was
fast, but i always believed it was possible to write a fast compiler for
one! (i was brought up on miranda and prolog)
> ... but
> since the main idea is mainly just to amortize the necessary overhead
> across a larger amount of data, it should be preferable.
the only interface where such amortizing could occur is the API to the
database core, but i want to write the core in ocaml and i think it's
possible (see thread 'what is this magic?')
> Your chosen testcase has more necessary overhead than most, mainly
> because it's interacting heavily with a datastructure *meant* to
> interoperate with C.
you mean ocaml is not a suitable language for developing a dbms?
thanks,
jack
* Re: [Caml-list] Re: c is 4 times faster than ocaml?
2004-08-04 8:18 ` [Caml-list] " Jack Andrews
@ 2004-08-04 10:06 ` Mikhail Fedotov
2004-08-04 10:25 ` [Caml-list] " Jack Andrews
0 siblings, 1 reply; 11+ messages in thread
From: Mikhail Fedotov @ 2004-08-04 10:06 UTC
To: John Prevost; +Cc: caml-list
Jack Andrews wrote:
>yeah, i'm familiar with the pattern. basically, i want to write my dbms
>core in ocaml -- my only other option at the moment is c. i have to say
>
>
Out of curiosity, why don't you want to use the existing ones with
c/ocaml mappings (sqlite, for instance)?
Mikhail
* Re: [Caml-list] c is 4 times faster than ocaml?
2004-08-04 10:06 ` Mikhail Fedotov
@ 2004-08-04 10:25 ` Jack Andrews
2004-08-04 15:38 ` [Caml-list] custom mmap modeled on bigarray Jack Andrews
0 siblings, 1 reply; 11+ messages in thread
From: Jack Andrews @ 2004-08-04 10:25 UTC
To: Mikhail Fedotov; +Cc: caml-list
Mikhail Fedotov said:
> Out of curiosity, why don't you want to use the existing ones with
> c/ocaml mappings (sqlite, for instance)?
the existing ones aren't that good at storing and querying a terabyte.
* [Caml-list] custom mmap modeled on bigarray
2004-08-04 10:25 ` [Caml-list] " Jack Andrews
@ 2004-08-04 15:38 ` Jack Andrews
2004-08-10 5:06 ` Jack Andrews
0 siblings, 1 reply; 11+ messages in thread
From: Jack Andrews @ 2004-08-04 15:38 UTC
To: caml-list
sorry if i was at all obnoxious earlier -- i'm a bit manic at the moment.
with that in mind... i've written a prototype C library for mmapping files
and viewing the map as an array of ints. i (optimistically) tried
substituting my get/set externals with the inlineable "%bigarray_ref_1"
and "%bigarray_set_1" with the expected result -- bang!
is there a way to hack this so %bigarray_ref_1 will work? do i need to
lay out my custom struct in a particular way?
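
[Editor's sketch] On the layout question: the 32-bit assembly quoted earlier in the thread reads the data pointer at offset 4 and the length at offset 20 from the block pointer. The struct below is a hypothetical mirror of the leading fields an inlined access would seem to expect, inferred from those offsets. It is not the runtime's own declaration, and ba_like and ba_like_get_u8 are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a one-dimensional bigarray block, including the
   custom-operations word that precedes the user data. Inferred from the
   offsets in the assembly, so treat every field as an assumption. */
struct ba_like {
    void    *ops;       /* word 0: custom operations pointer */
    void    *data;      /* word 1: start of the element storage */
    intptr_t num_dims;  /* word 2: number of dimensions */
    intptr_t flags;     /* word 3: element kind and layout */
    void    *proxy;     /* word 4: sharing information */
    intptr_t dim[1];    /* word 5: length of the first dimension */
};

/* Bounds-checked byte read in the style of the inlined access.
   Returns 0 and stores the byte in *out, or -1 if i is out of range. */
int ba_like_get_u8(const struct ba_like *b, intptr_t i, unsigned char *out)
{
    /* One unsigned comparison rejects both negative and too-large indices,
       like the cmpl / jbe pair in the listing. */
    if ((uintptr_t)i >= (uintptr_t)b->dim[0]) return -1;
    *out = ((const unsigned char *)b->data)[i];
    return 0;
}
```

If this reading is right, a custom struct that only has its data pointer in the first slot would still hand the inlined code garbage for the flags and the length. Note also John's earlier remark that ocamlopt falls back to a C call when the element type is not known exactly, which would be the case for a polymorphic 'a t.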
without further ado, here is the mm.ml program, Makefile and mm.c native
code. (i've made my C code look more conventional than my last post)
$ cat mm.ml
external init : unit -> unit = "mm_init"
let _ = init()
module Mm = struct
type 'a t
external create: string -> int -> 'a t = "mm_create"
external mopen: string -> 'a t = "mm_open"
external resize: 'a t -> int -> unit = "mm_resize"
external sync: 'a t -> unit = "mm_sync"
(* here's the slow get/set..... *)
(* note that 'a is always int for test purposes *)
external slow_get: 'a t -> int -> 'a = "mm_get_int"
external slow_set: 'a t -> int -> 'a -> unit = "mm_set_int"
(********************************************************)
(* ... and here's the optimistic experiment *)
external get: 'a t -> int -> 'a = "%bigarray_ref_1"
external set: 'a t -> int -> 'a -> unit = "%bigarray_set_1"
(********************************************************)
end;;
(* this program crashes after successful completion and before
the finalizer for mm is called...?
*)
let mm=Mm.create "tmp1" 1024 in
for i=0 to 200 do Mm.slow_set mm i i done;
Mm.sync mm;;
let mm=Mm.mopen "tmp1" in
print_string "expecting eleven, got ";print_int (Mm.slow_get mm 11);
print_newline();;
let mm=Mm.mopen "tmp1"
and odds=ref 0 in
for i=0 to 200 do
if (Mm.slow_get mm i) land 1 = 1 then odds:=!odds+1
done;
print_string "number of odds: ";print_int !odds;print_newline();;
let optimistic=false in
if optimistic then begin
let mm=Mm.mopen "tmp1"
and odds=ref 0 in
for i=0 to 200 do
if (Mm.get mm i) land 1 = 1 then odds:=!odds+1
done;
let mm=Mm.mopen "tmp2" in
for i=0 to 200 do Mm.set mm i i done;
Mm.sync mm;
end;;
$ cat mm.c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <errno.h>
#include <caml/alloc.h>
#include <caml/custom.h>
#include <caml/fail.h>
#include <caml/intext.h>
#include <caml/memory.h>
#include <caml/mlvalues.h>
/******************************************************\
** CHK(mm) is error handler - calls failwith **
** CHKp(mm) appends system error from strerror **
\******************************************************/
#define CHKbase(x,y) do { if(!(int)(x))\
{sprintf(err,"!CHK (%s:%d): %s",__FILE__,__LINE__,#x);\
if(y)sprintf(err+strlen(err),"\n%s",strerror(errno));\
failwith(err);\
}} while(0)
#define CHK(x) CHKbase(x,0)
#define CHKp(x) CHKbase(x,1)
static char* err;static long errz;/* error buf and buf size */
/*page size and mask for aligning in mmap*/
static size_t page_size;static unsigned long page_mask;
/* MMP is the structure of the file: a UL header specifying length
** of array, followed by the array
*/
typedef struct _MMP{
unsigned long n;/* number of bytes in ar */
int ar[];/* the data in the array to be cast to any type */
}*MMP;
typedef struct _MM {
void* data; /* a copy of mmp->ar for use as bigarray */
char* filename;
long fd; /* file descriptor */
MMP mmp; /* the return value of mmap is stored here */
unsigned long fz;/* the size of the file (will be multiples of page_size) */
}*MM;
/******************************************************\
** Here are the C functions that are glued to ocaml **
\******************************************************/
MM nMM() /* malloc a MM -- only for use in C test code */
{MM mm=malloc(sizeof(struct _MM));
memset(mm,0,sizeof(struct _MM));
return mm;
}
int mapMM(MM mm,unsigned long n) /*used by create, open and realloc*/
{unsigned long fz=(n+sizeof(unsigned long)+(page_size-1))&page_mask;
CHK(fz>0);
if(fz>mm->fz)
{unsigned long lsk,zero=0;
CHKp((lsk=lseek(mm->fd,fz-1,SEEK_SET))==fz-1);
CHKp(1==write(mm->fd,&zero,1));CHK(lseek(mm->fd,0,SEEK_CUR)==fz);
}
void* p;
p=mmap(0,fz,PROT_READ|PROT_WRITE,MAP_FILE|MAP_SHARED,mm->fd,0);
CHKp(p&&-1!=(long)p);
mm->mmp=p;
mm->mmp->n=n;
mm->data=mm->mmp->ar;
mm->fz=fz;
return n;
}
/* mode in {'c':create,'o':open}*/
int iniMM(MM mm,const char* filename,char mode,unsigned long size)
{int open_mode=O_RDWR,permissions=S_IRUSR|S_IWUSR;
mm->filename=strdup(filename);
switch(mode)
{case 'c':
open_mode|=O_CREAT;
CHKp(-1!=(mm->fd=open(filename,open_mode,permissions)));
return mapMM(mm,size);
case 'o':
{FILE *fp;CHK(fp=fopen(filename,"r"));
unsigned long this_size;
CHK(1==fread(&this_size,sizeof(unsigned long),1,fp));
CHK(!fclose(fp));
CHKp(-1!=(mm->fd=open(filename,open_mode)));
CHK(((mm->fz=lseek(mm->fd,0,SEEK_END))&page_mask)==mm->fz);
CHK(this_size+sizeof(unsigned long)<=mm->fz);
return mapMM(mm,this_size);/*ignore size - use size from head of file*/
}}return -1;
}
int unMM(MM mm)
{unsigned long oldn=mm->mmp->n;CHK(mm->fz);
CHKp(!munmap(mm->mmp,mm->fz));
FILE *fp;CHKp(fp=fopen(mm->filename,"r"));
unsigned long n;CHK(1==fread(&n,sizeof(unsigned long),1,fp));
CHK(!fclose(fp));CHK(n==oldn);return 0;
}
int finalizeMM(MM mm)
{CHK(!unMM(mm)); /* unMM returns 0 on success */
CHK(!close(mm->fd));
free(mm->filename);
return 0;
}
int resizeMM(MM mm,unsigned long n)
{if(n>(mm->fz-sizeof(unsigned long)))
{unMM(mm);
CHK(0<mapMM(mm,n));
}
mm->mmp->n=n;
return n;
}
int syncMM(MM mm)
{CHKp(!msync(mm->mmp,mm->fz,MS_SYNC));
return 0;
}
/******************************************************\
** Here's the ocaml interface **
\******************************************************/
#define v2mm MM mm=(MM)Data_custom_val(vmm)
static void mm_finalize(value vmm)
{v2mm;
printf("mm_finalize\n");
finalizeMM(mm);
}
static struct custom_operations mm_ops = {
"mm", mm_finalize,
custom_compare_default, custom_hash_default,
custom_serialize_default, custom_deserialize_default
};
CAMLprim value mm_init(value unit) /*must be called before use*/
{register_custom_operations(&mm_ops);
page_size = (size_t) sysconf (_SC_PAGESIZE);
page_mask=0;unsigned long ps=page_size;
unsigned long pbit;for(pbit=0;!(ps&1);pbit++)ps>>=1;
page_mask=(((unsigned long)-1)>>pbit)<<pbit;
errz=1024;err=malloc(errz);if(!err)failwith("unable to alloc error buffer");
return Val_unit;
}
value mm_ini(value vfn,char mode,value vsize)
{value vmm=alloc_custom(&mm_ops,sizeof(struct _MM),1,100);
v2mm;
CHK(mode=='c'||mode=='o');
{unsigned long size=-1;
if(mode=='c')
size=Long_val(vsize)*sizeof(unsigned long);
iniMM(mm,String_val(vfn),mode,size);
}
return vmm;
}
CAMLprim value mm_create(value vfn, value vsize) /* see iniMM 'c' */
{return mm_ini(vfn,'c',vsize);
}
CAMLprim value mm_open(value vfn) /* see iniMM 'o' */
{return mm_ini(vfn,'o',-1);
}
CAMLprim value mm_resize(value vmm, value vsize) /* see resizeMM */
{v2mm;
resizeMM(mm,Long_val(vsize));
return Val_unit;
}
CAMLprim value mm_sync(value vmm) /* see syncMM */
{v2mm;
syncMM(mm);
return Val_unit;
}
/******************************************************\
** **
** Here's where we want bigarray_ref_1 instead **
** **
\******************************************************/
CAMLprim value mm_get_int(value vmm, value vind)
{/* the MM struct lives in the custom block's data area, not at vmm itself */
long*ar=((MM)Data_custom_val(vmm))->data;return Val_int(ar[Int_val(vind)]);
}
/******************************************************\
** **
** Here's where we want bigarray_set_1 instead **
** **
\******************************************************/
CAMLprim value mm_set_int(value vmm, value vind, value newval)
{/* the MM struct lives in the custom block's data area, not at vmm itself */
long*ar=((MM)Data_custom_val(vmm))->data;ar[Int_val(vind)]=Int_val(newval);return Val_unit;
}
$ cat Makefile
all:
	ocamlc mm.c
	ocamlc -custom mm.o bigarray.cma mm.ml -o mm
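
[Editor's sketch] The page rounding that mm_init and mapMM perform in mm.c can be looked at in isolation. The helper names below are hypothetical, and the sketch assumes the page size is a power of two, as the shift loop in mm_init also does.

```c
#include <stddef.h>

/* Mask that clears the offset-within-page bits.
   page_size must be a power of two. */
unsigned long page_mask_for(unsigned long page_size)
{
    return ~(page_size - 1UL);
}

/* Round n up to a whole number of pages. */
unsigned long round_up_to_page(unsigned long n, unsigned long page_size)
{
    return (n + page_size - 1UL) & page_mask_for(page_size);
}
```

For a 4096-byte page this yields the same mask as shifting an all-ones word right and then left by 12 bits.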
* Re: [Caml-list] custom mmap modeled on bigarray
2004-08-04 15:38 ` [Caml-list] custom mmap modeled on bigarray Jack Andrews
@ 2004-08-10 5:06 ` Jack Andrews
2004-08-11 14:52 ` Eric Stokes
0 siblings, 1 reply; 11+ messages in thread
From: Jack Andrews @ 2004-08-10 5:06 UTC
To: caml-list
i know the argument for developing first, optimizing later. i also know
the arguments for not caring about fine-grain performance and to look
at the big picture. i've argued both and seen where these arguments fail.
consider compressed data as a bit stream from disk.
say it has simple encoding:
phrase :=
byte:<number-of-bits>, byte:<number-of-values>, int[]:<bit-stream>
eg:
0x03 0x0a 0b1110 0011 1001 0100 1110 0101 1101 1100
|    |    +<bit-stream>
|    +number-of-values
+number-of-bits
represents the sequence of 10 3-bit numbers:
111,000,111,001,010,011,100,101,110,111
now consider a sentence as
sentence := <empty> | phrase, sentence
without an enhanced FFI, ocaml will be considerably slower than C for
uncompressing (and compressing).
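
[Editor's sketch] A C decoder for one phrase of the encoding above. Assumptions that the post does not settle: values are packed most significant bit first, the payload is padded to a whole byte rather than a whole int, and decode_phrase is a hypothetical name.

```c
#include <stddef.h>

/* Decode one phrase: byte 0 = bits per value, byte 1 = number of values,
   then the packed values, most significant bit first (an assumption).
   Writes the values to out, which has room for out_cap entries.
   Returns the number of input bytes consumed, or 0 on malformed input. */
size_t decode_phrase(const unsigned char *in, size_t in_len,
                     unsigned *out, size_t out_cap, size_t *out_count)
{
    if (in_len < 2) return 0;
    unsigned nbits = in[0], nvals = in[1];
    if (nbits == 0 || nbits > 32 || nvals > out_cap) return 0;

    /* Bytes of packed data, rounded up to a whole byte. */
    size_t payload = ((size_t)nbits * nvals + 7) / 8;
    if (in_len - 2 < payload) return 0;

    const unsigned char *bits = in + 2;
    size_t bitpos = 0;
    for (unsigned v = 0; v < nvals; ++v) {
        unsigned x = 0;
        for (unsigned b = 0; b < nbits; ++b, ++bitpos) {
            /* Pick bit number bitpos, counting from the top of each byte. */
            unsigned bit = (bits[bitpos / 8] >> (7 - bitpos % 8)) & 1u;
            x = (x << 1) | bit;
        }
        out[v] = x;
    }
    *out_count = nvals;
    return 2 + payload;
}
```

Fed the example bytes 0x03 0x0a 0xE3 0x94 0xE5 0xDC, this recovers the ten values 7,0,7,1,2,3,4,5,6,7. A sentence is decoded by calling it repeatedly, advancing by the returned byte count until the input is exhausted.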
in my previous post, i suggest that some language primitives similar to
%bigarray_ref_1 could be introduced to make ocaml comparable to C. i
have investigated this possibility, and my suggestion is that
%bigarray_ref is replaced by a primitive %ffi_ref and made public.
then bigarray can be built on the more general %ffi_ref and developers
have a fast means of accessing C arrays like mmap regions.
if i spend time implementing %ffi_ref/set, is there any chance of it being
incorporated into ocaml?
thanks,
jack
* Re: [Caml-list] custom mmap modeled on bigarray
2004-08-10 5:06 ` Jack Andrews
@ 2004-08-11 14:52 ` Eric Stokes
0 siblings, 0 replies; 11+ messages in thread
From: Eric Stokes @ 2004-08-11 14:52 UTC
To: effbiae; +Cc: caml-list
Even if it isn't, it can be distributed as a GODI patch fairly easily. I
think a patch such as this, if architected well, would allow me to improve
the performance of some of my C bindings even more, and that is a good thing.
On Aug 9, 2004, at 10:06 PM, Jack Andrews wrote:
> i know the argument for developing first, optimizing later. i also
> know
> the arguments for not caring about fine-grain performance and to look
> at the big picture. i've argued both and seen where these arguments
> fail.
>
> consider compressed data as a bit stream from disk.
> say it has simple encoding:
> phrase :=
> byte:<number-of-bits>, byte:<number-of-values>, int[]:<bit-stream>
> eg:
> 0x03 0x0a 0b1110 0011 1001 0100 1110 0101 1101 1100
> |    |    +<bit-stream>
> |    +number-of-values
> +number-of-bits
>
> represents the sequence of 10 3-bit numbers:
>
> 111,000,111,001,010,011,100,101,110,111
>
> now consider a sentence as
> sentence := <empty> | phrase, sentence
>
> without an enhanced FFI, ocaml will be considerably slower than C for
> uncompressing (and compressing).
>
> in my previous post, i suggest that some language primitives similar to
> %bigarray_ref_1 could be introduced to make ocaml comparable to C. i
> have investigated this possibility, and my suggestion is that
> %bigarray_ref is replaced by a primitive %ffi_ref and made public.
> then bigarray can be built on the more general %ffi_ref and developers
> have a fast means of accessing C arrays like mmap regions.
>
> if i spend time implementing %ffi_ref/set, is there any chance of it
> being
> incorporated into ocaml?
>
> thanks,
>
>
>
> jack
Thread overview: 11+ messages
2004-08-04 2:39 [Caml-list] c is 4 times faster than ocaml? effbiae
2004-08-04 4:59 ` John Prevost
2004-08-04 5:05 ` John Prevost
2004-08-04 5:24 ` effbiae
2004-08-04 7:28 ` John Prevost
2004-08-04 8:18 ` [Caml-list] " Jack Andrews
2004-08-04 10:06 ` Mikhail Fedotov
2004-08-04 10:25 ` [Caml-list] " Jack Andrews
2004-08-04 15:38 ` [Caml-list] custom mmap modeled on bigarray Jack Andrews
2004-08-10 5:06 ` Jack Andrews
2004-08-11 14:52 ` Eric Stokes