From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 32C8FBC59 for ; Wed, 24 Nov 2010 08:14:36 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AlECAMtJ7EzUUiSF/2dsb2JhbACDTpBjjzSsD5BwhFlzBIUvhTA X-IronPort-AV: E=Sophos;i="4.59,247,1288566000"; d="diff'?scan'208";a="67941104" Received: from one4vision-tunnel.absint.com (HELO mail.absint.com) ([212.82.36.133]) by mail3-smtp-sop.national.inria.fr with ESMTP; 24 Nov 2010 08:14:35 +0100 Received: from mail.absint.com (localhost [127.0.0.1]) by mail.absint.com (Postfix) with ESMTP id 7420611C008B for ; Wed, 24 Nov 2010 08:14:28 +0100 (CET) X-Spam-Checker-Version: SpamAssassin 3.3.1-spampd (2010-03-16) on mail.absint.com X-Spam-Level: Received: from absint.com (imp.absint.com [192.168.10.48]) by mail.absint.com (Postfix) with ESMTP for ; Wed, 24 Nov 2010 08:14:28 +0100 (CET) Received: by absint.com (Postfix, from userid 15028) id 61695105763; Wed, 24 Nov 2010 08:14:28 +0100 (CET) From: Christoph Cullmann Organization: AbsInt To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] OCaml + mingw-w64 Date: Wed, 24 Nov 2010 08:14:28 +0100 User-Agent: KMail/1.13.2 (Linux/2.6.35-14-generic; KDE/4.4.2; x86_64; ; ) References: <2017975850.988226.1290427453930.JavaMail.root@zmbs1.inria.fr> <201011231731.12846.cullmann@absint.de> In-Reply-To: <201011231731.12846.cullmann@absint.de> MIME-Version: 1.0 Content-Type: Multipart/Mixed; boundary="Boundary-00=_UtL7MMnFcfdZwFn" Message-Id: <201011240814.28105.cullmann@absint.de> X-Spam: no; 0.00; ocaml:01 damien:01 ocaml:01 byterun:01 arises:01 asmrun:01 asmcomp:01 mls:01 patched:01 segfault:01 23,:98 gesch:98 ferdinand:98 doligez:01 wrote:01 X-Attachments: type="text/x-patch" cset="UTF-8" name="flexdll.0.26.diff" name="flexdll.0.26.diff" type="text/x-patch" cset="UTF-8" name="ocaml.3.12.diff" name="ocaml.3.12.diff" --Boundary-00=_UtL7MMnFcfdZwFn Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable On Tuesday, November 23, 2010 05:31:12 pm Christoph Cullmann wrote: > On Monday, November 22, 2010 04:33:08 pm Damien Doligez wrote: > > On 2010-11-22, at 13:04, Christoph Cullmann wrote: > > > In ocaml 3.12, in byterun/major_gc.h, there is: > > >=20 > > >=20 > > > intnat caml_major_collection_slice (long howmuch) > > >=20 > > >=20 > > >=20 > > > whereas in the .c file that is: > > >=20 > > >=20 > > > intnat caml_major_collection_slice (intnat howmuch) > > >=20 > > >=20 > > >=20 > > > As intnat is with mingw-w64 long long, this doesn't match. > > > Should not the header use intnat, too? > >=20 > > You're right. This will be fixed in 3.12.1. >=20 > Thanks. >=20 > We just started here to port ocaml to mingw-w64, question arises: >=20 > Which files need adoption? >=20 > I have seen that in asmrun the amd64.S needs patches and in asmcomp the > emit and proc mls. >=20 > Do I miss files? We started to have some version that at least links (with patched flexdll) = but=20 runs in a nice segfault already ;) Attached are the current changed states, any ideas where to look? Atm not seeing the obvious fault :/ Greetings Christoph =2D-=20 =2D------------------------------------- Christoph Cullmann --------- AbsInt Angewandte Informatik GmbH Email: cullmann@AbsInt.com Science Park 1 Tel: +49-681-38360-22 66123 Saarbr=FCcken Fax: +49-681-38360-20 GERMANY WWW: http://www.AbsInt.com =2D------------------------------------------------------------------- Gesch=E4ftsf=FChrung: Dr.-Ing. Christian Ferdinand Eingetragen im Handelsregister des Amtsgerichts Saarbr=FCcken, HRB 11234 --Boundary-00=_UtL7MMnFcfdZwFn Content-Type: text/x-patch; charset="UTF-8"; name="flexdll.0.26.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="flexdll.0.26.diff" diff --combined ocaml/flexdll/cmdline.ml index eccd13e,ae96a0e..0000000 --- a/ocaml/flexdll/cmdline.ml +++ b/ocaml/flexdll/cmdline.ml @@@ -18,7 -18,7 +18,7 @@@ let use_default_libs = ref tru let subsystem = ref "console" let explain = ref false let builtin_linker = ref false -let toolchain : [ `MSVC | `MSVC64 | `MINGW | `CYGWIN | `LIGHTLD ] ref = ref `MSVC +let toolchain : [ `MSVC | `MSVC64 | `MINGW | `MINGW64 | `CYGWIN | `LIGHTLD ] ref = ref `MSVC let save_temps = ref false let show_exports = ref false let show_imports = ref false @@@ -87,7 -87,7 +87,7 @@@ let specs = "-l", Arg.String (fun s -> files := ("-l" ^ s) :: !files), " Library file"; - "-chain", Arg.Symbol (["msvc";"msvc64";"cygwin";"mingw";"ld"], + "-chain", Arg.Symbol (["msvc";"msvc64";"cygwin";"mingw";"mingw64";"ld"], (fun s -> machine := `x86; underscore := true; toolchain := match s with @@@ -95,7 -95,6 +95,7 @@@ | "msvc64" -> machine := `x64; underscore := false; `MSVC64 | "cygwin" -> `CYGWIN | "mingw" -> `MINGW + | "mingw64" -> machine := `x64; underscore := false; `MINGW64 | "ld" -> `LIGHTLD | _ -> assert false)), " Choose which linker to use"; diff --combined ocaml/flexdll/reloc.ml index eb331dc,9a37b5b..0000000 --- a/ocaml/flexdll/reloc.ml +++ b/ocaml/flexdll/reloc.ml @@@ -112,7 -112,7 +112,7 @@@ type cmdline = let new_cmdline () = let rf = match !toolchain with | `MSVC | `MSVC64 | `LIGHTLD -> true - | `MINGW | `CYGWIN -> false + | `MINGW | `MINGW64 | `CYGWIN -> false in { may_use_response_file = rf; @@@ -160,7 -160,7 +160,7 @@@ let cygpath l let gcclib () = let extra = match !toolchain with - | `MINGW -> "-mno-cygwin " + | `MINGW | `MINGW64 -> "-mno-cygwin " | _ -> "" in Filename.dirname (get_output1 ~use_bash:(!toolchain = `CYGWIN) (Printf.sprintf "gcc %s-print-libgcc-file-name" extra)) @@@ -492,7 -492,7 +492,7 @@@ let parse_dll_exports fn let dll_exports fn = match !toolchain with | `MSVC | `MSVC64 | `LIGHTLD -> failwith "Creation of import library not supported for this toolchain" - | `CYGWIN | `MINGW -> + | `CYGWIN | `MINGW | `MINGW64 -> let dmp = temp_file "dyndll" ".dmp" in if cmd_verbose (Printf.sprintf "objdump -p %s > %s" fn dmp) <> 0 then failwith "Error while extracting exports from a DLL"; @@@ -898,7 -898,7 +898,7 @@@ let build_dll link_exe output_file file files def_file extra_args - | `MINGW -> + | `MINGW | `MINGW64 -> let def_file = if main_pgm then "" else @@@ -911,7 -911,7 +911,7 @@@ "gcc -mno-cygwin -m%s %s%s -L. %s %s -o %s %s %s %s %s %s" !subsystem (if link_exe = `EXE then "" else "-shared ") - (if main_pgm then "" else if !noentry then "-Wl,-e0 " else "-Wl,-e_FlexDLLiniter@12 ") + (if main_pgm then "" else if !noentry then "-Wl,-e0 " else match !machine with | `x86 -> "-Wl,-e_FlexDLLiniter@12 " | `x64 -> "-Wl,-e_FlexDLLiniter " ) (mk_dirs_opt "-I") (mk_dirs_opt "-L") (Filename.quote output_file) @@@ -983,7 -983,7 +983,7 @@@ let setup_toolchain () parse_libpath (try Sys.getenv "LIB" with Not_found -> ""); if not !custom_crt then default_libs := ["msvcrt.lib"] - | `MINGW -> + | `MINGW | `MINGW64 -> search_path := !dirs @ [ "/lib/mingw"; @@@ -1015,7 -1015,7 +1015,7 @@@ let compile_if_needed file (Filename.quote tmp_obj) (mk_dirs_opt "-I") file - | `MINGW -> + | `MINGW | `MINGW64 -> Printf.sprintf "gcc -mno-cygwin -c -o %s %s %s" (Filename.quote tmp_obj) @@@ -1054,7 -1054,6 +1054,7 @@@ let all_files () | `MSVC -> "msvc.obj" | `MSVC64 -> "msvc64.obj" | `CYGWIN -> "cygwin.o" + | `MINGW64 -> "mingw64.o" | `MINGW | `LIGHTLD -> "mingw.o" in if !exe_mode <> `DLL then if !add_flexdll_obj then f ("flexdll_" ^ tc) :: files @@@ -1073,7 -1072,7 +1073,7 @@@ let main () | _, `Yes -> true | _, `No -> false | `CYGWIN, `None -> (Sys.command "cygpath -v 2>/dev/null >/dev/null" = 0) - | `MINGW, `None -> (Sys.command "cygpath -v 2>NUL >NUL" = 0) + | (`MINGW|`MINGW64), `None -> (Sys.command "cygpath -v 2>NUL >NUL" = 0) | (`MSVC|`MSVC64|`LIGHTLD), `None -> false end; diff --git a/ocaml/flexdll/version.ml b/ocaml/flexdll/version.ml new file mode 100755 index 0000000..47eaa2f --- /dev/null +++ b/ocaml/flexdll/version.ml @@ -0,0 +1 @@ +let version = "0.26" --Boundary-00=_UtL7MMnFcfdZwFn Content-Type: text/x-patch; charset="UTF-8"; name="ocaml.3.12.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="ocaml.3.12.diff" diff --combined ocaml/3.12/asmrun/Makefile.nt index 7d72312,f9ed767..0000000 --- a/ocaml/3.12/asmrun/Makefile.nt +++ b/ocaml/3.12/asmrun/Makefile.nt @@@ -54,9 -54,6 +54,9 @@@ amd64nt.obj: amd64nt.as i386.o: i386.S $(CC) -c -DSYS_$(SYSTEM) i386.S +amd64.o: amd64.S + $(CC) -c -DSYS_$(SYSTEM) amd64.S + install: cp libasmrun.$(A) $(LIBDIR) diff --combined ocaml/3.12/asmrun/amd64.S index abf74ee,8eb4ebf..0000000 --- a/ocaml/3.12/asmrun/amd64.S +++ b/ocaml/3.12/asmrun/amd64.S @@@ -31,21 -31,6 +31,21 @@@ .align FUNCTION_ALIGN; \ name: +#elif defined(SYS_mingw64) + +#undef __PIC__ + +#define G(r) r +#define GREL(r) r +#define GCALL(r) r +#define FUNCTION_ALIGN 16 +#define EIGHT_ALIGN 8 +#define SIXTEEN_ALIGN 16 +#define FUNCTION(name) \ + .globl name; \ + .align FUNCTION_ALIGN; \ + name: + #else #define G(r) r @@@ -106,7 -91,7 +106,7 @@@ popq %r11 #else - + /* Non-PIC operations on global variables. Slightly faster. */ #define STORE_VAR(srcreg,dstlabel) \ @@@ -179,13 -164,7 +179,13 @@@ FUNCTION(G(caml_call_gc) movlpd %xmm14, 14*8(%rsp) movlpd %xmm15, 15*8(%rsp) /* Call the garbage collector */ +#ifdef SYS_mingw64 + subq $32, %rsp /* PR#5008: bottom 32 bytes are reserved for callee */ +#endif call GCALL(caml_garbage_collection) +#ifdef SYS_mingw64 + addq $32, %rsp /* PR#5008 */ +#endif /* Restore caml_young_ptr, caml_exception_pointer */ LOAD_VAR(caml_young_ptr, %r15) LOAD_VAR(caml_exception_pointer, %r14) @@@ -298,28 -277,6 +298,28 @@@ FUNCTION(G(caml_c_call) /* Start the Caml program */ FUNCTION(G(caml_start_program)) +#ifdef SYS_mingw64 + /* Save callee-save registers */ + pushq %rbx + pushq %rbp + pushq %rsi + pushq %rdi + pushq %r12 + pushq %r13 + pushq %r14 + pushq %r15 + subq $(8+10*16), %rsp /* stack 16-aligned + 10 saved xmm regs */ + movapd %xmm6, 0*16(%rsp) + movapd %xmm7, 1*16(%rsp) + movapd %xmm8, 2*16(%rsp) + movapd %xmm9, 3*16(%rsp) + movapd %xmm10, 4*16(%rsp) + movapd %xmm11, 5*16(%rsp) + movapd %xmm12, 6*16(%rsp) + movapd %xmm13, 7*16(%rsp) + movapd %xmm14, 8*16(%rsp) + movapd %xmm15, 9*16(%rsp) +#else /* Save callee-save registers */ pushq %rbx pushq %rbp @@@ -328,7 -285,6 +328,7 @@@ pushq %r14 pushq %r15 subq $8, %rsp /* stack 16-aligned */ +#endif /* Initial entry point is G(caml_program) */ leaq GCALL(caml_program)(%rip), %r12 /* Common code for caml_start_program and caml_callback* */ @@@ -361,28 -317,6 +361,28 @@@ POP_VAR(caml_last_return_address) POP_VAR(caml_gc_regs) addq $8, %rsp +#ifdef SYS_mingw64 + /* Restore callee-save registers. */ + movapd 0*16(%rsp), %xmm6 + movapd 1*16(%rsp), %xmm7 + movapd 2*16(%rsp), %xmm8 + movapd 3*16(%rsp), %xmm9 + movapd 4*16(%rsp), %xmm10 + movapd 5*16(%rsp), %xmm11 + movapd 6*16(%rsp), %xmm12 + movapd 7*16(%rsp), %xmm13 + movapd 8*16(%rsp), %xmm14 + movapd 9*16(%rsp), %xmm15 + addq $(8+10*16), %rsp + popq %r15 + popq %r14 + popq %r13 + popq %r12 + popq %rdi + popq %rsi + popq %rbp + popq %rbx +#else /* Restore callee-save registers. */ addq $8, %rsp popq %r15 @@@ -391,7 -325,6 +391,7 @@@ popq %r12 popq %rbp popq %rbx +#endif /* Return to caller. */ ret .L108: @@@ -409,20 -342,11 +409,20 @@@ FUNCTION(G(caml_raise_exn) popq %r14 ret .L110: +#ifdef SYS_mingw64 + movq %rax, %r12 /* Save exception bucket in r12 */ + movq %rax, %rcx /* Arg 1: exception bucket */ + movq 0(%rsp), %rdx /* arg 2: pc of raise */ + leaq 8(%rsp), %r8 /* arg 3: sp of raise */ + movq %r14, %r9 /* arg 4: sp of handler */ + subq $32, %rsp /* PR#5008: bottom 32 bytes are reserved for callee */ +#else movq %rax, %r12 /* Save exception bucket */ movq %rax, %rdi /* arg 1: exception bucket */ movq 0(%rsp), %rsi /* arg 2: pc of raise */ leaq 8(%rsp), %rdx /* arg 3: sp of raise */ movq %r14, %rcx /* arg 4: sp of handler */ +#endif call GCALL(caml_stash_backtrace) movq %r12, %rax /* Recover exception bucket */ movq %r14, %rsp @@@ -434,30 -358,17 +434,30 @@@ FUNCTION(G(caml_raise_exception)) TESTL_VAR($1, caml_backtrace_active) jne .L111 +#ifdef SYS_mingw64 + movq %rcx, %rax +#else movq %rdi, %rax +#endif LOAD_VAR(caml_exception_pointer, %rsp) /* Cut stack */ popq %r14 /* Recover previous exception handler */ LOAD_VAR(caml_young_ptr, %r15) /* Reload alloc ptr */ ret .L111: +#ifdef SYS_mingw64 + movq %rcx, %r12 /* Save exception bucket */ + /* arg 1: exception bucket */ + LOAD_VAR(caml_last_return_address,%rdx) /* arg 2: pc of raise */ + LOAD_VAR(caml_bottom_of_stack,%r8) /* arg 3: sp of raise */ + LOAD_VAR(caml_exception_pointer,%r9) /* arg 4: sp of handler */ + subq $32, %rsp /* PR#5008: bottom 32 bytes are reserved for callee */ +#else movq %rdi, %r12 /* Save exception bucket */ /* arg 1: exception bucket */ - LOAD_VAR(caml_last_return_address,%rsi) /* arg 2: pc of raise */ + LOAD_VAR(caml_last_return_address,%rsi) /* arg 2: pc of raise */ LOAD_VAR(caml_bottom_of_stack,%rdx) /* arg 3: sp of raise */ LOAD_VAR(caml_exception_pointer,%rcx) /* arg 4: sp of handler */ +#endif call GCALL(caml_stash_backtrace) movq %r12, %rax /* Recover exception bucket */ LOAD_VAR(caml_exception_pointer,%rsp) @@@ -468,31 -379,6 +468,31 @@@ /* Callback from C to Caml */ FUNCTION(G(caml_callback_exn)) +#ifdef SYS_mingw64 + /* Save callee-save registers */ + pushq %rbx + pushq %rbp + pushq %rsi + pushq %rdi + pushq %r12 + pushq %r13 + pushq %r14 + pushq %r15 + subq $(8+10*16), %rsp /* stack 16-aligned + 10 saved xmm regs */ + movapd %xmm6, 0*16(%rsp) + movapd %xmm7, 1*16(%rsp) + movapd %xmm8, 2*16(%rsp) + movapd %xmm9, 3*16(%rsp) + movapd %xmm10, 4*16(%rsp) + movapd %xmm11, 5*16(%rsp) + movapd %xmm12, 6*16(%rsp) + movapd %xmm13, 7*16(%rsp) + movapd %xmm14, 8*16(%rsp) + movapd %xmm15, 9*16(%rsp) + /* Initial loading of arguments */ + movq %rcx, %rbx /* closure */ + movq %rdx, %rax /* argument */ +#else /* Save callee-save registers */ pushq %rbx pushq %rbp @@@ -500,41 -386,14 +500,41 @@@ pushq %r13 pushq %r14 pushq %r15 - subq $8, %rsp /* stack 16-aligned */ + subq $8, %rsp /* stack 16-aligned */ /* Initial loading of arguments */ movq %rdi, %rbx /* closure */ movq %rsi, %rax /* argument */ +#endif movq 0(%rbx), %r12 /* code pointer */ jmp .Lcaml_start_program FUNCTION(G(caml_callback2_exn)) +#ifdef SYS_mingw64 + /* Save callee-save registers */ + pushq %rbx + pushq %rbp + pushq %rsi + pushq %rdi + pushq %r12 + pushq %r13 + pushq %r14 + pushq %r15 + subq $(8+10*16), %rsp /* stack 16-aligned + 10 saved xmm regs */ + movapd %xmm6, 0*16(%rsp) + movapd %xmm7, 1*16(%rsp) + movapd %xmm8, 2*16(%rsp) + movapd %xmm9, 3*16(%rsp) + movapd %xmm10, 4*16(%rsp) + movapd %xmm11, 5*16(%rsp) + movapd %xmm12, 6*16(%rsp) + movapd %xmm13, 7*16(%rsp) + movapd %xmm14, 8*16(%rsp) + movapd %xmm15, 9*16(%rsp) + /* Initial loading of arguments */ + movq %rcx, %rdi /* closure */ + movq %rdx, %rax /* first argument */ + movq %r8, %rbx /* second argument */ +#else /* Save callee-save registers */ pushq %rbx pushq %rbp @@@ -542,43 -401,15 +542,43 @@@ pushq %r13 pushq %r14 pushq %r15 - subq $8, %rsp /* stack 16-aligned */ + subq $8, %rsp /* stack 16-aligned */ /* Initial loading of arguments */ /* closure stays in %rdi */ movq %rsi, %rax /* first argument */ movq %rdx, %rbx /* second argument */ +#endif leaq GCALL(caml_apply2)(%rip), %r12 /* code pointer */ jmp .Lcaml_start_program FUNCTION(G(caml_callback3_exn)) +#ifdef SYS_mingw64 + /* Save callee-save registers */ + pushq %rbx + pushq %rbp + pushq %rsi + pushq %rdi + pushq %r12 + pushq %r13 + pushq %r14 + pushq %r15 + subq $(8+10*16), %rsp /* stack 16-aligned + 10 saved xmm regs */ + movapd %xmm6, 0*16(%rsp) + movapd %xmm7, 1*16(%rsp) + movapd %xmm8, 2*16(%rsp) + movapd %xmm9, 3*16(%rsp) + movapd %xmm10, 4*16(%rsp) + movapd %xmm11, 5*16(%rsp) + movapd %xmm12, 6*16(%rsp) + movapd %xmm13, 7*16(%rsp) + movapd %xmm14, 8*16(%rsp) + movapd %xmm15, 9*16(%rsp) + /* Initial loading of arguments */ + movq %rcx, %rsi /* closure */ + movq %rdx, %rax /* first argument */ + movq %r8, %rbx /* second argument */ + movq %r9, %rdi /* third argument */ +#else /* Save callee-save registers */ pushq %rbx pushq %rbp @@@ -586,13 -417,12 +586,13 @@@ pushq %r13 pushq %r14 pushq %r15 - subq $8, %rsp /* stack 16-aligned */ + subq $8, %rsp /* stack 16-aligned */ /* Initial loading of arguments */ movq %rsi, %rax /* first argument */ movq %rdx, %rbx /* second argument */ movq %rdi, %rsi /* closure */ movq %rcx, %rdi /* third argument */ +#endif leaq GCALL(caml_apply3)(%rip), %r12 /* code pointer */ jmp .Lcaml_start_program @@@ -612,7 -442,7 +612,7 @@@ G(caml_system__frametable) #ifdef SYS_macosx .literal16 -#else +#elif defined(SYS_linux) .section .rodata.cst8,"a",@progbits #endif .globl G(caml_negf_mask) diff --git a/ocaml/3.12/asmcomp/amd64/emit_mingw64.mlp b/ocaml/3.12/asmcomp/amd64/emit_mingw64.mlp new file mode 100644 index 0000000..6bc0bdf --- /dev/null +++ b/ocaml/3.12/asmcomp/amd64/emit_mingw64.mlp @@ -0,0 +1,791 @@ +(***********************************************************************) +(* *) +(* Objective Caml *) +(* *) +(* Xavier Leroy, projet Cristal, INRIA Rocquencourt *) +(* *) +(* Copyright 1996 Institut National de Recherche en Informatique et *) +(* en Automatique. All rights reserved. This file is distributed *) +(* under the terms of the Q Public License version 1.0. *) +(* *) +(***********************************************************************) + +(* $Id: emit.mlp 10488 2010-06-02 08:55:35Z xleroy $ *) + +(* Emission of x86-64 (AMD 64) assembly code *) + +open Misc +open Cmm +open Arch +open Proc +open Reg +open Mach +open Linearize +open Emitaux + +let macosx = + match Config.system with + | "macosx" -> true + | _ -> false + + +(* Tradeoff between code size and code speed *) + +let fastcode_flag = ref true + +let stack_offset = ref 0 + +(* Layout of the stack frame *) + +let frame_required () = + !contains_calls || num_stack_slots.(0) > 0 || num_stack_slots.(1) > 0 + +let frame_size () = (* includes return address *) + if frame_required() then begin + let sz = + (!stack_offset + 8 * (num_stack_slots.(0) + num_stack_slots.(1)) + 8) + in Misc.align sz 16 + end else + !stack_offset + 8 + +let slot_offset loc cl = + match loc with + Incoming n -> frame_size() + n + | Local n -> + if cl = 0 + then !stack_offset + n * 8 + else !stack_offset + (num_stack_slots.(0) + n) * 8 + | Outgoing n -> n + +(* Symbols *) + +let emit_symbol s = + if macosx then emit_string "_"; + Emitaux.emit_symbol '$' s + +let emit_call s = + `call {emit_symbol s}` + +let emit_jump s = + `jmp {emit_symbol s}` + +let load_symbol_addr s = + if !pic_code + then `leaq {emit_symbol s}(%rip)` + else `movq ${emit_symbol s}` + +(* Output a label *) + +let emit_label lbl = + emit_string ".L"; emit_int lbl + +(* Output a .align directive. *) + +let emit_align n = + let n = if macosx then Misc.log2 n else n in + ` .align {emit_int n}\n` + +let emit_Llabel fallthrough lbl = + if not fallthrough && !fastcode_flag then emit_align 4; + emit_label lbl + +(* Output a pseudo-register *) + +let emit_reg = function + { loc = Reg r } -> + emit_string (register_name r) + | { loc = Stack s } as r -> + let ofs = slot_offset s (register_class r) in + `{emit_int ofs}(%rsp)` + | { loc = Unknown } -> + assert false + +(* Output a reference to the lower 8, 16 or 32 bits of a register *) + +let reg_low_8_name = + [| "%al"; "%bl"; "%dil"; "%sil"; "%dl"; "%cl"; "%r8b"; "%r9b"; + "%r10b"; "%r11b"; "%bpl"; "%r12b"; "%r13b" |] +let reg_low_16_name = + [| "%ax"; "%bx"; "%di"; "%si"; "%dx"; "%cx"; "%r8w"; "%r9w"; + "%r10w"; "%r11w"; "%bp"; "%r12w"; "%r13w" |] +let reg_low_32_name = + [| "%eax"; "%ebx"; "%edi"; "%esi"; "%edx"; "%ecx"; "%r8d"; "%r9d"; + "%r10d"; "%r11d"; "%ebp"; "%r12d"; "%r13d" |] + +let emit_subreg tbl r = + match r.loc with + Reg r when r < 13 -> + emit_string tbl.(r) + | Stack s -> + let ofs = slot_offset s (register_class r) in + `{emit_int ofs}(%rsp)` + | _ -> + assert false + +let emit_reg8 r = emit_subreg reg_low_8_name r +let emit_reg16 r = emit_subreg reg_low_16_name r +let emit_reg32 r = emit_subreg reg_low_32_name r + +(* Output an addressing mode *) + +let emit_addressing addr r n = + match addr with + | Ibased _ when !Clflags.dlcode -> assert false + | Ibased(s, d) -> + `{emit_symbol s}`; + if d <> 0 then ` + {emit_int d}`; + `(%rip)` + | Iindexed d -> + if d <> 0 then emit_int d; + `({emit_reg r.(n)})` + | Iindexed2 d -> + if d <> 0 then emit_int d; + `({emit_reg r.(n)}, {emit_reg r.(n+1)})` + | Iscaled(2, d) -> + if d <> 0 then emit_int d; + `({emit_reg r.(n)}, {emit_reg r.(n)})` + | Iscaled(scale, d) -> + if d <> 0 then emit_int d; + `(, {emit_reg r.(n)}, {emit_int scale})` + | Iindexed2scaled(scale, d) -> + if d <> 0 then emit_int d; + `({emit_reg r.(n)}, {emit_reg r.(n+1)}, {emit_int scale})` + +(* Record live pointers at call points -- see Emitaux *) + +let record_frame_label live dbg = + let lbl = new_label() in + let live_offset = ref [] in + Reg.Set.iter + (function + {typ = Addr; loc = Reg r} -> + live_offset := ((r lsl 1) + 1) :: !live_offset + | {typ = Addr; loc = Stack s} as reg -> + live_offset := slot_offset s (register_class reg) :: !live_offset + | _ -> ()) + live; + frame_descriptors := + { fd_lbl = lbl; + fd_frame_size = frame_size(); + fd_live_offset = !live_offset; + fd_debuginfo = dbg } :: !frame_descriptors; + lbl + +let record_frame live dbg = + let lbl = record_frame_label live dbg in `{emit_label lbl}:\n` + +(* Record calls to the GC -- we've moved them out of the way *) + +type gc_call = + { gc_lbl: label; (* Entry label *) + gc_return_lbl: label; (* Where to branch after GC *) + gc_frame: label } (* Label of frame descriptor *) + +let call_gc_sites = ref ([] : gc_call list) + +let emit_call_gc gc = + `{emit_label gc.gc_lbl}: {emit_call "caml_call_gc"}\n`; + `{emit_label gc.gc_frame}: jmp {emit_label gc.gc_return_lbl}\n` + +(* Record calls to caml_ml_array_bound_error. + In -g mode, we maintain one call to caml_ml_array_bound_error + per bound check site. Without -g, we can share a single call. *) + +type bound_error_call = + { bd_lbl: label; (* Entry label *) + bd_frame: label } (* Label of frame descriptor *) + +let bound_error_sites = ref ([] : bound_error_call list) +let bound_error_call = ref 0 + +let bound_error_label dbg = + if !Clflags.debug then begin + let lbl_bound_error = new_label() in + let lbl_frame = record_frame_label Reg.Set.empty dbg in + bound_error_sites := + { bd_lbl = lbl_bound_error; bd_frame = lbl_frame } :: !bound_error_sites; + lbl_bound_error + end else begin + if !bound_error_call = 0 then bound_error_call := new_label(); + !bound_error_call + end + +let emit_call_bound_error bd = + `{emit_label bd.bd_lbl}: {emit_call "caml_ml_array_bound_error"}\n`; + `{emit_label bd.bd_frame}:\n` + +let emit_call_bound_errors () = + List.iter emit_call_bound_error !bound_error_sites; + if !bound_error_call > 0 then + `{emit_label !bound_error_call}: {emit_call "caml_ml_array_bound_error"}\n` + +(* Names for instructions *) + +let instr_for_intop = function + Iadd -> "addq" + | Isub -> "subq" + | Imul -> "imulq" + | Iand -> "andq" + | Ior -> "orq" + | Ixor -> "xorq" + | Ilsl -> "salq" + | Ilsr -> "shrq" + | Iasr -> "sarq" + | _ -> assert false + +let instr_for_floatop = function + Iaddf -> "addsd" + | Isubf -> "subsd" + | Imulf -> "mulsd" + | Idivf -> "divsd" + | _ -> assert false + +let instr_for_floatarithmem = function + Ifloatadd -> "addsd" + | Ifloatsub -> "subsd" + | Ifloatmul -> "mulsd" + | Ifloatdiv -> "divsd" + +let name_for_cond_branch = function + Isigned Ceq -> "e" | Isigned Cne -> "ne" + | Isigned Cle -> "le" | Isigned Cgt -> "g" + | Isigned Clt -> "l" | Isigned Cge -> "ge" + | Iunsigned Ceq -> "e" | Iunsigned Cne -> "ne" + | Iunsigned Cle -> "be" | Iunsigned Cgt -> "a" + | Iunsigned Clt -> "b" | Iunsigned Cge -> "ae" + +(* Output an = 0 or <> 0 test. *) + +let output_test_zero arg = + match arg.loc with + Reg r -> ` testq {emit_reg arg}, {emit_reg arg}\n` + | _ -> ` cmpq $0, {emit_reg arg}\n` + +(* Output a floating-point compare and branch *) + +let emit_float_test cmp neg arg lbl = + (* Effect of comisd on flags and conditional branches: + ZF PF CF cond. branches taken + unordered 1 1 1 je, jb, jbe, jp + > 0 0 0 jne, jae, ja + < 0 0 1 jne, jbe, jb + = 1 0 0 je, jae, jbe. + If FP traps are on (they are off by default), + comisd traps on QNaN and SNaN but ucomisd traps on SNaN only. + *) + match (cmp, neg) with + | (Ceq, false) | (Cne, true) -> + let next = new_label() in + ` ucomisd {emit_reg arg.(1)}, {emit_reg arg.(0)}\n`; + ` jp {emit_label next}\n`; (* skip if unordered *) + ` je {emit_label lbl}\n`; (* branch taken if x=y *) + `{emit_label next}:\n` + | (Cne, false) | (Ceq, true) -> + ` ucomisd {emit_reg arg.(1)}, {emit_reg arg.(0)}\n`; + ` jp {emit_label lbl}\n`; (* branch taken if unordered *) + ` jne {emit_label lbl}\n` (* branch taken if xy *) + | (Clt, _) -> + ` comisd {emit_reg arg.(0)}, {emit_reg arg.(1)}\n`; (* swap compare *) + if not neg then + ` ja {emit_label lbl}\n` (* branch taken if y>x i.e. x + ` comisd {emit_reg arg.(0)}, {emit_reg arg.(1)}\n`; (* swap compare *) + if not neg then + ` jae {emit_label lbl}\n` (* branch taken if y>=x i.e. x<=y *) + else + ` jb {emit_label lbl}\n` (* taken if unordered or y + ` comisd {emit_reg arg.(1)}, {emit_reg arg.(0)}\n`; + if not neg then + ` ja {emit_label lbl}\n` (* branch taken if x>y *) + else + ` jbe {emit_label lbl}\n` (* taken if unordered or x<=y i.e. !(x>y) *) + | (Cge, _) -> + ` comisd {emit_reg arg.(1)}, {emit_reg arg.(0)}\n`; (* swap compare *) + if not neg then + ` jae {emit_label lbl}\n` (* branch taken if x>=y *) + else + ` jb {emit_label lbl}\n` (* taken if unordered or x=y) *) + +(* Deallocate the stack frame before a return or tail call *) + +let output_epilogue () = + if frame_required() then begin + let n = frame_size() - 8 in + ` addq ${emit_int n}, %rsp\n` + end + +(* Output the assembly code for an instruction *) + +(* Name of current function *) +let function_name = ref "" +(* Entry point for tail recursive calls *) +let tailrec_entry_point = ref 0 + +let float_constants = ref ([] : (int * string) list) + +let emit_instr fallthrough i = + match i.desc with + Lend -> () + | Lop(Imove | Ispill | Ireload) -> + let src = i.arg.(0) and dst = i.res.(0) in + if src.loc <> dst.loc then begin + if src.typ = Float then + ` movsd {emit_reg src}, {emit_reg dst}\n` + else + ` movq {emit_reg src}, {emit_reg dst}\n` + end + | Lop(Iconst_int n) -> + if n = 0n then begin + match i.res.(0).loc with + Reg n -> ` xorq {emit_reg i.res.(0)}, {emit_reg i.res.(0)}\n` + | _ -> ` movq $0, {emit_reg i.res.(0)}\n` + end else if n <= 0x7FFFFFFFn && n >= -0x80000000n then + ` movq ${emit_nativeint n}, {emit_reg i.res.(0)}\n` + else + ` movabsq ${emit_nativeint n}, {emit_reg i.res.(0)}\n` + | Lop(Iconst_float s) -> + begin match Int64.bits_of_float (float_of_string s) with + | 0x0000_0000_0000_0000L -> (* +0.0 *) + ` xorpd {emit_reg i.res.(0)}, {emit_reg i.res.(0)}\n` + | _ -> + let lbl = new_label() in + float_constants := (lbl, s) :: !float_constants; + ` movlpd {emit_label lbl}(%rip), {emit_reg i.res.(0)}\n` + end + | Lop(Iconst_symbol s) -> + ` {load_symbol_addr s}, {emit_reg i.res.(0)}\n` + | Lop(Icall_ind) -> + ` call *{emit_reg i.arg.(0)}\n`; + record_frame i.live i.dbg + | Lop(Icall_imm(s)) -> + ` {emit_call s}\n`; + record_frame i.live i.dbg + | Lop(Itailcall_ind) -> + output_epilogue(); + ` jmp *{emit_reg i.arg.(0)}\n` + | Lop(Itailcall_imm s) -> + if s = !function_name then + ` jmp {emit_label !tailrec_entry_point}\n` + else begin + output_epilogue(); + ` {emit_jump s}\n` + end + | Lop(Iextcall(s, alloc)) -> + if alloc then begin + ` {load_symbol_addr s}, %rax\n`; + ` {emit_call "caml_c_call"}\n`; + record_frame i.live i.dbg + end else begin + ` {emit_call s}\n` + end + | Lop(Istackoffset n) -> + if n < 0 + then ` addq ${emit_int(-n)}, %rsp\n` + else ` subq ${emit_int(n)}, %rsp\n`; + stack_offset := !stack_offset + n + | Lop(Iload(chunk, addr)) -> + let dest = i.res.(0) in + begin match chunk with + | Word -> + ` movq {emit_addressing addr i.arg 0}, {emit_reg dest}\n` + | Byte_unsigned -> + ` movzbq {emit_addressing addr i.arg 0}, {emit_reg dest}\n` + | Byte_signed -> + ` movsbq {emit_addressing addr i.arg 0}, {emit_reg dest}\n` + | Sixteen_unsigned -> + ` movzwq {emit_addressing addr i.arg 0}, {emit_reg dest}\n` + | Sixteen_signed -> + ` movswq {emit_addressing addr i.arg 0}, {emit_reg dest}\n` + | Thirtytwo_unsigned -> + ` movl {emit_addressing addr i.arg 0}, {emit_reg32 dest}\n` + | Thirtytwo_signed -> + ` movslq {emit_addressing addr i.arg 0}, {emit_reg dest}\n` + | Single -> + ` cvtss2sd {emit_addressing addr i.arg 0}, {emit_reg dest}\n` + | Double | Double_u -> + ` movlpd {emit_addressing addr i.arg 0}, {emit_reg dest}\n` + end + | Lop(Istore(chunk, addr)) -> + begin match chunk with + | Word -> + ` movq {emit_reg i.arg.(0)}, {emit_addressing addr i.arg 1}\n` + | Byte_unsigned | Byte_signed -> + ` movb {emit_reg8 i.arg.(0)}, {emit_addressing addr i.arg 1}\n` + | Sixteen_unsigned | Sixteen_signed -> + ` movw {emit_reg16 i.arg.(0)}, {emit_addressing addr i.arg 1}\n` + | Thirtytwo_signed | Thirtytwo_unsigned -> + ` movl {emit_reg32 i.arg.(0)}, {emit_addressing addr i.arg 1}\n` + | Single -> + ` cvtsd2ss {emit_reg i.arg.(0)}, %xmm15\n`; + ` movss %xmm15, {emit_addressing addr i.arg 1}\n` + | Double | Double_u -> + ` movlpd {emit_reg i.arg.(0)}, {emit_addressing addr i.arg 1}\n` + end + | Lop(Ialloc n) -> + if !fastcode_flag then begin + let lbl_redo = new_label() in + `{emit_label lbl_redo}: subq ${emit_int n}, %r15\n`; + if !Clflags.dlcode then begin + ` {load_symbol_addr "caml_young_limit"}, %rax\n`; + ` cmpq (%rax), %r15\n`; + end else + ` cmpq {emit_symbol "caml_young_limit"}(%rip), %r15\n`; + let lbl_call_gc = new_label() in + let lbl_frame = record_frame_label i.live Debuginfo.none in + ` jb {emit_label lbl_call_gc}\n`; + ` leaq 8(%r15), {emit_reg i.res.(0)}\n`; + call_gc_sites := + { gc_lbl = lbl_call_gc; + gc_return_lbl = lbl_redo; + gc_frame = lbl_frame } :: !call_gc_sites + end else begin + begin match n with + 16 -> ` {emit_call "caml_alloc1"}\n` + | 24 -> ` {emit_call "caml_alloc2"}\n` + | 32 -> ` {emit_call "caml_alloc3"}\n` + | _ -> ` movq ${emit_int n}, %rax\n`; + ` {emit_call "caml_allocN"}\n` + end; + `{record_frame i.live Debuginfo.none} leaq 8(%r15), {emit_reg i.res.(0)}\n` + end + | Lop(Iintop(Icomp cmp)) -> + ` cmpq {emit_reg i.arg.(1)}, {emit_reg i.arg.(0)}\n`; + let b = name_for_cond_branch cmp in + ` set{emit_string b} %al\n`; + ` movzbq %al, {emit_reg i.res.(0)}\n` + | Lop(Iintop_imm(Icomp cmp, n)) -> + ` cmpq ${emit_int n}, {emit_reg i.arg.(0)}\n`; + let b = name_for_cond_branch cmp in + ` set{emit_string b} %al\n`; + ` movzbq %al, {emit_reg i.res.(0)}\n` + | Lop(Iintop Icheckbound) -> + let lbl = bound_error_label i.dbg in + ` cmpq {emit_reg i.arg.(1)}, {emit_reg i.arg.(0)}\n`; + ` jbe {emit_label lbl}\n` + | Lop(Iintop_imm(Icheckbound, n)) -> + let lbl = bound_error_label i.dbg in + ` cmpq ${emit_int n}, {emit_reg i.arg.(0)}\n`; + ` jbe {emit_label lbl}\n` + | Lop(Iintop(Idiv | Imod)) -> + ` cqto\n`; + ` idivq {emit_reg i.arg.(1)}\n` + | Lop(Iintop(Ilsl | Ilsr | Iasr as op)) -> + (* We have i.arg.(0) = i.res.(0) and i.arg.(1) = %rcx *) + ` {emit_string(instr_for_intop op)} %cl, {emit_reg i.res.(0)}\n` + | Lop(Iintop op) -> + (* We have i.arg.(0) = i.res.(0) *) + ` {emit_string(instr_for_intop op)} {emit_reg i.arg.(1)}, {emit_reg i.res.(0)}\n` + | Lop(Iintop_imm(Iadd, n)) when i.arg.(0).loc <> i.res.(0).loc -> + ` leaq {emit_int n}({emit_reg i.arg.(0)}), {emit_reg i.res.(0)}\n` + | Lop(Iintop_imm(Iadd, 1) | Iintop_imm(Isub, -1)) -> + ` incq {emit_reg i.res.(0)}\n` + | Lop(Iintop_imm(Iadd, -1) | Iintop_imm(Isub, 1)) -> + ` decq {emit_reg i.res.(0)}\n` + | Lop(Iintop_imm(Idiv, n)) -> + (* Note: i.arg.(0) = i.res.(0) = rdx (cf. selection.ml) *) + let l = Misc.log2 n in + ` movq {emit_reg i.arg.(0)}, %rax\n`; + ` addq ${emit_int(n-1)}, {emit_reg i.arg.(0)}\n`; + ` testq %rax, %rax\n`; + ` cmovns %rax, {emit_reg i.arg.(0)}\n`; + ` sarq ${emit_int l}, {emit_reg i.res.(0)}\n` + | Lop(Iintop_imm(Imod, n)) -> + (* Note: i.arg.(0) = i.res.(0) = rdx (cf. selection.ml) *) + ` movq {emit_reg i.arg.(0)}, %rax\n`; + ` testq %rax, %rax\n`; + ` leaq {emit_int(n-1)}(%rax), %rax\n`; + ` cmovns {emit_reg i.arg.(0)}, %rax\n`; + ` andq ${emit_int (-n)}, %rax\n`; + ` subq %rax, {emit_reg i.res.(0)}\n` + | Lop(Iintop_imm(op, n)) -> + (* We have i.arg.(0) = i.res.(0) *) + ` {emit_string(instr_for_intop op)} ${emit_int n}, {emit_reg i.res.(0)}\n` + | Lop(Inegf) -> + ` xorpd {emit_symbol "caml_negf_mask"}(%rip), {emit_reg i.res.(0)}\n` + | Lop(Iabsf) -> + ` andpd {emit_symbol "caml_absf_mask"}(%rip), {emit_reg i.res.(0)}\n` + | Lop(Iaddf | Isubf | Imulf | Idivf as floatop) -> + ` {emit_string(instr_for_floatop floatop)} {emit_reg i.arg.(1)}, {emit_reg i.res.(0)}\n` + | Lop(Ifloatofint) -> + ` cvtsi2sdq {emit_reg i.arg.(0)}, {emit_reg i.res.(0)}\n` + | Lop(Iintoffloat) -> + ` cvttsd2siq {emit_reg i.arg.(0)}, {emit_reg i.res.(0)}\n` + | Lop(Ispecific(Ilea addr)) -> + ` leaq {emit_addressing addr i.arg 0}, {emit_reg i.res.(0)}\n` + | Lop(Ispecific(Istore_int(n, addr))) -> + ` movq ${emit_nativeint n}, {emit_addressing addr i.arg 0}\n` + | Lop(Ispecific(Istore_symbol(s, addr))) -> + assert (not !pic_code && not !Clflags.dlcode); + ` movq ${emit_symbol s}, {emit_addressing addr i.arg 0}\n` + | Lop(Ispecific(Ioffset_loc(n, addr))) -> + ` addq ${emit_int n}, {emit_addressing addr i.arg 0}\n` + | Lop(Ispecific(Ifloatarithmem(op, addr))) -> + ` {emit_string(instr_for_floatarithmem op)} {emit_addressing addr i.arg 1}, {emit_reg i.res.(0)}\n` + | Lreloadretaddr -> + () + | Lreturn -> + output_epilogue(); + ` ret\n` + | Llabel lbl -> + `{emit_Llabel fallthrough lbl}:\n` + | Lbranch lbl -> + ` jmp {emit_label lbl}\n` + | Lcondbranch(tst, lbl) -> + begin match tst with + Itruetest -> + output_test_zero i.arg.(0); + ` jne {emit_label lbl}\n` + | Ifalsetest -> + output_test_zero i.arg.(0); + ` je {emit_label lbl}\n` + | Iinttest cmp -> + ` cmpq {emit_reg i.arg.(1)}, {emit_reg i.arg.(0)}\n`; + let b = name_for_cond_branch cmp in + ` j{emit_string b} {emit_label lbl}\n` + | Iinttest_imm((Isigned Ceq | Isigned Cne | + Iunsigned Ceq | Iunsigned Cne) as cmp, 0) -> + output_test_zero i.arg.(0); + let b = name_for_cond_branch cmp in + ` j{emit_string b} {emit_label lbl}\n` + | Iinttest_imm(cmp, n) -> + ` cmpq ${emit_int n}, {emit_reg i.arg.(0)}\n`; + let b = name_for_cond_branch cmp in + ` j{emit_string b} {emit_label lbl}\n` + | Ifloattest(cmp, neg) -> + emit_float_test cmp neg i.arg lbl + | Ioddtest -> + ` testb $1, {emit_reg8 i.arg.(0)}\n`; + ` jne {emit_label lbl}\n` + | Ieventest -> + ` testb $1, {emit_reg8 i.arg.(0)}\n`; + ` je {emit_label lbl}\n` + end + | Lcondbranch3(lbl0, lbl1, lbl2) -> + ` cmpq $1, {emit_reg i.arg.(0)}\n`; + begin match lbl0 with + None -> () + | Some lbl -> ` jb {emit_label lbl}\n` + end; + begin match lbl1 with + None -> () + | Some lbl -> ` je {emit_label lbl}\n` + end; + begin match lbl2 with + None -> () + | Some lbl -> ` jg {emit_label lbl}\n` + end + | Lswitch jumptbl -> + let lbl = new_label() in + (* rax and rdx are clobbered by the Lswitch, + meaning that no variable that is live across the Lswitch + is assigned to rax or rdx. However, the argument to Lswitch + can still be assigned to one of these two registers, so + we must be careful not to clobber it before use. *) + let (tmp1, tmp2) = + if i.arg.(0).loc = Reg 0 (* rax *) + then (phys_reg 4 (*rdx*), phys_reg 0 (*rax*)) + else (phys_reg 0 (*rax*), phys_reg 4 (*rdx*)) in + ` leaq {emit_label lbl}(%rip), {emit_reg tmp1}\n`; + ` movslq ({emit_reg tmp1}, {emit_reg i.arg.(0)}, 4), {emit_reg tmp2}\n`; + ` addq {emit_reg tmp2}, {emit_reg tmp1}\n`; + ` jmp *{emit_reg tmp1}\n`; + if macosx + then ` .const\n` + else ` .section .rodata\n`; + emit_align 4; + `{emit_label lbl}:`; + for i = 0 to Array.length jumptbl - 1 do + ` .long {emit_label jumptbl.(i)} - {emit_label lbl}\n` + done; + ` .text\n` + | Lsetuptrap lbl -> + ` call {emit_label lbl}\n` + | Lpushtrap -> + ` pushq %r14\n`; + ` movq %rsp, %r14\n`; + stack_offset := !stack_offset + 16 + | Lpoptrap -> + ` popq %r14\n`; + ` addq $8, %rsp\n`; + stack_offset := !stack_offset - 16 + | Lraise -> + if !Clflags.debug then begin + ` {emit_call "caml_raise_exn"}\n`; + record_frame Reg.Set.empty i.dbg + end else begin + ` movq %r14, %rsp\n`; + ` popq %r14\n`; + ` ret\n` + end + +let rec emit_all fallthrough i = + match i.desc with + | Lend -> () + | _ -> + emit_instr fallthrough i; + emit_all (Linearize.has_fallthrough i.desc) i.next + +(* Emission of the floating-point constants *) + +let emit_float_constant (lbl, cst) = + `{emit_label lbl}:`; + emit_float64_directive ".quad" cst + +(* Emission of the profiling prelude *) + +let emit_profile () = + match Config.system with + | "linux" | "gnu" -> + (* mcount preserves rax, rcx, rdx, rsi, rdi, r8, r9 explicitly + and rbx, rbp, r12-r15 like all C functions. + We need to preserve r10 and r11 ourselves, since Caml can + use them for argument passing. *) + ` pushq %r10\n`; + ` movq %rsp, %rbp\n`; + ` pushq %r11\n`; + ` {emit_call "mcount"}\n`; + ` popq %r11\n`; + ` popq %r10\n` + | _ -> + () (*unsupported yet*) + +(* Emission of a function declaration *) + +let fundecl fundecl = + function_name := fundecl.fun_name; + fastcode_flag := fundecl.fun_fast; + tailrec_entry_point := new_label(); + stack_offset := 0; + float_constants := []; + call_gc_sites := []; + bound_error_sites := []; + bound_error_call := 0; + ` .text\n`; + emit_align 16; + if macosx + && not !Clflags.output_c_object + && is_generic_function fundecl.fun_name + then (* PR#4690 *) + ` .private_extern {emit_symbol fundecl.fun_name}\n` + else + ` .globl {emit_symbol fundecl.fun_name}\n`; + `{emit_symbol fundecl.fun_name}:\n`; + if !Clflags.gprofile then emit_profile(); + if frame_required() then begin + let n = frame_size() - 8 in + ` subq ${emit_int n}, %rsp\n` + end; + `{emit_label !tailrec_entry_point}:\n`; + emit_all true fundecl.fun_body; + List.iter emit_call_gc !call_gc_sites; + emit_call_bound_errors (); + if !float_constants <> [] then begin + if macosx + then ` .literal8\n` + else ` .section .rodata.cst8,\"a\"\n`; + List.iter emit_float_constant !float_constants + end; + match Config.system with + "linux" | "gnu" -> + ` .size {emit_symbol fundecl.fun_name},.-{emit_symbol fundecl.fun_name}\n` + | _ -> () + +(* Emission of data *) + +let emit_item = function + Cglobal_symbol s -> + ` .globl {emit_symbol s}\n`; + | Cdefine_symbol s -> + `{emit_symbol s}:\n` + | Cdefine_label lbl -> + `{emit_label (100000 + lbl)}:\n` + | Cint8 n -> + ` .byte {emit_int n}\n` + | Cint16 n -> + ` .word {emit_int n}\n` + | Cint32 n -> + ` .long {emit_nativeint n}\n` + | Cint n -> + ` .quad {emit_nativeint n}\n` + | Csingle f -> + emit_float32_directive ".long" f + | Cdouble f -> + emit_float64_directive ".quad" f + | Csymbol_address s -> + ` .quad {emit_symbol s}\n` + | Clabel_address lbl -> + ` .quad {emit_label (100000 + lbl)}\n` + | Cstring s -> + emit_string_directive " .ascii " s + | Cskip n -> + if n > 0 then ` .space {emit_int n}\n` + | Calign n -> + emit_align n + +let data l = + ` .data\n`; + List.iter emit_item l + +(* Beginning / end of an assembly file *) + +let begin_assembly() = + if !Clflags.dlcode then begin + (* from amd64.S; could emit these constants on demand *) + if macosx then + ` .literal16\n` + else + ` .section .rodata.cst8,\"a\"\n`; + emit_align 16; + `{emit_symbol "caml_negf_mask"}: .quad 0x8000000000000000, 0\n`; + emit_align 16; + `{emit_symbol "caml_absf_mask"}: .quad 0x7FFFFFFFFFFFFFFF, 0xFFFFFFFFFFFFFFFF\n` + end; + let lbl_begin = Compilenv.make_symbol (Some "data_begin") in + ` .data\n`; + ` .globl {emit_symbol lbl_begin}\n`; + `{emit_symbol lbl_begin}:\n`; + let lbl_begin = Compilenv.make_symbol (Some "code_begin") in + ` .text\n`; + ` .globl {emit_symbol lbl_begin}\n`; + `{emit_symbol lbl_begin}:\n`; + if macosx then ` nop\n` (* PR#4690 *) + +let end_assembly() = + let lbl_end = Compilenv.make_symbol (Some "code_end") in + ` .text\n`; + if macosx then ` nop\n`; (* suppress "ld warning: atom sorting error" *) + ` .globl {emit_symbol lbl_end}\n`; + `{emit_symbol lbl_end}:\n`; + ` .data\n`; + let lbl_end = Compilenv.make_symbol (Some "data_end") in + ` .globl {emit_symbol lbl_end}\n`; + `{emit_symbol lbl_end}:\n`; + ` .long 0\n`; + let lbl = Compilenv.make_symbol (Some "frametable") in + ` .globl {emit_symbol lbl}\n`; + `{emit_symbol lbl}:\n`; + emit_frames + { efa_label = (fun l -> ` .quad {emit_label l}\n`); + efa_16 = (fun n -> ` .word {emit_int n}\n`); + efa_32 = (fun n -> ` .long {emit_int32 n}\n`); + efa_word = (fun n -> ` .quad {emit_int n}\n`); + efa_align = emit_align; + efa_label_rel = + if macosx then begin + let setcnt = ref 0 in + fun lbl ofs -> + incr setcnt; + ` .set L$set${emit_int !setcnt}, ({emit_label lbl} - .) + {emit_int32 ofs}\n`; + ` .long L$set${emit_int !setcnt}\n` + end else begin + fun lbl ofs -> + ` .long ({emit_label lbl} - .) + {emit_int32 ofs}\n` + end; + efa_def_label = (fun l -> `{emit_label l}:\n`); + efa_string = (fun s -> emit_string_directive " .asciz " s) }; + if Config.system = "linux" then + (* Mark stack as non-executable, PR#4564 *) + ` .section .note.GNU-stack,\"\"\n` diff --git a/ocaml/3.12/asmcomp/amd64/proc_mingw64.ml b/ocaml/3.12/asmcomp/amd64/proc_mingw64.ml new file mode 100644 index 0000000..338e169 --- /dev/null +++ b/ocaml/3.12/asmcomp/amd64/proc_mingw64.ml @@ -0,0 +1,233 @@ +(***********************************************************************) +(* *) +(* Objective Caml *) +(* *) +(* Xavier Leroy, projet Cristal, INRIA Rocquencourt *) +(* *) +(* Copyright 2000 Institut National de Recherche en Informatique et *) +(* en Automatique. All rights reserved. This file is distributed *) +(* under the terms of the Q Public License version 1.0. *) +(* *) +(***********************************************************************) + +(* $Id: proc.ml 9547 2010-01-22 12:48:24Z doligez $ *) + +(* Description of the AMD64 processor with Win64 conventions *) + +open Misc +open Arch +open Cmm +open Reg +open Mach + +(* Registers available for register allocation *) + +(* Register map: + rax 0 rax - r11: Caml function arguments + rbx 1 rcx - r9: C function arguments + rdi 2 rax: Caml and C function results + rsi 3 rbx, rbp, rsi, rdi r12-r15 are preserved by C + rdx 4 + rcx 5 + r8 6 + r9 7 + r10 8 + r11 9 + rbp 10 + r12 11 + r13 12 + r14 trap pointer + r15 allocation pointer + + xmm0 - xmm15 100 - 115 xmm0 - xmm9: Caml function arguments + xmm0 - xmm3: C function arguments + xmm0: Caml and C function results + xmm6-xmm15 are preserved by C *) + +let int_reg_name = + [| "%rax"; "%rbx"; "%rdi"; "%rsi"; "%rdx"; "%rcx"; "%r8"; "%r9"; + "%r10"; "%r11"; "%rbp"; "%r12"; "%r13" |] + +let float_reg_name = + [| "%xmm0"; "%xmm1"; "%xmm2"; "%xmm3"; "%xmm4"; "%xmm5"; "%xmm6"; "%xmm7"; + "%xmm8"; "%xmm9"; "%xmm10"; "%xmm11"; + "%xmm12"; "%xmm13"; "%xmm14"; "%xmm15" |] + +let num_register_classes = 2 + +let register_class r = + match r.typ with + Int -> 0 + | Addr -> 0 + | Float -> 1 + +let num_available_registers = [| 13; 16 |] + +let first_available_register = [| 0; 100 |] + +let register_name r = + if r < 100 then int_reg_name.(r) else float_reg_name.(r - 100) + +(* Pack registers starting at %rax so as to reduce the number of REX + prefixes and thus improve code density *) +let rotate_registers = false + +(* Representation of hard registers by pseudo-registers *) + +let hard_int_reg = + let v = Array.create 13 Reg.dummy in + for i = 0 to 12 do v.(i) <- Reg.at_location Int (Reg i) done; + v + +let hard_float_reg = + let v = Array.create 16 Reg.dummy in + for i = 0 to 15 do v.(i) <- Reg.at_location Float (Reg (100 + i)) done; + v + +let all_phys_regs = + Array.append hard_int_reg hard_float_reg + +let phys_reg n = + if n < 100 then hard_int_reg.(n) else hard_float_reg.(n - 100) + +let rax = phys_reg 0 +let rcx = phys_reg 5 +let rdx = phys_reg 4 +let r11 = phys_reg 9 +let rxmm15 = phys_reg 115 + +let stack_slot slot ty = + Reg.at_location ty (Stack slot) + +(* Instruction selection *) + +let word_addressed = false + +(* Calling conventions *) + +let calling_conventions first_int last_int first_float last_float make_stack + arg = + let loc = Array.create (Array.length arg) Reg.dummy in + let int = ref first_int in + let float = ref first_float in + let ofs = ref 0 in + for i = 0 to Array.length arg - 1 do + match arg.(i).typ with + Int | Addr as ty -> + if !int <= last_int then begin + loc.(i) <- phys_reg !int; + incr int + end else begin + loc.(i) <- stack_slot (make_stack !ofs) ty; + ofs := !ofs + size_int + end + | Float -> + if !float <= last_float then begin + loc.(i) <- phys_reg !float; + incr float + end else begin + loc.(i) <- stack_slot (make_stack !ofs) Float; + ofs := !ofs + size_float + end + done; + (loc, Misc.align !ofs 16) (* keep stack 16-aligned *) + +let incoming ofs = Incoming ofs +let outgoing ofs = Outgoing ofs +let not_supported ofs = fatal_error "Proc.loc_results: cannot call" + +let loc_arguments arg = + calling_conventions 0 9 100 109 outgoing arg +let loc_parameters arg = + let (loc, ofs) = calling_conventions 0 9 100 109 incoming arg in loc +let loc_results res = + let (loc, ofs) = calling_conventions 0 0 100 100 not_supported res in loc + +(* C calling conventions (Win64): + first integer args in rcx, rdx, r8, r9 (4 - 7) + first float args in xmm0 ... xmm3 (100 - 103) + each integer arg consumes a float reg, and conversely + remaining args on stack + always 32 bytes reserved at bottom of stack. + Return value in rax or xmm0 +*) + +let loc_external_results res = + let (loc, ofs) = calling_conventions 0 0 100 100 not_supported res in loc + +let int_external_arguments = + [| 5 (*rcx*); 4 (*rdx*); 6 (*r8*); 7 (*r9*) |] +let float_external_arguments = + [| 100 (*xmm0*); 101 (*xmm1*); 102 (*xmm2*); 103 (*xmm3*) |] + +let loc_external_arguments arg = + let loc = Array.create (Array.length arg) Reg.dummy in + let reg = ref 0 + and ofs = ref 32 in + for i = 0 to Array.length arg - 1 do + match arg.(i).typ with + Int | Addr as ty -> + if !reg < 4 then begin + loc.(i) <- phys_reg int_external_arguments.(!reg); + incr reg + end else begin + loc.(i) <- stack_slot (Outgoing !ofs) ty; + ofs := !ofs + size_int + end + | Float -> + if !reg < 4 then begin + loc.(i) <- phys_reg float_external_arguments.(!reg); + incr reg + end else begin + loc.(i) <- stack_slot (Outgoing !ofs) Float; + ofs := !ofs + size_float + end + done; + (loc, Misc.align !ofs 16) (* keep stack 16-aligned *) + +let loc_exn_bucket = rax + +(* Registers destroyed by operations *) + +let destroyed_at_c_call = + (* Win64: rbx, rbp, rsi, rdi, r12-r15, xmm6-xmm15 preserved *) + Array.of_list(List.map phys_reg + [0;4;5;6;7;8;9; + 100;101;102;103;104;105]) + +let destroyed_at_oper = function + Iop(Icall_ind | Icall_imm _ | Iextcall(_, true)) -> all_phys_regs + | Iop(Iextcall(_, false)) -> destroyed_at_c_call + | Iop(Iintop(Idiv | Imod)) -> [| rax; rdx |] + | Iop(Istore(Single, _)) -> [| rxmm15 |] + | Iop(Ialloc _ | Iintop(Icomp _) | Iintop_imm((Idiv|Imod|Icomp _), _)) + -> [| rax |] + | Iswitch(_, _) when !pic_code -> [| r11 |] + | _ -> [||] + +let destroyed_at_raise = all_phys_regs + +(* Maximal register pressure *) + +let safe_register_pressure = function + Iextcall(_,_) -> 8 + | _ -> 11 + +let max_register_pressure = function + Iextcall(_, _) -> [| 8; 10 |] + | Iintop(Idiv | Imod) -> [| 11; 16 |] + | Ialloc _ | Iintop(Icomp _) | Iintop_imm((Idiv|Imod|Icomp _), _) + -> [| 12; 16 |] + | Istore(Single, _) -> [| 13; 15 |] + | _ -> [| 13; 16 |] + +(* Layout of the stack frame *) + +let num_stack_slots = [| 0; 0 |] +let contains_calls = ref false + +(* Calling the assembler *) + +let assemble_file infile outfile = + Ccomp.command (Config.asm ^ " -o " ^ + Filename.quote outfile ^ " " ^ Filename.quote infile) diff --git a/ocaml/3.12/build.sh b/ocaml/3.12/build.sh new file mode 100755 index 0000000..a26083e --- /dev/null +++ b/ocaml/3.12/build.sh @@ -0,0 +1,15 @@ +# fix settings +cp config/m-nt.h config/m.h || exit 1 +cp config/s-nt.h config/s.h || exit 1 +cp config/Makefile.mingw64 config/Makefile || exit 1 + +# fix asmcomp +cp asmcomp/amd64/emit_mingw64.mlp asmcomp/amd64/emit.mlp || exit 1 +cp asmcomp/amd64/proc_mingw64.ml asmcomp/amd64/proc.ml || exit 1 + +# run make +gmake -f Makefile.nt world || exit 1 +gmake -f Makefile.nt bootstrap || exit 1 +gmake -f Makefile.nt opt || exit 1 +gmake -f Makefile.nt opt.opt || exit 1 +gmake -f Makefile.nt install || exit 1 \ No newline at end of file diff --git a/ocaml/3.12/config/Makefile.mingw64 b/ocaml/3.12/config/Makefile.mingw64 new file mode 100644 index 0000000..cf17d41 --- /dev/null +++ b/ocaml/3.12/config/Makefile.mingw64 @@ -0,0 +1,158 @@ +######################################################################### +# # +# Objective Caml # +# # +# Xavier Leroy, projet Cristal, INRIA Rocquencourt # +# # +# Copyright 1999 Institut National de Recherche en Informatique et # +# en Automatique. All rights reserved. This file is distributed # +# under the terms of the GNU Library General Public License, with # +# the special exception on linking described in file ../LICENSE. # +# # +######################################################################### + +# $Id: Makefile.mingw 10461 2010-05-25 10:00:39Z frisch $ + +# Configuration for Windows, Mingw compiler + +######### General configuration + +PREFIX=C:/ocamlmgw + +### Where to install the binaries +BINDIR=$(PREFIX)/bin + +### Where to install the standard library +LIBDIR=$(PREFIX)/lib + +### Where to install the stub DLLs +STUBLIBDIR=$(LIBDIR)/stublibs + +### Where to install the info files +DISTRIB=$(PREFIX) + +### Where to install the man pages +MANDIR=$(PREFIX)/man + +########## Toolchain and OS dependencies + +TOOLCHAIN=mingw +CCOMPTYPE=cc +O=o +A=a +S=s +SO=s.o +DO=d.o +EXE=.exe +EXT_DLL=.dll +EXT_OBJ=.$(O) +EXT_LIB=.$(A) +EXT_ASM=.$(S) +MANEXT=1 +SHARPBANGSCRIPTS=false +PTHREAD_LINK= +X11_INCLUDES= +X11_LINK= +DBM_INCLUDES= +DBM_LINK= +BYTECCRPATH= +SUPPORTS_SHARED_LIBRARIES=true +SHAREDCCCOMPOPTS= +MKSHAREDLIBRPATH= +NATIVECCPROFOPTS= +NATIVECCRPATH= +ASM=as +ASPP=gcc +ASPPPROFFLAGS= +PROFILING=noprof +DYNLINKOPTS= +DEBUGGER=ocamldebugger +CC_PROFILE= +SYSTHREAD_SUPPORT=true +EXTRALIBS= +NATDYNLINK=true +CMXS=cmxs + +########## Configuration for the bytecode compiler + +### Which C compiler to use for the bytecode interpreter. +BYTECC=gcc -mno-cygwin + +### Additional compile-time options for $(BYTECC). (For static linking.) +BYTECCCOMPOPTS=-O -mms-bitfields -Wall -Wno-unused + +### Additional link-time options for $(BYTECC). (For static linking.) +BYTECCLINKOPTS= + +### Additional compile-time options for $(BYTECC). (For building a DLL.) +DLLCCCOMPOPTS=-O -mms-bitfields -Wall -Wno-unused -DCAML_DLL + +### Libraries needed +BYTECCLIBS=-lws2_32 +NATIVECCLIBS=-lws2_32 + +### How to invoke the C preprocessor +CPP=$(BYTECC) -E + +### Flexlink +FLEXLINK=flexlink -chain mingw64 -LC:/mingw64/x86_64-w64-mingw32/lib +FLEXDIR=$(shell $(FLEXLINK) -where) +IFLEXDIR=-I"$(FLEXDIR)" +MKDLL=$(FLEXLINK) +MKEXE=$(FLEXLINK) -exe +MKMAINDLL=$(FLEXLINK) -maindll + +### How to build a static library +MKLIB=rm -f $(1); ar rcs $(1) $(2) +#ml let mklib out files opts = Printf.sprintf "rm -f %s && ar rcs %s %s %s" out opts out files;; + +### Canonicalize the name of a system library +SYSLIB=-l$(1) +#ml let syslib x = "-l"^x;; + +### The ranlib command +RANLIB=ranlib +RANLIBCMD=ranlib + +############# Configuration for the native-code compiler + +### Name of architecture for the native-code compiler +ARCH=amd64 + +### Name of architecture model for the native-code compiler. +MODEL=default + +### Name of operating system family for the native-code compiler. +SYSTEM=mingw64 + +### Which C compiler to use for the native-code compiler. +NATIVECC=$(BYTECC) + +### Additional compile-time options for $(NATIVECC). +NATIVECCCOMPOPTS=-O -mms-bitfields -Wall -Wno-unused + +### Additional link-time options for $(NATIVECC) +NATIVECCLINKOPTS= + +### Build partially-linked object file +PACKLD=ld -r $(NATIVECCLINKOPTS) -o #there must be a space after this '-o' + +############# Configuration for the contributed libraries + +OTHERLIBRARIES=win32unix str num win32graph dynlink bigarray systhreads + +### Name of the target architecture for the "num" library +BNG_ARCH=generic +BNG_ASM_LEVEL=0 + +### Configuration for LablTk +# Set TK_ROOT to the directory where you installed TCL/TK 8.5 +# There must be no spaces or special characters in $(TK_ROOT) +TK_ROOT=c:/tcl +TK_DEFS=-I$(TK_ROOT)/include +TK_LINK=$(TK_ROOT)/bin/tk85.dll $(TK_ROOT)/bin/tcl85.dll -lws2_32 + +############# Aliases for common commands + +MAKEREC=$(MAKE) -f Makefile.nt +MAKECMD=$(MAKE) --Boundary-00=_UtL7MMnFcfdZwFn--