On Oct 12, 2007, at 09:17, Christoph Sieghart wrote:

> Is there any documentation for adding a new architecture to
> ocamlopt? I would like to do a crosscompiler from one of the
> existing architectures to an embedded microcontroller.
>
> I have searched the mailinglist archives and the documentation, but
> have not found anything. Any pointers are welcome? Is my assumption
> that the major codegeneration work is done by the code in
> $caml/asmcomp?

Christoph,

Yes, asmcomp contains both the middle-end and the back-end code
generators. Note that the architecture-specific features are injected
by configure creating various symlinks of the form
asmcomp/.ml -> asmcomp//.ml.

On one hand, this means you should be able to clone the contents of
one of the asmcomp/ subdirectories and get your project off to a start
pretty quickly. On the other, ocamlopt is not a cross-compiler, so you
may have a bit of a challenge just getting the paths to the cross
tools into the right places without breaking ocamlc.

I'm sure you'll get more detailed pointers, but here's a quick
overview...

ocamlc and ocamlopt share code through the "Lambda" representation
(bytecomp/lambda.mli). After this point, ocamlopt transfers control
into asmcomp/asmgen.ml, which has a fairly straightforward pass
pipeline in Asmgen.compile_implementation.

The Lambda representation is first translated into Closed Lambda
(asmcomp/clambda.mli), which is similar except that closures are
explicit.

Next, ocamlopt transforms Clambda into its middle-end representation,
C--. This form is somewhat well documented at http://cminusminus.org/
and in various academic papers.

The C-- representation is architecture-neutral in form, but not
content. Target dependencies are injected through the Arch module,
which specifies address sizes, endianness, etc. This is the point
where displacement calculations are performed, etc.
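To make "architecture-neutral in form, but not content" a little more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the ARCH signature, the Target32/Target64 modules and the Displacement functor are hypothetical names, not the compiler's actual interfaces. ocamlopt picks its Arch module through the configure-time symlinks described above; the functor below only stands in for that selection so both targets can be shown side by side.

```ocaml
(* Hypothetical, cut-down view of what a target description provides. *)
module type ARCH = sig
  val size_addr : int     (* bytes per machine address *)
  val big_endian : bool   (* byte order of the target *)
end

(* Two made-up targets that differ only in address size. *)
module Target32 : ARCH = struct
  let size_addr = 4
  let big_endian = false
end

module Target64 : ARCH = struct
  let size_addr = 8
  let big_endian = false
end

(* Same code for every target (neutral in form), but the byte offset it
   computes depends on the target (specific in content). *)
module Displacement (A : ARCH) = struct
  (* Offset in bytes of the n-th word-sized field of a block. *)
  let field_offset n = n * A.size_addr
end

module D32 = Displacement (Target32)
module D64 = Displacement (Target64)

let () =
  Printf.printf "field 3 is at byte %d (32-bit) or byte %d (64-bit)\n"
    (D32.field_offset 3) (D64.field_offset 3)
```

The point of the sketch is only that the displacement is already baked in by the time the code reaches the back end, so a new target has to supply these constants before anything downstream can work.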
The C-- representation is the input to the architecture-specific
back-end code generators, which are driven by the architecture-neutral
Asmgen.compile_phrase and Asmgen.compile_fundecl. In particular, this
pipeline is pleasantly self-documenting:

    let (++) x f = f x

    let compile_fundecl (ppf : formatter) fd_cmm =
      Reg.reset();
      fd_cmm  (* <-- The C-- representation for the function *)
      ++ Selection.fundecl
      ++ pass_dump_if ppf dump_selection "After instruction selection"
      ++ Comballoc.fundecl
      ++ pass_dump_if ppf dump_combine "After allocation combining"
      ++ liveness ppf
      ++ pass_dump_if ppf dump_live "Liveness analysis"
      ++ Spill.fundecl
      ++ liveness ppf
      ++ pass_dump_if ppf dump_spill "After spilling"
      ++ Split.fundecl
      ++ pass_dump_if ppf dump_split "After live range splitting"
      ++ liveness ppf
      ++ regalloc ppf 1
      ++ Linearize.fundecl
      ++ pass_dump_linear_if ppf dump_linear "Linearized code"
      ++ Scheduling.fundecl
      ++ pass_dump_linear_if ppf dump_scheduling "After instruction scheduling"
      ++ Emit.fundecl

You can identify the target-dependent phases by correlating the passes
with the contents of a target subdirectory.

Have fun!

-- Gordon
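If the (++) idiom is unfamiliar, the following toy sketch may help when reading the pipeline. Only the definition of (++) is taken from the code above. The passes (select, spill, emit) and the dump_if helper are hypothetical stand-ins that work on plain integer lists; they are assumptions for illustration and do not reflect what the real passes do.

```ocaml
(* Reverse application, exactly as defined in asmgen.ml above:
   x ++ f ++ g means g (f x), so passes read top to bottom. *)
let (++) x f = f x

(* Hypothetical stand-in for pass_dump_if: optionally record the
   intermediate value, then return it unchanged so the chain continues. *)
let dump_if enabled log label v =
  if enabled then log := (label, v) :: !log;
  v

(* Toy "passes" over an int list; the real ones transform code. *)
let select = List.map (fun x -> x * 2)
let spill = List.filter (fun x -> x > 2)
let emit = List.fold_left (+) 0

let compile log input =
  input
  ++ select
  ++ dump_if true log "after select"
  ++ spill
  ++ dump_if false log "after spill"   (* disabled: passes the value through *)
  ++ emit

let () =
  let log = ref [] in
  let result = compile log [1; 2; 3] in
  Printf.printf "result = %d, dumps recorded = %d\n"
    result (List.length !log)
```

Because every dump step returns its input untouched, a dump can be switched on or off without changing what the next pass receives, which is why the real pipeline can interleave them so freely.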