caml-list - the Caml user's mailing list
* Re: VLIW & caml: how?
@ 1998-08-29 17:36 Christopher Oliver
  0 siblings, 0 replies; 5+ messages in thread
From: Christopher Oliver @ 1998-08-29 17:36 UTC
  To: caml-list

Todd Lewis writes:
> I've been reading that VLIW as implemented on the IA-64/Merced will pose
> problems for conventional compilers such as gcc which don't have a very
> expansive view of the code they're compiling.  How well will o'caml deal
> with optimizing for this sort of architecture?  Any thoughts?

Well... since no one in the free software community knows much
about Merced, and Intel is not talking, this chip isn't yet real
as far as authors of compiler back ends for free languages are
concerned.  I think we should make sure Intel doesn't play
Appendix H games again before wasting much time speculating on
how any compiler handles optimization on this architecture.

-- 
Christopher Oliver                     Traverse Internet
Systems Coordinator                    223 Grandview Pkwy, Suite 108
oliver@traverse.net                    Traverse City, Michigan, 49684
let magic f = fun x -> x and more_magic n f = fun x -> f ((n f) x);;






* Re: VLIW & caml: how?
  1998-09-02 17:23 ` Xavier Leroy
@ 1998-09-03 19:29   ` Joel Jones
  0 siblings, 0 replies; 5+ messages in thread
From: Joel Jones @ 1998-09-03 19:29 UTC
  To: caml-list

One thing to look at is the work done by the CAR group at HP Labs.  They
have a collaboration with two university research groups, one at UIUC and
another at NYU.  For more information, see:

   http://www.trimaran.org/

The upcoming Merced is a big break from current ILP microarchitectures, and
even from older VLIW designs.  Xavier Leroy is correct in asserting that
compilers have to be almost completely rethought to take advantage of ILP
designs with lots of parallelism.

Joel Jones
jjones@uiuc.edu








* Re: VLIW & caml: how?
  1998-08-28  5:18 Todd Graham Lewis
  1998-09-01 11:33 ` Ping Hu
@ 1998-09-02 17:23 ` Xavier Leroy
  1998-09-03 19:29   ` Joel Jones
  1 sibling, 1 reply; 5+ messages in thread
From: Xavier Leroy @ 1998-09-02 17:23 UTC
  To: Todd Graham Lewis, caml-list

> I've been reading that VLIW as implemented on the IA-64/Merced will pose
> problems for conventional compilers such as gcc which don't have a very
> expansive view of the code they're compiling.  How well will o'caml deal
> with optimizing for this sort of architecture?  Any thoughts?

It's hard to say anything precise until Intel releases detailed
documentation on the IA64 instruction set.

If your question is about instruction-level parallelism (ILP) in
general, it must be noted that today's superscalar architectures (such
as the Alpha 21264 and the PowerPC 604) already offer more parallelism
(i.e. 4 instructions issued per cycle) than can be exploited by most
compiled programs.  This is due in part to insufficient optimizations in
compilers (extracting ILP from sequential code might require
significant program transformations) and in part to the fact that many
programs simply do not contain enough parallelism by nature of the
algorithms used.  Often, the only way to exploit fully the resources
of those superscalar processors is to write carefully tuned assembly
code by hand...
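
To make the point about parallelism "by nature of the algorithms"
concrete, here is a minimal OCaml sketch.  The function names are
illustrative assumptions, and nothing here claims that ocamlopt or any
particular processor actually overlaps these operations; it only shows
the difference in data dependencies that a scheduler has to work with.

```ocaml
(* One accumulator: every addition needs the result of the previous
   one, so the additions form a single serial dependency chain. *)
let sum_serial (a : int array) : int =
  let acc = ref 0 in
  for i = 0 to Array.length a - 1 do
    acc := !acc + a.(i)
  done;
  !acc

(* Two accumulators: the additions into [even] do not depend on the
   additions into [odd], so the two chains are independent and could,
   in principle, be issued in the same cycle. *)
let sum_two_chains (a : int array) : int =
  let n = Array.length a in
  let even = ref 0 and odd = ref 0 in
  let i = ref 0 in
  while !i + 1 < n do
    even := !even + a.(!i);
    odd := !odd + a.(!i + 1);
    i := !i + 2
  done;
  (* Handle the last element when the length is odd. *)
  if !i < n then (even := !even + a.(!i));
  !even + !odd
```

Both functions compute the same sum; only the shape of the dependency
graph differs.  Rewriting the first into the second is the kind of
program transformation that is needed to expose ILP.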

Code generated by ocamlopt has characteristics similar to the
so-called "commercial workload" subset of Spec95, i.e. high number
of memory accesses, low to medium ILP, and relatively low CPI.  This
is not surprising, as hardware manufacturers generally increase ILP by
throwing in more integer and floating-point ALUs, which are not useful for
most Caml applications, but don't increase the number of load-store
units, which would be good for Caml but is very hard to implement in
hardware.

However, there is some hope that the clean semantics of Caml might
allow more aggressive scheduling of memory accesses than is possible
with e.g. C programs.  In particular, the type system gives a lot of
non-aliasing properties "for free" (e.g. a load from an immutable data
structure cannot interfere with a non-initializing store).  See my
PLDI'98 tutorial for more details (http://pauillac.inria.fr/~xleroy/).
But again, this can be useful only if the hardware supports many
pending memory accesses simultaneously.
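
As a small illustration of the non-aliasing property, consider the
following sketch.  The types and the function name are assumptions
made up for this example (they are not taken from the tutorial), and
the comments describe what a compiler would be allowed to do, not what
ocamlopt is known to do.

```ocaml
(* All fields of [point] are immutable; [counter] has a mutable field. *)
type point = { x : int; y : int }
type counter = { mutable hits : int }

let bump_and_read (p : point) (c : counter) : int =
  let before = p.x in        (* load from an immutable field *)
  c.hits <- c.hits + 1;      (* non-initializing store to a mutable field *)
  let after = p.x in         (* must equal [before]: the store above
                                cannot have modified [p.x] *)
  before + after
```

Because [p.x] can never be overwritten after [p] is built, the second
load may be moved above the store or merged with the first one.  In
the analogous C function taking two pointers, the compiler would first
have to prove that the pointers do not alias.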

All in all, I'm not expecting much speedup from ILP.  The important
speedups we've observed on Caml programs when moving from older
architectures (e.g. the Alpha 21064) to newer ones (e.g. the Alpha
21164 or PowerPC G3) are due to better caches and faster memory
subsystems much more than to increased on-chip parallelism.

- Xavier Leroy






* Re: VLIW & caml: how?
  1998-08-28  5:18 Todd Graham Lewis
@ 1998-09-01 11:33 ` Ping Hu
  1998-09-02 17:23 ` Xavier Leroy
  1 sibling, 0 replies; 5+ messages in thread
From: Ping Hu @ 1998-09-01 11:33 UTC
  To: Todd Graham Lewis, caml-list

Todd Graham Lewis wrote:
> 
> I've been reading that VLIW as implemented on the IA-64/Merced will pose
> problems for conventional compilers such as gcc which don't have a very
> expansive view of the code they're compiling.  How well will o'caml deal
> with optimizing for this sort of architecture?  Any thoughts?
> 


If you can describe the IA-64/Merced at the assembly-language and
hardware level, for example

-- the lexical and syntactic structure of the assembly language used,
-- the hardware resources (registers, memories, functional units, etc.),
...

in the SALTO environment (a retargetable System for Assembly Language
Transformation and Optimization,
http://www.irisa.fr/caps/PROJECTS/Salto/), which already provides
description examples for several realistic architectures (Sparc,
TM1000 (VLIW), etc.),

then compiler back ends can use the local and global optimizations
(even software pipelining) provided in SALTO.  Of course, compilers
can also implement their own optimization algorithms with the support
of SALTO.
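
To give a rough idea of what a description of hardware resources can
look like, here is a hypothetical sketch in OCaml.  It is not SALTO's
actual description format; the type names, the "toy-vliw" machine and
its unit counts are all invented for illustration.

```ocaml
(* Hypothetical kinds of functional unit. *)
type unit_kind = Alu | Fpu | LoadStore

(* Hypothetical machine description: a name, a register count, and how
   many functional units of each kind exist. *)
type machine = {
  name : string;
  registers : int;
  units : (unit_kind * int) list;
}

(* Number of operations in [ops] that need a unit of the given kind. *)
let count (kind : unit_kind) (ops : unit_kind list) : int =
  List.length (List.filter (fun k -> k = kind) ops)

(* True if the operations could all be issued in the same cycle, i.e.
   no kind of unit is requested more often than the machine has it. *)
let fits_in_one_cycle (m : machine) (ops : unit_kind list) : bool =
  List.for_all
    (fun kind ->
      let available = try List.assoc kind m.units with Not_found -> 0 in
      count kind ops <= available)
    [ Alu; Fpu; LoadStore ]

(* An invented machine, used only to exercise the check above. *)
let toy =
  { name = "toy-vliw"; registers = 32;
    units = [ (Alu, 2); (Fpu, 1); (LoadStore, 1) ] }
```

With this toy description, two ALU operations and one load fit in a
single cycle, while two loads do not, since only one load-store unit
is declared.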

--
Ping Hu






* VLIW & caml: how?
@ 1998-08-28  5:18 Todd Graham Lewis
  1998-09-01 11:33 ` Ping Hu
  1998-09-02 17:23 ` Xavier Leroy
  0 siblings, 2 replies; 5+ messages in thread
From: Todd Graham Lewis @ 1998-08-28  5:18 UTC
  To: caml-list


I've been reading that VLIW as implemented on the IA-64/Merced will pose
problems for conventional compilers such as gcc which don't have a very
expansive view of the code they're compiling.  How well will o'caml deal
with optimizing for this sort of architecture?  Any thoughts?

--
Todd Graham Lewis            32°49'N,83°36'W          (800) 719-4664, x2804
******Linux******         MindSpring Enterprises      tlewis@mindspring.net





