On 17 October 2014 18:48, Yoann Padioleau wrote: > Also is there a place explaining the naming conventions? > I have a hard time understanding what is the difference > between tcom(), acom(), xcom(), and then there’s complex() … > the compiler assumes you've know your Aho and Ullmann either by reading them or by working with them. this will be useful later. complex runs tcom, ccom, acom, xcom in that order on a node (and each of those can recurse). the compiler works by a series of tree transformations, informed by attributes attached to each node, representing types, notional costs, addressability, and other ... stuff. tcom has a comment! /* * evaluate types * evaluate lvalues (addable == 1) */ so that's sorted. "evaluate types" means it does most of the type checking and making casts explicit. t->addable is set to 1 if t is an l-value. L-value and R-value are terms introduced by Christopher Strachey (see https://www.itu.dk/courses/BPRD/E2009/fundamental-1967.pdf): briefly, lvalues are forms legal on the left-hand side of an assignment ("address-like objects"), and r-values legal only the right-hand side ("value-like objects") where "objects" means what it used to mean before it was ruined. acom is Aho and Ullman's acommute (with all the exercises done). basically it does code improvements taking advantage of associative-commutative operations, and extends that to look for further optimisations mentioned in Thompson's paper on the compiler. Ritchie's PDP-11 C compiler did something similar (but he called it acommute and distrib). comments such as "bust terms out" and "look for factorable terms c1*i + c1*c2*j -> c1*(i + c2*j) suggest what it's doing). essentially it flattens levels of an expression tree using associative and commutative operators into an array and rearranges them. xcom, which is machine-dependent, also has a comment. it assigns an index of complexity to a sub-expression, which is usually expressed as the number of registers needed to compute it (and good luck with that on an x86), but it's also a good place to look for machine-dependent aspects of machine addressing, such as indexing, double-indexing, scaling, etc and it sets values in each tree node to mark that. "balancing", "casting", "coercion" and other terms connected with languages, types and compilation were often introduced by Algol-68. "Balancing" for instance is making operand types match each other (eg, int+long => long+long).