From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id NAA28479; Wed, 26 May 2004 13:31:50 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id NAA28826 for ; Wed, 26 May 2004 13:31:48 +0200 (MET DST) Received: from qrnik.knm.org.pl (paf87.warszawa.sdi.tpnet.pl [217.96.225.87]) by nez-perce.inria.fr (8.12.10/8.12.10) with ESMTP id i4QBVkEV027912 for ; Wed, 26 May 2004 13:31:47 +0200 Received: from qrnik ([192.168.0.1] ident=qrczak) by qrnik.knm.org.pl with esmtp (Exim 3.36 #1) id 1BSwdK-0005To-00; Wed, 26 May 2004 13:31:42 +0200 Subject: Re: [Caml-list] handling forward references in a compiler From: "Marcin 'Qrczak' Kowalczyk" To: dido@imperium.ph Cc: caml-list@inria.fr In-Reply-To: <20040525095213.GA2655@imperium.ph> References: <20040525095213.GA2655@imperium.ph> Content-Type: text/plain; charset=iso-8859-2 Message-Id: <1085571101.23506.15.camel@qrnik> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.5.7 Date: Wed, 26 May 2004 13:31:42 +0200 Content-Transfer-Encoding: quoted-printable X-Miltered: at nez-perce with ID 40B48022.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; caml-list:01 marcin:01 'qrczak':01 kowalczyk:01 qrczak:01 wto:99 dido:01 undeclared:01 resorting:01 ids:01 avoided:01 implemented:01 ids:01 marcin:01 kowalczyk:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk W li=B6cie z wto, 25-05-2004, godz. 17:52 +0800, dido@imperium.ph napisa=B3= : > I'm writing a compiler for a language that allows undeclared forward > references in OCaml, and I'm just wondering how one is supposed to > handle these sorts of things without resorting to imperative features or > multiple passes. These are indeed two most obvious choices. I prefer the second one. In the first pass we scan only "heads" of definitions, creating an environment which maps strings to some unique ids. In the second pass we know the meaning of each identifier, so we can process code normally no matter which identifiers have their definitions before or after. > Another option is to make a second pass through the > generated intermediate representation, and fixing up the forward > references after all of the declarations have been processed, but this > seems like a second traversal that can be avoided. There is no need to have a separate pass which only patches references. You just lookup references in the previously constructed environment as necessary, when processing the code tree in the next pass which also does whatever it was supposed to do anyway. In a compiler I implemented it was not that easy to scan heads of definitions, because some syntactic sugar was not yet resolved, so it would have to be resolved twice. To avoid this, the pass which gathers defined names also produces a modified list of definitions, with resolved toplevel syntactic sugar, with bound names represented by unique ids, but with everything else - in particular all embedded expressions - left unprocessed. They are processed in the next pass, which resolves used identifiers and expression-level syntactic sugar, and produces a new code representation for further processing. --=20 __("< Marcin Kowalczyk \__/ qrczak@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/ ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners