OK, I think a good place to start a tour of the compiler is in
parsing/parsetree.mli. This file is actually very well documented, with
terse but effective examples of almost every constructor and type.

I had to refer to the OCaml manual for a few of the corner cases. For
example, I didn't know about the #class type shortcut. I think a few
comments explaining the more obscure facets of the language could be
helpful.

Since the file is so well documented, I only have a few questions. I'll
accept an answer or a hunch from anyone -- don't be shy just because
you're not sure about the answer:

1. What is the difference between an extension and an attribute? From what
I understand, they are both means of integrating additional metadata into
the AST that can then be parsed by implementations of the ast-mapper, but
why are there 2 mechanisms?
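
To make the question concrete, here is a small sketch of the two surface
syntaxes as I understand them from the manual. The names my_note and my_ext
are made up for illustration, and the behaviour described in the comments is
my reading, not something I've confirmed in the compiler source:

```ocaml
(* Attribute: decorates a node that is already valid OCaml. The name
   [my_note] is hypothetical; as far as I can tell, the compiler simply
   skips attributes it does not recognise. *)
let succ' x = x + 1 [@@my_note "anything can go in the payload"]

(* Extension node: stands in the place of a node. The name [my_ext] is
   hypothetical too. My understanding is that compilation fails unless
   some rewriter replaces the node first, so it is left commented out:

   let y = [%my_ext 42]
*)

let () = assert (succ' 1 = 2)
```

If that reading is right, the difference would be that an attribute can be
ignored safely while an extension node cannot, but I'd like confirmation.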

2. What is demonstrated in lines 114-117 regarding polymorphic variant row
fields:

  | Rtag of label * bool * core_type list
        (* [`A]                   ( true,  [] )
           [`A of T]              ( false, [T] )
           [`A of T1 & .. & Tn]   ( false, [T1;...Tn] )
           [`A of & T1 & .. & Tn] ( true,  [T1;...Tn] )
         *)

What does the bool value represent?
Why are the type separators in the comments using the & symbol?
What is the difference between the 3rd and 4th example?
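
For reference, the only place I have seen & show up in a type is the
polymorphic variants section of the manual, where the same tag is used with
two different argument types. A minimal sketch adapted from that example
(the inferred type in the comment is what I expect, not something I want to
present as authoritative):

```ocaml
(* [f1] expects `A to carry an int, [f2] expects it to carry a string. *)
let f1 = function `A x -> x = 1 | `B -> true | `C -> false
let f2 = function `A x -> x = "a" | `B -> true

(* Using the same argument with both functions should give a conjunctive
   type for `A, something like:
     val f : [< `A of string & int | `B ] -> bool *)
let f x = f1 x && f2 x

(* `B is accepted by both functions, so this call is fine. *)
let () = assert (f `B)
```

I assume this is what the [T1;...Tn] list encodes, but that still leaves
the bool unexplained.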

3. line 684: what is the purpose of the override flag on Pstr_open? It's
not explained by the comment.
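
My guess is that the flag corresponds to the open! syntax, which as far as
I know exists to silence the warnings about an open shadowing an existing
identifier. A sketch under that assumption (the module M is hypothetical):

```ocaml
module M = struct
  let x = 1
end

let x = 0

(* Assumption: [open!] is what the Override flag records, while a plain
   [open] would be recorded as Fresh. Here M.x shadows the earlier x. *)
open! M

let () = assert (x = 1)
```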

4. The toplevel phrases are not clear. What is the purpose of Ptop_dir on
line 721?

Like I said, feel free to jump in and answer any one of these questions.

Thanks in advance for everyone's help.

-Yotam




On Tue, Apr 1, 2014 at 6:03 AM, Mark Shinwell <mshinwell@janestreet.com> wrote:

> I would suggest that it's probably better to keep the documentation as
> comments where possible.  However, I think it is important to avoid
> excessive commentary, especially if it is likely to get out of sync as
> a result of future modifications to the code.  It may be that in some
> cases making alterations to the code (for example, improving the name
> of a variable) is a more satisfactory approach than adding a comment.
>
> Thanks for working on this.
>
> Mark
>
> On 31 March 2014 18:51, Yotam Barnoy <yotambarnoy@gmail.com> wrote:
> > I think it depends on how much feedback I get on any particular question.
> > By default, I would like comments to go in the code. Additionally, there's
> > the ocaml-internals wiki at https://github.com/ocamllabs/ocaml-internals
> > which will be useful for any concepts that span multiple files, or that are
> > too beginner-oriented. I'm guessing that for many things, it will just have
> > to be decided on a case-by-case basis.
> >
> > Of course, the most important ingredient for the success of this 'project'
> > is the willing, patient participation of the core team, as well as the
> > other experts on this list.
> >
> > -Yotam
> >
> >
> > On Mon, Mar 31, 2014 at 1:06 PM, Milan Stanojević <milanst@gmail.com> wrote:
> >>
> >> Thank you for doing this, I'm interested in learning more about how
> >> the compiler works.
> >>
> >> Are you creating separate file(s) to document the compiler, or are you
> >> adding comments to the ml files?
> >>
> >> On Mon, Mar 31, 2014 at 11:39 AM, Yotam Barnoy <yotambarnoy@gmail.com> wrote:
> >> > Hi everybody
> >> >
> >> > It's been mentioned before that the OCaml compiler's documentation is
> >> > somewhat lacking. I've been going over the compiler code gradually (both
> >> > the frontend and the backend) and while some parts are understandable
> >> > enough, others are missing some basic explanations. Some explanations are
> >> > also spread out throughout the codebase, making it hard to know what
> >> > something means unless you've read another part of the codebase that
> >> > relates to it.
> >> >
> >> > Since the call to submit documentation commits has gone mostly unanswered,
> >> > I'd like to suggest a method of making both my own progress through the
> >> > code easier and hopefully making it easier for others who will follow.
> >> >
> >> > What I'm going to do is, focusing on more or less one file at a time, I'll
> >> > post newbie questions to the list about the code. Once I'm satisfied that
> >> > I have a good enough understanding, I'll add comments to the
> >> > aforementioned files and submit pull requests for them. I also encourage
> >> > others to do the same.
> >> >
> >> > What I need from the list, and especially from the more knowledgeable
> >> > members (who already know the compiler code) is the willingness to explain
> >> > the concepts and answer my questions, annoying as they may be. I have a
> >> > pretty decent background in compilers, ASTs, code generation, etc, but not
> >> > so much in type inference.
> >> >
> >> > I'm not suggesting a particular timeframe for this process -- I'm doing
> >> > this on the side while working on a research project and TAing, but I
> >> > really would like to get to the point where I can make significant
> >> > contributions to the toolchain, and if I can help others who follow in my
> >> > footsteps, then that's a nice bonus.
> >> >
> >> > While I could have skipped this introduction and just proceeded with
> >> > inundating the list with questions, I felt that this (hopefully) gives a
> >> > purpose and perhaps motivation for those who have the answers to answer my
> >> > questions even if they get annoying. In particular, I may often miss some
> >> > parts that may seem obvious because I don't necessarily have the time to
> >> > read all the connected code in depth. Hopefully you'll bear with me.
> >> >
> >> > Does this sound reasonable to the fine folks on the list?
> >> >
> >> > Yotam