caml-list - the Caml user's mailing list
* Hacking the lexer in the new camlp4
@ 2007-03-28 22:22 Harrison, John R
  2007-03-29  9:30 ` [Caml-list] " Nicolas Pouillard
  0 siblings, 1 reply; 5+ messages in thread
From: Harrison, John R @ 2007-03-28 22:22 UTC (permalink / raw)
  To: caml-list; +Cc: Harrison, John R


I haven't really been following the discussion about the new camlp4,
but it seems quite a lot has changed.

I only use some fairly simple extensions to parsing (a few new infix
names, and forcing anonymous expressions "..." to become "let it =
..."). However I also make two invasive changes to the lexer:

 * Modify the lexical rules for deciding whether a name is a regular
   or special identifier (not just based on case of the first letter)

 * Change the quotation delimiters to be `...` rather than <<...>>

In the current camlp4, the only way I found to do this was basically
to copy the existing lexer and edit it. Although it works, it's ugly
and invariably means that I've had to change something with almost
every new version of camlp4. Does the new camlp4 offer a nicer way of
changing the lexer?

John.




* Re: [Caml-list] Hacking the lexer in the new camlp4
  2007-03-28 22:22 Hacking the lexer in the new camlp4 Harrison, John R
@ 2007-03-29  9:30 ` Nicolas Pouillard
  2007-03-29 15:11   ` Harrison, John R
  0 siblings, 1 reply; 5+ messages in thread
From: Nicolas Pouillard @ 2007-03-29  9:30 UTC (permalink / raw)
  To: Harrison, John R; +Cc: caml-list

On 3/29/07, Harrison, John R <john.r.harrison@intel.com> wrote:
>

[...]

> In the current camlp4, the only way I found to do this was basically
> to copy the existing lexer and edit it. Although it works, it's ugly
> and invariably means that I've had to change something with almost
> every new version of camlp4. Does the new camlp4 offer a nicer way of
> changing the lexer?

How did you do that in the previous one without copying and pasting the
old lexer?

-- 
Nicolas Pouillard



* RE: [Caml-list] Hacking the lexer in the new camlp4
  2007-03-29  9:30 ` [Caml-list] " Nicolas Pouillard
@ 2007-03-29 15:11   ` Harrison, John R
  2007-03-29 15:29     ` Nicolas Pouillard
  0 siblings, 1 reply; 5+ messages in thread
From: Harrison, John R @ 2007-03-29 15:11 UTC (permalink / raw)
  To: Nicolas Pouillard; +Cc: Harrison, John R, caml-list

| > In the current camlp4, the only way I found to do this was basically
| > to copy the existing lexer and edit it. Although it works, it's ugly
| > and invariably means that I've had to change something with almost
| > every new version of camlp4. Does the new camlp4 offer a nicer way
| > of changing the lexer?
|
| How did you do that in the previous one without copying and pasting
| the old lexer?

Indeed, that's exactly what I did in the old camlp4. But I am curious if
the new camlp4 offers, or could offer, a more modular solution.

John.



* Re: [Caml-list] Hacking the lexer in the new camlp4
  2007-03-29 15:11   ` Harrison, John R
@ 2007-03-29 15:29     ` Nicolas Pouillard
  2007-03-29 17:07       ` skaller
  0 siblings, 1 reply; 5+ messages in thread
From: Nicolas Pouillard @ 2007-03-29 15:29 UTC (permalink / raw)
  To: Harrison, John R; +Cc: caml-list

On 3/29/07, Harrison, John R <john.r.harrison@intel.com> wrote:
> | > In the current camlp4, the only way I found to do this was basically
> | > to copy the existing lexer and edit it. Although it works, it's ugly
> | > and invariably means that I've had to change something with almost
> | > every new version of camlp4. Does the new camlp4 offer a nicer way
> | > of changing the lexer?
> |
> | How did you do that in the previous one without copying and pasting
> | the old lexer?
>
> Indeed, that's exactly what I did in the old camlp4. But I am curious if
> the new camlp4 offers, or could offer, a more modular solution.

OK, the new one is built with ocamllex, so it's not really extensible.

If you want to accept something that the lexer currently rejects, you
need to change the lexer itself.
However, if it's just a matter of patching the token stream, you can do
that more easily by adding a token filter (Camlp4.Sig.Token.Filter).
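
A minimal sketch of the token-filter idea, in plain OCaml rather than
against the real Camlp4 interface. The token type, the all-caps rule
and the function names below are assumptions made for illustration;
Camlp4's own token type and the Camlp4.Sig.Token.Filter signature are
richer than this. The point is only that a filter sees tokens the lexer
has already accepted and may reclassify them, which might be enough for
the identifier rule but not for new quotation delimiters.

```ocaml
(* Hypothetical token type, much smaller than Camlp4's real one. *)
type token =
  | LIDENT of string   (* regular identifier *)
  | UIDENT of string   (* capitalised identifier *)
  | SYMBOL of string
  | EOI

(* Hypothetical rule: a name of length > 1 made only of upper-case
   letters, digits and underscores counts as a regular identifier. *)
let is_all_caps s =
  let n = String.length s in
  let ok c =
    c = '_' || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') in
  let rec go i = i >= n || (ok s.[i] && go (i + 1)) in
  n > 1 && go 0

(* Reclassify one token; every other token passes through unchanged. *)
let reclassify = function
  | UIDENT s when is_all_caps s -> LIDENT s
  | tok -> tok

(* The filter itself: a function from token stream to token stream. *)
let filter (tokens : token Seq.t) : token Seq.t =
  Seq.map reclassify tokens

let () =
  let input =
    List.to_seq [UIDENT "REWRITE_TAC"; SYMBOL "("; UIDENT "List"; EOI] in
  let output = List.of_seq (filter input) in
  assert (output = [LIDENT "REWRITE_TAC"; SYMBOL "("; UIDENT "List"; EOI])
```

Because the filter only maps over tokens, it never needs to know how
the lexer produced them, so it should survive lexer changes between
versions as long as the token type stays compatible.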

-- 
Nicolas Pouillard



* Re: [Caml-list] Hacking the lexer in the new camlp4
  2007-03-29 15:29     ` Nicolas Pouillard
@ 2007-03-29 17:07       ` skaller
  0 siblings, 0 replies; 5+ messages in thread
From: skaller @ 2007-03-29 17:07 UTC (permalink / raw)
  To: Nicolas Pouillard; +Cc: Harrison, John R, caml-list

On Thu, 2007-03-29 at 17:29 +0200, Nicolas Pouillard wrote:
> On 3/29/07, Harrison, John R <john.r.harrison@intel.com> wrote:
> > | > In the current camlp4, the only way I found to do this was basically
> > | > to copy the existing lexer and edit it. Although it works, it's ugly
> > | > and invariably means that I've had to change something with almost
> > | > every new version of camlp4. Does the new camlp4 offer a nicer way
> > | > of changing the lexer?
> > |
> > | How did you do that in the previous one without copying and pasting
> > | the old lexer?

> OK, the new one is built with ocamllex, so it's not really extensible.

This doesn't entirely follow. There are at least two ways to
extend ocamllex lexers.

1. Recursively process a given lexeme. 

2. Dispatch the error case with enough information to start
another lexer.

Method 1 can either tokenise the given lexeme, or simply
use it as a trigger to start another lexer. Method 2 is
just a special case of method 1.

All you really need is to pass the lexer a class with
an overridable method for each lexical class the lexer
recognizes. Each method accepts the state data and returns
a list of tokens.

Ocamllex currently ensures that the state of the buffer
is ready to process the next character after the lexeme
just decoded, even if it had to overshoot to get
there.
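
A rough sketch of the overridable-method idea, assuming a small
hand-written scanner standing in for an ocamllex-generated one. All
names here (handlers, all_caps_handlers, tokenize, the token type) are
hypothetical, and the "state data" is reduced to the lexeme itself.

```ocaml
(* Hypothetical token type for the sketch. *)
type token = LIDENT of string | UIDENT of string | INT of int | SYM of string

(* One overridable method per lexical class; each returns a token list. *)
class handlers = object
  method ident (s : string) : token list =
    if s.[0] >= 'A' && s.[0] <= 'Z' then [UIDENT s] else [LIDENT s]
  method number (s : string) : token list = [INT (int_of_string s)]
  method other (c : char) : token list = [SYM (String.make 1 c)]
end

(* Override only the identifier rule: all-caps names become LIDENT. *)
class all_caps_handlers = object
  inherit handlers as super
  method! ident s =
    if String.length s > 1 && String.uppercase_ascii s = s
    then [LIDENT s] else super#ident s
end

let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c = '_'
let is_digit c = c >= '0' && c <= '9'

(* Index of the first character at or after [i] where [p] fails. *)
let span p s i =
  let n = String.length s in
  let rec go j = if j < n && p s.[j] then go (j + 1) else j in
  go i

(* The scanner finds lexemes; the handler object decides the tokens. *)
let tokenize (h : #handlers) (s : string) : token list =
  let n = String.length s in
  let rec loop i acc =
    if i >= n then List.rev acc
    else
      let c = s.[i] in
      if c = ' ' then loop (i + 1) acc
      else if is_alpha c then
        let j = span is_alpha s i in
        loop j (List.rev_append (h#ident (String.sub s i (j - i))) acc)
      else if is_digit c then
        let j = span is_digit s i in
        loop j (List.rev_append (h#number (String.sub s i (j - i))) acc)
      else loop (i + 1) (List.rev_append (h#other c) acc)
  in
  loop 0 []

let () =
  assert (tokenize (new handlers) "REWRITE_TAC x 42"
          = [UIDENT "REWRITE_TAC"; LIDENT "x"; INT 42]);
  assert (tokenize (new all_caps_handlers) "REWRITE_TAC x 42"
          = [LIDENT "REWRITE_TAC"; LIDENT "x"; INT 42])
```

Returning a list rather than a single token leaves room for a handler
to run another lexer over the lexeme and emit several tokens, which is
the recursive processing described as method 1.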


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

