caml-list - the Caml user's mailing list
* [Caml-list] a few lexing questions
@ 2003-04-28 18:13 Alan Schmitt
  2003-04-28 21:30 ` Michal Moskal
  0 siblings, 1 reply; 3+ messages in thread
From: Alan Schmitt @ 2003-04-28 18:13 UTC
  To: caml-list

Hi,

I am writing a parser / pretty-printer for the iCalendar file format,
and I have a few questions.

First of all, I sometimes need to choose a lexing rule based on the
first letter of the input. Since I want to reuse the existing rules, I
match that letter, rewind the lexer buffer, and call the matching rule:

and alt_lang_xparam = parse
| 'A'                    { lexbuf.Lexing.lex_curr_pos <- Lexing.lexeme_start lexbuf;
                           Altrepparam (altrepparam lexbuf) }
| 'X'                    { lexbuf.Lexing.lex_curr_pos <- Lexing.lexeme_start lexbuf;
                           Xparam (xparam lexbuf) }
| 'L'                    { lexbuf.Lexing.lex_curr_pos <- Lexing.lexeme_start lexbuf;
                           Langparam (languageparam lexbuf) }

Is it buggy? Bad style? Is there a nicer way to do it?

A second question is about integrating this code with other lexing code
or streams. An iCalendar file cannot have a line longer than 75
bytes, excluding the line break. A line may be broken anywhere as long as
the next line begins with a space, as in:

this is a very lo
 ng line

represents "this is a very long line". As this break may occur anywhere
(even inside keywords), the lexer I am writing assumes that such
lines have already been merged together. I know how to implement the
merging using a temp file, but I'm looking for a nicer solution (like
using a stream, or using one lexer to feed the current lexer). Any
suggestions?

I'd also like some advice on the symmetric transformation (breaking
long lines when necessary). Would streams be a good solution here?
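
For concreteness, a plain string-to-string version of that transformation
could look like the sketch below. The name fold_line, the optional limit
parameter, and the use of "\n" as the line break are assumptions made for
illustration (a real iCalendar writer would presumably emit CRLF and avoid
splitting multi-byte characters).

```ocaml
(* Sketch only: fold one logical line into physical lines of at most
   [limit] bytes each. Every continuation line starts with one space,
   and that space counts toward the limit.
   Assumptions: "\n" is used as the break, and bytes are split blindly. *)
let fold_line ?(limit = 75) (line : string) : string =
  if limit < 2 then invalid_arg "fold_line: limit must be at least 2";
  let len = String.length line in
  let out = Buffer.create (len + 16) in
  (* [room] is how many content bytes fit on the current physical line. *)
  let rec go pos room =
    let n = min room (len - pos) in
    Buffer.add_substring out line pos n;
    if pos + n < len then begin
      Buffer.add_string out "\n ";   (* break, then the continuation space *)
      go (pos + n) (limit - 1)        (* the leading space uses one byte *)
    end
  in
  go 0 limit;
  Buffer.contents out
```

A stream-based version would do the same counting while copying characters
from input to output instead of building the whole string first.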

Thanks a lot,

Alan Schmitt

-- 
The hacker: someone who figured things out and made something cool happen.

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* Re: [Caml-list] a few lexing questions
  2003-04-28 18:13 [Caml-list] a few lexing questions Alan Schmitt
@ 2003-04-28 21:30 ` Michal Moskal
  2003-04-30 12:31   ` John Max Skaller
  0 siblings, 1 reply; 3+ messages in thread
From: Michal Moskal @ 2003-04-28 21:30 UTC
  To: Alan Schmitt; +Cc: caml-list

On Mon, Apr 28, 2003 at 02:13:48PM -0400, Alan Schmitt wrote:
> A second question is about integrating this code with other lexing code
> or streams. An iCalendar file cannot have a line longer than 75
> bytes, excluding the line break. A line may be broken anywhere as long as
> the next line begins with a space, as in:
> 
> this is a very lo
>  ng line
> 
> represents "this is a very long line". As this break may occur anywhere
> (even inside keywords), the lexer I am writing assumes that such
> lines have already been merged together. I know how to implement the
> merging using a temp file, but I'm looking for a nicer solution (like
> using a stream, or using one lexer to feed the current lexer). Any
> suggestions?

Maybe Lexing.from_function? You could use it to implement what you
would otherwise do with a temp file. Unfortunately, Lexing.from_function
has a very C-ish interface (you fill a buffer and return the number of
characters written), so it is easy to make a mistake when writing the
refill function. Be careful.
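
Something along these lines might work. It is only a sketch, assuming a
recent OCaml where the refill function has type bytes -> int -> int; the
names unfolding_reader, chars_of_string and unfold_string are made up for
the example, and only a single space (not a tab) is treated as the
continuation marker.

```ocaml
(* Sketch: wrap a character source so that folded lines are merged on the
   fly. A line break ("\n" or "\r\n") followed by one space is dropped.
   The result has the shape Lexing.from_function expects: write at most
   [n] characters into [buf] from index 0, return the count, 0 at EOF. *)
let unfolding_reader (next : unit -> char option) : bytes -> int -> int =
  let pushed = ref None in          (* one raw character of lookahead *)
  let eof = ref false in
  let raw () =
    match !pushed with
    | Some _ as c -> pushed := None; c
    | None ->
      if !eof then None
      else begin
        let c = next () in
        if c = None then eof := true;
        c
      end
  in
  let out = Queue.create () in      (* characters ready to be delivered *)
  let rec get () =
    if not (Queue.is_empty out) then Some (Queue.pop out)
    else
      match raw () with
      | Some '\r' ->
        (match raw () with
         | Some '\n' -> after_break "\r\n"
         | c -> pushed := c; Some '\r')
      | Some '\n' -> after_break "\n"
      | c -> c
  and after_break brk =
    match raw () with
    | Some ' ' -> get ()            (* folded line: skip break and space *)
    | c ->
      pushed := c;                  (* real break: keep it, re-read [c] *)
      String.iter (fun ch -> Queue.push ch out) brk;
      get ()
  in
  fun buf n ->
    let rec fill i =
      if i >= n then i
      else match get () with
        | Some c -> Bytes.set buf i c; fill (i + 1)
        | None -> i
    in
    fill 0

(* A character source over a string, for trying things out. *)
let chars_of_string s =
  let i = ref 0 in
  fun () ->
    if !i < String.length s then (let c = s.[!i] in incr i; Some c) else None

(* Drain the reader by hand, the way the lexer engine would. *)
let unfold_string s =
  let read = unfolding_reader (chars_of_string s) in
  let buf = Bytes.create 16 and out = Buffer.create 64 in
  let rec loop () =
    let n = read buf 16 in
    if n > 0 then (Buffer.add_subbytes out buf 0 n; loop ())
  in
  loop ();
  Buffer.contents out
```

For a file, the source could be
fun () -> try Some (input_char ic) with End_of_file -> None,
and the lexbuf would be built with
Lexing.from_function (unfolding_reader source).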

-- 
: Michal Moskal :: http://www.kernel.pl/~malekith : GCS {C,UL}++++$ a? !tv
: PLD Linux ::::::::: Wroclaw University, CS Dept : {E-,w}-- {b++,e}>+++ h


* Re: [Caml-list] a few lexing questions
  2003-04-28 21:30 ` Michal Moskal
@ 2003-04-30 12:31   ` John Max Skaller
  0 siblings, 0 replies; 3+ messages in thread
From: John Max Skaller @ 2003-04-30 12:31 UTC
  To: Michal Moskal; +Cc: Alan Schmitt, caml-list

Michal Moskal wrote:

> On Mon, Apr 28, 2003 at 02:13:48PM -0400, Alan Schmitt wrote:
> 
>>A second question is about integrating this code with other lexing code
>>or streams. An iCalendar file cannot have a line longer than 75
>>bytes, excluding the line break. A line may be broken anywhere as long as
>>the next line begins with a space, as in:
>>
>>this is a very lo
>> ng line
>>
>>represents "this is a very long line". As this break may occur anywhere
>>(even inside keywords), the lexer I am writing assumes that such
>>lines have already been merged together. I know how to implement the
>>merging using a temp file, but I'm looking for a nicer solution (like
>>using a stream, or using one lexer to feed the current lexer). Any
>>suggestions?
>>
> 
> Maybe Lexing.from_function ? 


Yes, I do this. In the lexer:

....
| "(" { fun state -> state#inbody; [LPAR (state#get_srcref lexbuf)] }
....

Notice that each action is a function accepting the state object.
You can form a closure over a method of a lexer state class and
pass it to Lexing.from_function; you then call the lexer function
with an extra argument, like:

lex_my_stuff lexbuf state


In that way you can

(a) pre-process the input
(b) maintain state such as the original line number
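
A minimal sketch of the idea follows. It is not the actual code from my
lexer: the class name lexer_state, its refill and line_no methods, and the
helpers lines_of_list and make_lexbuf are all invented for the example, and
it assumes a recent OCaml where the refill function takes bytes.

```ocaml
(* Sketch: a state object that reads physical lines, merges folded
   iCalendar lines, and remembers how many physical lines it has read. *)
class lexer_state (next_line : unit -> string option) =
  object (self)
    val mutable line_no = 0        (* physical lines read so far *)
    val mutable chunk = ""         (* preprocessed text not yet delivered *)
    val mutable pos = 0            (* next undelivered index in [chunk] *)
    val mutable finished = false

    method line_no = line_no

    (* Fetch one physical line. A line break is emitted lazily, in front
       of each line that is NOT a continuation of the previous one. *)
    method private load =
      pos <- 0;
      match next_line () with
      | None ->
        finished <- true;
        chunk <- (if line_no > 0 then "\n" else "")
      | Some l ->
        line_no <- line_no + 1;
        chunk <-
          (if l <> "" && l.[0] = ' ' then String.sub l 1 (String.length l - 1)
           else if line_no = 1 then l
           else "\n" ^ l)

    (* The function handed to Lexing.from_function: returns 0 only at EOF. *)
    method refill (buf : bytes) (n : int) : int =
      while pos >= String.length chunk && not finished do self#load done;
      let k = min n (String.length chunk - pos) in
      Bytes.blit_string chunk pos buf 0 k;
      pos <- pos + k;
      k
  end

(* A line source over a list, for trying things out. *)
let lines_of_list l =
  let rest = ref l in
  fun () ->
    match !rest with
    | [] -> None
    | x :: tl -> rest := tl; Some x

(* Build the state and a lexbuf fed through a closure over its method. *)
let make_lexbuf next_line =
  let state = new lexer_state next_line in
  (state, Lexing.from_function (fun buf n -> state#refill buf n))
```

The pair returned by make_lexbuf is then used as in
lex_my_stuff lexbuf state. Since the lexer engine reads ahead, state#line_no
is only approximately the line of the token being returned.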


-- 
John Max Skaller, mailto:skaller@ozemail.com.au
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850

