caml-list - the Caml user's mailing list
* [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
@ 2001-08-22 18:53 neale-caml
  2001-08-22 19:18 ` Alain Frisch
  2001-08-22 20:23 ` Markus Mottl
  0 siblings, 2 replies; 40+ messages in thread
From: neale-caml @ 2001-08-22 18:53 UTC (permalink / raw)
  To: caml-list

Hi, I'm converting over some examples from the Perl cookbook to learn
OCaml and help with the PLEAC project <http://pleac.sourceforge.net/>,
and I'm running into some strange behavior with the Str structure; I
think it's something to do with garbage collection.

  # let rec f l =
      let sep = Str.regexp "^[ \t\n]*\\(.+\\)" in
      match l with
      | [] -> []
      | [""] -> []
      | s :: rest -> if (Str.string_match sep s 0) then
          let foo = print_string ("match " ^ Str.matched_group 1 s ^ "\n") in
          (Str.matched_group 1 s) :: (f rest)
      else
          let foo = print_string "nomatch\n" in
          (f rest);;
  val f : string list -> string list = <fun>
  # f ["  arf"];;
  match arf
  - : string list = ["arf"]
  # f ["  arf";  "barf"];;
  match arf
  match barf
  - : string list = ["  ar"; "barf"]
  # f [" arf"; " barf"];;
  match arf
  match barf
  Uncaught exception: Invalid_argument "String.sub".

First question: why is
  f ["  arf"] = ["arf"]
while
  f ["  arf"; "barf"] = ["  ar"; "barf"]
?  I don't think I've entered any code which should modify the string
after it's been matched.

Second question: What is causing the Invalid_argument exception?

Thanks in advance for any assistance,

Neale
-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs  FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-request@inria.fr  Archives: http://caml.inria.fr



* Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-22 18:53 [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc neale-caml
@ 2001-08-22 19:18 ` Alain Frisch
  2001-08-22 20:41   ` Neale Pickett
  2001-08-22 20:23 ` Markus Mottl
  1 sibling, 1 reply; 40+ messages in thread
From: Alain Frisch @ 2001-08-22 19:18 UTC (permalink / raw)
  To: neale-caml; +Cc: caml-list

On 22 Aug 2001 neale-caml@woozle.org wrote:

>   # let rec f l =
>       let sep = Str.regexp "^[ \t\n]*\\(.+\\)" in
>       match l with
>       | [] -> []
>       | [""] -> []
>       | s :: rest -> if (Str.string_match sep s 0) then
>           let foo = print_string ("match " ^ Str.matched_group 1 s ^ "\n") in
>           (Str.matched_group 1 s) :: (f rest)
                                    ^^
                                    
This is wrong; with the current OCaml implementation, the right
operand of (::) is called first; so (Str.matched_group 1 s) is called
after subsequent calls to Str.string_match, which is obviously incorrect.

You have to use explicit sequencing (let .. in ..)

BTW, you could call Str.regexp only once.
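
A minimal sketch of the explicit sequencing described above, assuming the
same regexp as in the original post. The group is bound with let before the
recursive call, so it is read while the match on s is still the latest one,
and the regexp is compiled once, outside the function. The [""] case of the
original is dropped as an assumption of this sketch: the pattern cannot match
an empty string, so the else branch already covers it. The code has to be
linked against the str library.

```ocaml
(* Requires the str library. *)

(* Compiled once, not on every call. *)
let sep = Str.regexp "^[ \t\n]*\\(.+\\)"

let rec f = function
  | [] -> []
  | s :: rest ->
    if Str.string_match sep s 0 then
      (* Read the group now, while the match on s is the latest one. *)
      let g = Str.matched_group 1 s in
      g :: f rest
    else
      f rest
```

With this version the result no longer depends on the order in which the
operands of (::) are evaluated.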

-- 
  Alain Frisch


* Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-22 18:53 [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc neale-caml
  2001-08-22 19:18 ` Alain Frisch
@ 2001-08-22 20:23 ` Markus Mottl
  2001-08-22 20:31   ` Miles Egan
       [not found]   ` <w533d6j1vxn.fsf@woozle.org>
  1 sibling, 2 replies; 40+ messages in thread
From: Markus Mottl @ 2001-08-22 20:23 UTC (permalink / raw)
  To: neale-caml; +Cc: caml-list

On Wed, 22 Aug 2001, neale-caml@woozle.org wrote:
> Hi, I'm converting over some examples from the Perl cookbook to learn
> OCaml and help with the PLEAC project <http://pleac.sourceforge.net/>,
> and I'm running into some strange behavior with the Str structure;
> I think it's something to do with garbage collection.

I think it might be more suitable to translate the PLEAC-examples to
OCaml using the PCRE-library (Perl Compatible Regular Expressions):
it's much easier to get things right (e.g. no problems with evaluation
order as you just had - the PCRE is stateless). You also don't have
to think Emacs-style with regular expressions. Furthermore, Perl
supports regexp features that are not available in the Str-module:
without the PCRE-library you'd have to write significantly more code
for some PLEAC-examples.

You can find the PCRE-library here:

  http://www.ai.univie.ac.at/~markus/home/ocaml_sources.html

Best regards,
Markus Mottl

-- 
Markus Mottl                                             markus@oefai.at
Austrian Research Institute
for Artificial Intelligence                  http://www.oefai.at/~markus

* Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-22 20:23 ` Markus Mottl
@ 2001-08-22 20:31   ` Miles Egan
  2001-08-22 20:52     ` Michael Leary
  2001-08-22 22:06     ` Nicolas George
       [not found]   ` <w533d6j1vxn.fsf@woozle.org>
  1 sibling, 2 replies; 40+ messages in thread
From: Miles Egan @ 2001-08-22 20:31 UTC (permalink / raw)
  To: Markus Mottl; +Cc: neale-caml, caml-list

On Wed, Aug 22, 2001 at 10:23:17PM +0200, Markus Mottl wrote:
> On Wed, 22 Aug 2001, neale-caml@woozle.org wrote:
> > Hi, I'm converting over some examples from the Perl cookbook to learn
> > OCaml and help with the PLEAC project <http://pleac.sourceforge.net/>,
> > and I'm running into some strange behavior with the Str structure;
> > I think it's something to do with garbage collection.
> 
> I think it might be more suitable to translate the PLEAC-examples to
> OCaml using the PCRE-library (Perl Compatible Regular Expressions):
> it's much easier to get things right (e.g. no problems with evaluation
> order as you just had - the PCRE is stateless). You also don't have
> to think Emacs-style with regular expressions. Furthermore, Perl
> supports regexp features that are not available in the Str-module:
> without the PCRE-library you'd have to write significantly more code
> for some PLEAC-examples.

I've asked this several times before, but I think it's worth asking again: is
there any chance of adding pcre to the stock distribution?  It's superior in
every way to the str module and much friendlier to python/perl refugees.

-- 
miles

"We in the past evade X, where X is something which we believe to be a
lion, through the act of running." - swiftrain@geocities.com

* Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-22 19:18 ` Alain Frisch
@ 2001-08-22 20:41   ` Neale Pickett
  2001-08-23 10:21     ` Frank Atanassow
  0 siblings, 1 reply; 40+ messages in thread
From: Neale Pickett @ 2001-08-22 20:41 UTC (permalink / raw)
  To: caml-list

Alain Frisch writes:
> On 22 Aug 2001 neale-caml@woozle.org wrote:

>> # let rec f l =
>> let sep = Str.regexp "^[ \t\n]*\\(.+\\)" in
>> match l with
>> | [] -> []
>> | [""] -> []
>> | s :: rest -> if (Str.string_match sep s 0) then
>> let foo = print_string ("match " ^ Str.matched_group 1 s ^ "\n") in
>>             (Str.matched_group 1 s) :: (f rest)
>                                     ^^
                                    
> This is wrong; with the current OCaml implementation, the right
> operand of (::) is called first; so (Str.matched_group 1 s) is called
> after subsequent calls to Str.string_match, which is obviously
> incorrect.

Aha!  Thank you.

This makes sense, but it is certainly not obvious, especially in a
language which purports to have no side-effects.  I can't help thinking
that s should be a different string for every invocation, but clearly it
is somehow related to the initial input string.  No doubt this is a
clever optimization within OCaml which makes for drastically reduced
memory usage when processing strings, but it does make things a bit
confusing to the beginner.

I don't have any good suggestions on how else to do it, although my base
desire is to have a regexp matching function which returns a string list
of the matched groups.
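
A possible sketch of such a function, assuming a hypothetical name
match_groups and an explicit group count n supplied by the caller (n is
assumed not to exceed the number of groups in the regexp). All groups are
extracted immediately after the match, so no later match can interfere. A
group that did not take part in the match is reported as None. The code has
to be linked against the str library.

```ocaml
(* Requires the str library. match_groups is a hypothetical helper,
   not part of Str. *)
let match_groups (re : Str.regexp) (n : int) (s : string)
  : string option list option =
  if Str.string_match re s 0 then begin
    (* Collect groups n, n-1, ..., 1 so the list ends up in order 1..n. *)
    let rec collect i acc =
      if i < 1 then acc
      else
        let g = try Some (Str.matched_group i s) with Not_found -> None in
        collect (i - 1) (g :: acc)
    in
    Some (collect n [])
  end else
    None
```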

> You have to use explicit sequencing (let .. in ..)

> BTW, you could call Str.regexp only once.

Yes, to avoid compiling the regexp with each function invocation,
correct?  For this example, I thought it more illustrative to have it
all contained within a function definition, but perhaps I need to become
more experienced with closures.

Thanks again!

Neale

> -- 
>   Alain Frisch



* Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-22 20:31   ` Miles Egan
@ 2001-08-22 20:52     ` Michael Leary
  2001-08-23  5:36       ` Jeremy Fincher
  2001-08-22 22:06     ` Nicolas George
  1 sibling, 1 reply; 40+ messages in thread
From: Michael Leary @ 2001-08-22 20:52 UTC (permalink / raw)
  To: Miles Egan; +Cc: Markus Mottl, neale-caml, caml-list

I second that... er, Several + 1 that?

On Wed, Aug 22, 2001 at 01:31:00PM -0700, Miles Egan wrote:
> I've asked this several times before, but I think it's worth asking again: is
> there any chance of adding pcre to the stock distribution?  It's superior in
> every way to the str module and much friendlier to python/perl refugees.

-- 

* Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-22 20:31   ` Miles Egan
  2001-08-22 20:52     ` Michael Leary
@ 2001-08-22 22:06     ` Nicolas George
  2001-08-23  7:08       ` [Caml-list] PCRE as standard (Was: Str.string_match raising Invalid_argument...) Florian Hars
  2001-08-23 17:31       ` [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc Brian Rogoff
  1 sibling, 2 replies; 40+ messages in thread
From: Nicolas George @ 2001-08-22 22:06 UTC (permalink / raw)
  To: Miles Egan; +Cc: Markus Mottl, neale-caml, caml-list

On Wednesday 22 August 2001 at 13:31, Miles Egan wrote:
>>		    PCRE-library (Perl Compatible Regular Expressions):
> I've asked this several times before, but I think it's worth asking again: is
> there any chance of adding pcre to the stock distribution?  It's superior in
> every way to the str module and much friendlier to python/perl refugees.

I second that too. And because PCRE is under LGPL (Str is based on GNU
regex, which is under GPL), it could be in the standard library and not
only in the distribution. Maybe we could even hope a regexp pattern
matching as a syntax extension :-)

* Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-22 20:52     ` Michael Leary
@ 2001-08-23  5:36       ` Jeremy Fincher
  0 siblings, 0 replies; 40+ messages in thread
From: Jeremy Fincher @ 2001-08-23  5:36 UTC (permalink / raw)
  To: caml-list

I'm one of those Python/Perl refugees (I knew both before I knew O'Caml,
actually) and I can definitely agree that I would find it ridiculous to
spend any amount of time learning a new (not only new, but *inferior* in
many regards) regexp syntax and interface when Pcre already exists for
O'Caml.  I can't see any reason why it shouldn't be in the stock
distribution; I know that I won't be using O'Caml and regular expressions
together without it.

Jeremy

> On Wed, Aug 22, 2001 at 01:31:00PM -0700, Miles Egan wrote:
> > I've asked this several times before, but I think it's worth asking again: is
> > there any chance of adding pcre to the stock distribution?  It's superior in
> > every way to the str module and much friendlier to python/perl refugees.

* [Caml-list] PCRE as standard (Was: Str.string_match raising Invalid_argument...)
  2001-08-22 22:06     ` Nicolas George
@ 2001-08-23  7:08       ` Florian Hars
  2001-08-23 17:31       ` [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc Brian Rogoff
  1 sibling, 0 replies; 40+ messages in thread
From: Florian Hars @ 2001-08-23  7:08 UTC (permalink / raw)
  To: Nicolas George; +Cc: caml-list

On Thu, Aug 23, 2001 at 12:06:25AM +0200, Nicolas George wrote:
> I second that too.

<AOL>MEEE TOOOOOOOOOOOOOO</AOL>

> Maybe we could even hope a regexp pattern
> matching as a syntax extension :-)

"Some sugar for regexp matching using camlp4"
http://caml.inria.fr/archives/200107/msg00187.html

Yours, Florian.

* Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-22 20:41   ` Neale Pickett
@ 2001-08-23 10:21     ` Frank Atanassow
  2001-08-23 16:06       ` Neale Pickett
  0 siblings, 1 reply; 40+ messages in thread
From: Frank Atanassow @ 2001-08-23 10:21 UTC (permalink / raw)
  To: Neale Pickett; +Cc: caml-list

Neale Pickett wrote (on 22-08-01 13:41 -0700):
> Alain Frisch writes:
> > On 22 Aug 2001 neale-caml@woozle.org wrote:
> 
> >> # let rec f l =
> >> let sep = Str.regexp "^[ \t\n]*\\(.+\\)" in
> >> match l with
> >> | [] -> []
> >> | [""] -> []
> >> | s :: rest -> if (Str.string_match sep s 0) then
> >> let foo = print_string ("match " ^ Str.matched_group 1 s ^ "\n") in
> >>             (Str.matched_group 1 s) :: (f rest)
> >                                     ^^
>                                     
> > This is wrong; with the current OCaml implementation, the right
> > operand of (::) is called first; so (Str.matched_group 1 s) is called
> > after subsequent calls to Str.string_match, which is obviously
> > incorrect.
> 
> Aha!  Thank you.
> 
> This makes sense, but it is certainly not obvious, especially in a
> language which purports to have no side-effects.

Ocaml does not purport to have no side-effects. It has plenty of side-effects.
You must be thinking of Haskell or Miranda.

> I can't help thinking
> that s should be a different string for every invocation, but clearly it
> is somehow related to the initial input string.  No doubt this is a
> clever optimization within OCaml which makes for drastically reduced
> memory usage when processing strings, but it does make things a bit
> confusing to the beginner.

I'm pretty sure there is no such optimization, but I'm not sure what you're
talking about here. Anyway, if an optimization affected the behavior of a
program, it would not be an optimization but rather a compiler bug.

> I don't have any good suggestions on how else to do it, although my base
> desire is to have a regexp matching function which returns a string list
> of the matched groups.

There is no need to mutate the list/string(s).

If I understand you correctly (but I don't think I do):

  let sep_list =
    let sep = Str.regexp "[ \t\n]+\\([^ \t\n]*\\)" in
    fun s ->
      let rec loop i =
        if Str.string_match sep s i then
          let m = Str.matched_group 1 s in
          m :: loop (Str.match_end ())
        else
          []
      in loop 0

# sep_list "  abc def  ghi j";;
- : string list = ["abc"; "def"; "ghi"; "j"]

But this is what the Str.split procedure does already:

# Str.split (Str.regexp "[ \t\n]+") "  abc def  ghi j";;
- : string list = ["abc"; "def"; "ghi"; "j"]

Your function has type string list -> string list, and it seems like it just
does the same match on every element of the list, so it's much easier:

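  let map_match =
    let sep = Str.regexp "[ \t\n]*\\(.+\\)" in
    fun l ->
      (* Match, then read the group at once, before any other match runs.
         An exception is raised if s does not match (e.g. the empty string). *)
      let f s = ignore (Str.string_match sep s 0); Str.matched_group 1 s in
      List.map f l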

# map_match ["  arf"; " barf"];;
- : string list = ["arf"; "barf"]

-- 
Frank Atanassow, Information & Computing Sciences, Utrecht University
Padualaan 14, PO Box 80.089, 3508 TB Utrecht, Netherlands
Tel +31 (030) 253-3261 Fax +31 (030) 251-379

* Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-23 10:21     ` Frank Atanassow
@ 2001-08-23 16:06       ` Neale Pickett
  2001-08-23 16:25         ` Alain Frisch
  0 siblings, 1 reply; 40+ messages in thread
From: Neale Pickett @ 2001-08-23 16:06 UTC (permalink / raw)
  To: caml-list

Frank Atanassow writes:

> Ocaml does not purport to have no side-effects. It has plenty of
> side-effects.  You must be thinking of Haskell or Miranda.

That's probably half of my problem, then :-)

> I'm pretty sure there is no such optimization, but I'm not sure what
> you're talking about here. Anyway, if an optimization affected the
> behavior of a program, it would not be an optimization but rather a
> compiler bug.

Having slept on it, I think what I was experiencing might be linked with
the fact that the Str library is apparently non-reentrant and my
approach to using the regexp parts of Str.  What I ran into was, I
think, a bug in either the Str library or its documentation.

Originally, I was trying to do something like this:

# let string_lines =
    let sep = Str.regexp "^[ \t\n]*\\(.+\\)" in
    let rec f = function
      | [] -> []
      | s :: rest -> if (Str.string_match sep s 0) then
          (Str.matched_group 1 s) :: (f rest)
      else
          f rest
    in
    f
  in
  string_lines ["  hello"; "  dromedaries"];;
Uncaught exception: Invalid_argument "String.sub".

(Apologies if this is inelegant, I'm just starting out.)

Alain Frisch <frisch@clipper.ens.fr> points out:

> This is wrong; with the current OCaml implementation, the right
> operand of (::) is called first; so (Str.matched_group 1 s) is called
> after subsequent calls to Str.string_match, which is obviously
> incorrect.

I contest that this is obvious.  s is a different string each time f is
called, and so even though I do call Str.string_match multiple times,
it's with a different s.  The manual for the Str library says only that I
must pass in the same s as was given to string_match, which implies that
s is somehow keyed to its matches.  It sounds as though I shouldn't do
the following:

  Str.string_match sep s 0;
  Str.string_match sep s' 0;
  print_string (Str.matched_group 1 s);

If this is the case, why does Str.matched_group even bother requiring
the original string?

I may be missing some crucial aspect to OCaml, and if so, I apologize
for this exercise in my own ignorance.  With my current understanding
of the language, though, it looks as though to use the regexp parts of
Str, I need to understand the underlying implementation of the library,
or at least know not to call string_match as above.  If the former, I
would consider this a bug; if the latter, it should just be added to the
documentation.  Either way, it's confusing.

> If I understand you correctly (but I don't think I do):

> # Str.split (Str.regexp "[ \t\n]+") "  abc def  ghi j";;
> - : string list = ["abc"; "def"; "ghi"; "j"]

This is, in fact, exactly what I was trying to do.  I wanted to code it
as a recursive function to show a friend the difference between
functional and procedural programming, got caught up in the exception,
and forgot what my original intent was.  Thank you!

* Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-23 16:06       ` Neale Pickett
@ 2001-08-23 16:25         ` Alain Frisch
  2001-08-23 18:14           ` Neale Pickett
  0 siblings, 1 reply; 40+ messages in thread
From: Alain Frisch @ 2001-08-23 16:25 UTC (permalink / raw)
  To: Neale Pickett; +Cc: caml-list

On 23 Aug 2001, Neale Pickett wrote:

> I contest that this is obvious.  s is a different string each time f is
> called, and so even though I do call Str.string_match multiple times,
> it's with a different s.  The manual for the Str library says only that I
> must pass in the same s as was given to string_match, which implies that
> s is somehow keyed to its matches.  It sounds as though I shouldn't do
> the following:
> 
>   Str.string_match sep s 0;
>   Str.string_match sep s' 0;
>   print_string (Str.matched_group 1 s);
> 
> If this is the case, why does Str.matched_group even bother requiring
> the original string?

Indeed, you shouldn't.

The manual says:

<<
val matched_string: string -> string

matched_string s returns the substring of s that was matched by the latest
string_match, search_forward or search_backward. The user must make sure
that the parameter s is the same string that was passed to the matching or
searching function.
>>

Note the "latest".

The approach you suggest (that the library keeps a reference to
the last matched string) is acceptable. I guess it was not
implemented like that because this would prevent garbage collection
of the last matched string.  (maybe a "release_internal_buffer" function
would have been better)
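
To make the behaviour concrete, here is a toy model, not the real Str
implementation. It assumes, as described above, that the library remembers
only the offsets of the latest match and never the string itself. The names
last_group, toy_match and toy_matched are hypothetical.

```ocaml
(* Toy model of the hidden state: offsets (start, stop) of the latest match. *)
let last_group : (int * int) ref = ref (0, 0)

(* Hypothetical matcher: skip leading blanks, "capture" the rest of s. *)
let toy_match (s : string) : bool =
  let n = String.length s in
  let is_blank c = c = ' ' || c = '\t' || c = '\n' in
  let rec skip i = if i < n && is_blank s.[i] then skip (i + 1) else i in
  let start = skip 0 in
  if start < n then (last_group := (start, n); true) else false

(* Like Str.matched_group, this trusts the caller to pass the same string
   that was given to the latest match. *)
let toy_matched (s : string) : string =
  let (start, stop) = !last_group in
  String.sub s start (stop - start)

(* Two matches, then a lookup on the first string: the offsets (0, 4)
   recorded for "barf" are applied to "  arf". *)
let demo () =
  ignore (toy_match "  arf");
  ignore (toy_match "barf");
  toy_matched "  arf"
```

In this model demo () returns "  ar", and the same sequence with " arf" and
" barf" applies offsets (1, 5) to a four-character string, which makes
String.sub raise Invalid_argument. Both match what was reported at the start
of the thread.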



-- 
 Alain


* Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-22 22:06     ` Nicolas George
  2001-08-23  7:08       ` [Caml-list] PCRE as standard (Was: Str.string_match raising Invalid_argument...) Florian Hars
@ 2001-08-23 17:31       ` Brian Rogoff
  2001-08-23 18:08         ` [Caml-list] standard regex package Miles Egan
  1 sibling, 1 reply; 40+ messages in thread
From: Brian Rogoff @ 2001-08-23 17:31 UTC (permalink / raw)
  To: Nicolas George; +Cc: Miles Egan, Markus Mottl, neale-caml, caml-list


On Thu, 23 Aug 2001, Nicolas George wrote:
> On Wednesday 22 August 2001 at 13:31, Miles Egan wrote:
> >>		    PCRE-library (Perl Compatible Regular Expressions):
> > I've asked this several times before, but I think it's worth asking again: is
> > there any chance of adding pcre to the stock distribution?  It's superior in
> > every way to the str module and much friendlier to python/perl refugees.
> 
> I second that too. And because PCRE is under LGPL (Str is based on GNU
> regex, which is under GPL), it could be in the standard library and not
> only in the distribution. 

Some other "pure OCaml" regexp engines were discussed here recently, including 
Claude Marche's and the one from Unison. Since the Unison code is under GPL 
and not LGPL, and I'm a (inverse) license ayatollah, I can only use the
LGPL'ed one. I've been playing with it and it's quite nice, though I think it
needs a few more bells and whistles to satisfy the Perlers. I don't know how 
it compares in performance against the Pcre C code. 

I agree that Str is suboptimal, but I think that there are also a few
other ways in which string handling could be improved, like 

(1) Very long strings (Sys.max_string_length = 16777211 on most
    machines). Please don't tell me that slurping a 100M file into a 
    string is probably not smart, I know that, but it's a restriction
    that annoys some (many?) programmers. 

(2) Wide character strings

(3) Functional strings (and functional arrays while we're at it :)

(4) Substrings

(1) and (3) could be fixed by adding a "ropes" library, or (1) alone could
be fixed by building strings over Bigarrays. (2) can also be fixed using 
Bigarrays, either building on top of them or just stealing the C code and 
specializing it. I ported the SML Basis library for substrings over to
OCaml, but I much prefer Hansen's subsequence reference approach (if
you've read Finkel's "Advanced Programming Language Design" you know what
I mean) and I've made a new module based on that which I'll release after
some more tire kicking; e-mail me if you want a version. Interestingly, it 
depends on physical reference equality so a semantics preserving port to
SML would require some uglification. 
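
For readers unfamiliar with the idea of item (4), a substring can be sketched
as a view onto a base string. This is only an illustration in the spirit of
the SML Basis Substring structure; it is neither the port nor the
subsequence-reference module mentioned above, and every name in it is
hypothetical.

```ocaml
(* Hypothetical substring view: base string, offset, length. Taking a
   sub-view allocates a small record but never copies characters. *)
type substring = { base : string; off : int; len : int }

let of_string (s : string) : substring =
  { base = s; off = 0; len = String.length s }

(* sub ss i n is the view of n characters starting at index i of ss. *)
let sub (ss : substring) (i : int) (n : int) : substring =
  if i < 0 || n < 0 || i + n > ss.len then invalid_arg "sub";
  { ss with off = ss.off + i; len = n }

(* Characters are copied only when a real string is requested. *)
let to_string (ss : substring) : string =
  String.sub ss.base ss.off ss.len
```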

So, I think we could use a richer set of string datatypes, and operations 
over them. It's not clear to me how much of this needs to be part of OCaml 
proper, and how much should just be, say, part of the CDK. It is clear that 
if there is going to be built-in regexp matching that Str is not the way to go. 
 
> Maybe we could even hope a regexp pattern matching as a syntax extension :-)

Some version of Haskell had a regexp matcher built in that worked on regexps over 
other types than characters. I don't think it survived, but it's certainly
a cool idea.

-- Brian



* [Caml-list] standard regex package
  2001-08-23 17:31       ` [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc Brian Rogoff
@ 2001-08-23 18:08         ` Miles Egan
  2001-08-23 19:28           ` Brian Rogoff
  2001-08-24  9:13           ` Xavier Leroy
  0 siblings, 2 replies; 40+ messages in thread
From: Miles Egan @ 2001-08-23 18:08 UTC (permalink / raw)
  To: caml-list

On Thu, Aug 23, 2001 at 10:31:46AM -0700, Brian Rogoff wrote:
> I agree that Str is suboptimal, but I think that there are also a few
> other ways in which string handling could be improved, like 
> 
> (1) Very long strings (Sys.max_string_length = 16777211 on most
>     machines). Please don't tell me that slurping a 100M file into a 
>     string is probably not smart, I know that, but it's a restriction
>     that annoys some (many?) programmers. 
> 
> (2) Wide character strings
> 
> (3) Functional strings (and functional arrays while we're at it :)
> 
> (4) Substrings

These would all be very nice, I agree, but I also think we need something better
than str and sooner than all these things could be implemented.  Maybe some kind
of transitional scheme would work?

While we're on the subject of beginner usability, it seems to me that if dynamic
loading of c-libraries is still a ways off, it might be nice to build the unix
module into the toplevel at install time.

-- 
miles

"We in the past evade X, where X is something which we believe to be a
lion, through the act of running." - swiftrain@geocities.com

* Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-23 16:25         ` Alain Frisch
@ 2001-08-23 18:14           ` Neale Pickett
  0 siblings, 0 replies; 40+ messages in thread
From: Neale Pickett @ 2001-08-23 18:14 UTC (permalink / raw)
  To: caml-list

Alain Frisch writes:
> On 23 Aug 2001, Neale Pickett wrote:

>> If this is the case, why does Str.matched_group even bother requiring
>> the original string?

> Indeed, you shouldn't.

> The manual says:

> <<
> val matched_string: string -> string

> matched_string s returns the substring of s that was matched by the latest
> string_match, search_forward or search_backward. The user must make sure
> that the parameter s is the same string that was passed to the matching or
> searching function.
> >> 

> Note the "latest".

Yes, there it is; now I feel a little silly.

> The approach you suggest (that the library keeps a reference to the
> last matched string) is acceptable. I guess it was not implemented
> like that because this would prevent garbage collection of the last
> matched string.  (maybe a "release_internal_buffer" function would
> have been better)

Hmm.  I'm beginning to see the dilemma.  I don't suppose there's an
exposed function that will return the memory address of the string,
either, but if there were, that could be used to keep track of what
string was last passed in.
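
A sketch of that idea, under stated assumptions: OCaml's garbage collector
may move values, so a raw address is not a stable key, but physical equality
(==) compares identity directly. The guard below is hypothetical and not part
of Str. It keeps a reference to the subject of the latest match, which is
exactly what prevents that string from being collected, as noted above.

```ocaml
(* Hypothetical guard: remember the subject of the latest match and compare
   by physical equality (==), not by contents. *)
let last_subject : string option ref = ref None

(* To be called wherever a match is run on s. *)
let remember (s : string) : unit = last_subject := Some s

(* To be called before extracting a group from s. *)
let check_same (s : string) : unit =
  match !last_subject with
  | Some s' when s' == s -> ()
  | _ -> invalid_arg "check_same: not the string of the latest match"

let () =
  let a = String.make 3 'x' and b = String.make 3 'x' in
  remember a;
  check_same a;                        (* same physical string: accepted *)
  assert (a = b && not (a == b))       (* equal contents, distinct values *)
```

Calling check_same b after remember a would raise Invalid_argument, even
though a and b have the same contents.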

Maybe the proper solution is to use Markus Mottl's PCRE-OCaml library
:-)

Thanks for your patience.  I understand a lot more about OCaml than I
did yesterday!

Neale

* Re: [Caml-list] standard regex package
  2001-08-23 18:08         ` [Caml-list] standard regex package Miles Egan
@ 2001-08-23 19:28           ` Brian Rogoff
  2001-08-23 19:49             ` Miles Egan
                               ` (3 more replies)
  2001-08-24  9:13           ` Xavier Leroy
  1 sibling, 4 replies; 40+ messages in thread
From: Brian Rogoff @ 2001-08-23 19:28 UTC (permalink / raw)
  To: caml-list

On Thu, 23 Aug 2001, Miles Egan wrote:
[... snip ...]
> These would all be very nice, I agree, but I also think we need something better
> than str and sooner than all these things could be implemented.  Maybe some kind
> of transitional scheme would work?

I agree, from a pragmatic point of view a better regexp matcher would make OCaml 
significantly sexier to all of those poor deluded Python and Perl programmers. 
The other stuff can come later. I think Markus has a very good point about 
some distutils (Python) like facility being even more important. Once such
a framework is in place we can have an OCaml CPAN. Last time I looked findlib 
ran only on Unix, which is a big problem. 

On the subject of "social tools", the program Neel is looking for is this one 

http://www.lri.fr/~filliatr/ocamlweb/index.en.html

and Hevea of course. There is also OCamlDoc, which seems quite nice too,
but ocamlweb is used in a few more libraries I think. 

> While we're on the subject of beginner usability, it seems to me that if dynamic
> loading of c-libraries is still a ways off, it might be nice to build the unix
> module into the toplevel at install time.

A better approach might be to ape Python and the SML Basis Library by providing a
generic "OS" module which abstracts at least Unix/Win/Mac away. I would
prefer this, since I feel silly using Unix.<blah> on a Windows box :-).

-- Brian



* Re: [Caml-list] standard regex package
  2001-08-23 19:28           ` Brian Rogoff
@ 2001-08-23 19:49             ` Miles Egan
  2001-08-23 19:51             ` Gerd Stolpmann
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 40+ messages in thread
From: Miles Egan @ 2001-08-23 19:49 UTC (permalink / raw)
  To: Brian Rogoff; +Cc: caml-list

On Thu, Aug 23, 2001 at 12:28:49PM -0700, Brian Rogoff wrote:
> The other stuff can come later. I think Markus has a very good point about 
> some distutils (Python) like facility being even more important. Once such
> a framework is in place we can have an OCaml CPAN. Last time I looked findlib 
> ran only on Unix, which is a big problem. 

I agree that a CPAN equivalent is more pressing, but I don't think regex should
be an add-on.  It's too fundamental, at least to the people coming from
'scripting' languages.

> > While we're on the subject of beginner usability, it seems to me that if dynamic
> > loading of c-libraries is still a ways off, it might be nice to build the unix
> > module into the toplevel at install time.
> 
> A better approach might be to ape Python and the SML Basis Library by providing a
> generic "OS" module which abstracts at least Unix/Win/Mac away. I would
> prefer this, since I feel silly using Unix.<blah> on a Windows box :-).

Yeah, "Unix" seems like a bit of an anachronism.

-- 
miles

"We in the past evade X, where X is something which we believe to be a
lion, through the act of running." - swiftrain@geocities.com


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Caml-list] standard regex package
  2001-08-23 19:28           ` Brian Rogoff
  2001-08-23 19:49             ` Miles Egan
@ 2001-08-23 19:51             ` Gerd Stolpmann
  2001-08-23 21:12               ` Brian Rogoff
  2001-08-23 21:27               ` Benjamin C. Pierce
  2001-08-23 21:06             ` RE : " Lionel Fourquaux
  2001-08-27 15:16             ` [Caml-list] standard regex package Ian Zimmerman
  3 siblings, 2 replies; 40+ messages in thread
From: Gerd Stolpmann @ 2001-08-23 19:51 UTC (permalink / raw)
  To: Brian Rogoff, caml-list

On Thu, 23 Aug 2001, Brian Rogoff wrote:
>On Thu, 23 Aug 2001, Miles Egan wrote:
>[... snip ...]
>> These would all be very nice, I agree, but I also think we need something better
>> than str and sooner than all these things could be implemented.  Maybe some kind
>> of transitional scheme would work?
>
>I agree, from a pragmatic point of view a better regexp matcher would make OCaml 
>significantly sexier to all of those poor deluded Python and Perl programmers. 

Agreed, too.

>The other stuff can come later. I think Markus has a very good point about 
>some distutils (Python) like facility being even more important. Once such
>a framework is in place we can have an OCaml CPAN. Last time I looked findlib 
>ran only on Unix, which is a big problem. 

When I developed findlib I had something like CPAN in mind. I started it when I
downloaded several 3rd party packages, and all had different installation
routines that I had to modify for my purposes. For those who don't know: You can
install almost every 3rd party Perl package by simply doing

perl Makefile.PL
make
make test
make install

It is simple to do, and that's an important aspect of the success of Perl (a
language which is nothing without CPAN).

I hope we will have CCAN. Of course, one precondition is a standard package
structure, and I can imagine the findlib tool could be an important part of it
(for a description see 
http://test.ocaml-programming.de/packages/documentation/findlib/).

Currently, findlib runs only on Unix (including cygwin), but this is mainly
because I have no native Windows installation on which I could test it.
In particular, I have removed all shell scripts (it is now written purely in
OCaml), so it is only a question of fixing details. (And of writing a
replacement for "configure".)

>On the subject of "social tools", the program Neel is looking for is this one 
>
>http://www.lri.fr/~filliatr/ocamlweb/index.en.html
>
>and Hevea of course. There is also OCamlDoc, which seems quite nice too,
>but ocamlweb is used in a few more libraries I think. 
>
>> While we're on the subject of beginner usability, it seems to me that if dynamic
>> loading of c-libraries is still a ways off, it might be nice to build the unix
>> module into the toplevel at install time.
>
>A better approach might be to ape Python and the SML Basis Library by providing a
>generic "OS" module which abstracts at least Unix/Win/Mac away. I would
>prefer this, since I feel silly using Unix.<blah> on a Windows box :-).

"Unix" is only a name for an API (and not for a technology or a family of OSs),
and it is clearly MS's fault not to be Unix-compliant (other operating systems
originating in different worlds are, e.g. MVS includes a Unix API). But that's
politics... (MS had a Posix mode in NT some time ago but nobody used it, so I
think there are no real technical reasons.)

Being more abstract also has disadvantages because you don't know which system
calls are made for one abstract call. (Ironically, Perl, _the_ language used by
system hackers, tried to abstract from the system calls. And Perl shows
clearly that this way is wrong. For example, Perl's flock is one of flock,
fcntl, or lockf, resulting in total confusion about which semantics the program
can expect.)

So I think having a Unix module with Unix semantics is right.

Gerd
-- 
----------------------------------------------------------------------------
Gerd Stolpmann      Telefon: +49 6151 997705 (privat)
Viktoriastr. 45             
64293 Darmstadt     EMail:   gerd@gerd-stolpmann.de
Germany                     
----------------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Caml-list] Re: [OFF-LIST] Str.string_match raising Invalid_argument "String.sub" in gc
       [not found]       ` <w5366be7fd0.fsf_-_@woozle.org>
@ 2001-08-23 20:01         ` Markus Mottl
  2001-08-23 20:31           ` Patrick M Doane
  0 siblings, 1 reply; 40+ messages in thread
From: Markus Mottl @ 2001-08-23 20:01 UTC (permalink / raw)
  To: Neale Pickett; +Cc: OCAML

On Thu, 23 Aug 2001, Neale Pickett wrote:
> Markus Mottl writes:
> 
> >   ./mytop -I /usr/lib/ocaml/contrib
> 
> AHA!  I wonder why it doesn't look here by default.  Anyway, I now have
> a shell script wrapper which does the job nicely.

Hm, this could be a feature request: allow adding default include paths to
generated toplevels. This would surely save many people some headaches.
Interface files usually don't move after toplevel-generation anyway. Why
not just include the directories specified during toplevel generation
by default?

What do the developers think? Adding this feature should hardly cost
any time.

Regards,
Markus Mottl

-- 
Markus Mottl                                             markus@oefai.at
Austrian Research Institute
for Artificial Intelligence                  http://www.oefai.at/~markus


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Caml-list] Re: [OFF-LIST] Str.string_match raising Invalid_argument "String.sub" in gc
  2001-08-23 20:01         ` [Caml-list] Re: [OFF-LIST] Str.string_match raising Invalid_argument "String.sub" in gc Markus Mottl
@ 2001-08-23 20:31           ` Patrick M Doane
  0 siblings, 0 replies; 40+ messages in thread
From: Patrick M Doane @ 2001-08-23 20:31 UTC (permalink / raw)
  To: Markus Mottl; +Cc: Neale Pickett, OCAML

This functionality was discussed earlier this year under the
thread 'ocamlmktop and includes'.  I believe that Chris Hecker posted an
updated version of ocamlmktop that added this functionality.

Patrick

On Thu, 23 Aug 2001, Markus Mottl wrote:

> On Thu, 23 Aug 2001, Neale Pickett wrote:
> > Markus Mottl writes:
> > 
> > >   ./mytop -I /usr/lib/ocaml/contrib
> > 
> > AHA!  I wonder why it doesn't look here by default.  Anyway, I now have
> > a shell script wrapper which does the job nicely.
> 
> Hm, this could be a feature request: allow adding default include paths to
> generated toplevels. This would surely save many people some headaches.
> Interface files usually don't move after toplevel-generation anyway. Why
> not just include the directories specified during toplevel generation
> by default?
> 
> What do the developers think? Adding this feature should hardly cost
> any time.
> 
> Regards,
> Markus Mottl
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE : [Caml-list] standard regex package
  2001-08-23 19:28           ` Brian Rogoff
  2001-08-23 19:49             ` Miles Egan
  2001-08-23 19:51             ` Gerd Stolpmann
@ 2001-08-23 21:06             ` Lionel Fourquaux
  2001-08-24  9:23               ` [Caml-list] dynamic loading and OS interface Xavier Leroy
  2001-08-27 15:16             ` [Caml-list] standard regex package Ian Zimmerman
  3 siblings, 1 reply; 40+ messages in thread
From: Lionel Fourquaux @ 2001-08-23 21:06 UTC (permalink / raw)
  To: caml-list

> -----Original Message-----
> From: owner-caml-list@pauillac.inria.fr
> [mailto:owner-caml-list@pauillac.inria.fr] On Behalf Of Brian Rogoff
> Sent: Thursday, August 23, 2001 9:29 PM
> To: caml-list@inria.fr
> Subject: Re: [Caml-list] standard regex package
>
> On Thu, 23 Aug 2001, Miles Egan wrote:
> [... snip ...]
> > These would all be very nice, I agree, but I also think we need something better
> > than str and sooner than all these things could be implemented.  Maybe some kind
> > of transitional scheme would work?
>
> I agree, from a pragmatic point of view a better regexp matcher would make OCaml
> significantly sexier to all of those poor deluded Python and Perl programmers.
> The other stuff can come later. I think Markus has a very good point about
> some distutils (Python) like facility being even more important. Once such
> a framework is in place we can have an OCaml CPAN. Last time I looked findlib
> ran only on Unix, which is a big problem.
>
> On the subject of "social tools", the program Neel is looking for is this one
>
> http://www.lri.fr/~filliatr/ocamlweb/index.en.html
>
> and Hevea of course. There is also OCamlDoc, which seems quite nice too,
> but ocamlweb is used in a few more libraries I think.
>
> > While we're on the subject of beginner usability, it seems to me that if dynamic
> > loading of c-libraries is still a ways off, it might be nice to build the unix
> > module into the toplevel at install time.

What is the current status of dynamic loading? I've seen a lot
of implementations for Unix, and it shouldn't be too difficult to
port them to Win32 (I tried it once), although it requires some design
choices. Are there plans to add it in the near future?


>
> A better approach might be to ape Python and the SML Basis Library by providing a
> generic "OS" module which abstracts at least Unix/Win/Mac away. I would
> prefer this, since I feel silly using Unix.<blah> on a Windows box :-).

I'd like to point out that this would make it possible to add lots of
functionality without too much trouble. For example, user ids work
in the Unix implementation, but have no real meaning in the Win32 one.
Turn them into an abstract type and you can simply use Win32 SIDs!
(Of course, you have to provide some more functions to compare them,
...)
Another instance is that of inodes. It'd be possible to use the Win32
file index, but it's a 64-bit value and really doesn't fit in the
int type.
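
To make the abstract-type idea concrete, here is a minimal sketch. The
names USER_ID, Unix_uid, Win32_sid and Same_user are made up for
illustration and are not part of the Unix module or any existing library;
the SID check is deliberately naive (it only looks for the "S-" prefix).

```ocaml
(* Hypothetical signature: the representation of a user id is hidden. *)
module type USER_ID = sig
  type t
  val of_string : string -> t option
  val to_string : t -> string
  val equal : t -> t -> bool
  val compare : t -> t -> int
end

(* Unix-style implementation: a user id is a non-negative integer. *)
module Unix_uid : USER_ID = struct
  type t = int
  let of_string s =
    match int_of_string_opt s with
    | Some n when n >= 0 -> Some n
    | _ -> None
  let to_string = string_of_int
  let equal = Int.equal
  let compare = Int.compare
end

(* Win32-style implementation: a SID kept in textual form, e.g. "S-1-5-18". *)
module Win32_sid : USER_ID = struct
  type t = string
  let of_string s =
    if String.length s > 2 && String.sub s 0 2 = "S-" then Some s else None
  let to_string s = s
  let equal = String.equal
  let compare = String.compare
end

(* Client code is written once, against the signature only. *)
module Same_user (U : USER_ID) = struct
  let same a b =
    match U.of_string a, U.of_string b with
    | Some x, Some y -> U.equal x y
    | _ -> false
end
```

Because t is abstract, client code cannot rely on user ids being ints, so
swapping one implementation for the other does not break it. The same
approach would presumably work for a file index represented as an int64.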

>
> -- Brian
>
>




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Caml-list] standard regex package
  2001-08-23 19:51             ` Gerd Stolpmann
@ 2001-08-23 21:12               ` Brian Rogoff
  2001-08-23 21:27               ` Benjamin C. Pierce
  1 sibling, 0 replies; 40+ messages in thread
From: Brian Rogoff @ 2001-08-23 21:12 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: caml-list

On Thu, 23 Aug 2001, Gerd Stolpmann wrote:
> On Thu, 23 Aug 2001, Brian Rogoff wrote:
> >The other stuff can come later. I think Markus has a very good point about 
> >some distutils (Python) like facility being even more important. Once such
> >a framework is in place we can have an OCaml CPAN. Last time I looked findlib 
> >ran only on Unix, which is a big problem. 
> 
> When I developed findlib I had something like CPAN in mind. I started it when I
> downloaded several 3rd party packages, and all had a different installation
> routines I had to modify for my purposes. For those who don't know: You can
> install almost every 3rd party Perl package by simply doing
> 
> perl Makefile.PL
> make
> make test
> make install
> 
> It is simple to do, and that's an important aspect of the success of Perl (a
> language which is nothing without CPAN).

Right, and our advantage is that the language is something without CCAN
(COAN? KOAN? :) but will be much more with it. 

> I hope we will have CCAN. Of course, one precondition is a standard package
> structure, and I can imagine the findlib tool could be an important part of it
> (for a description see 
> http://test.ocaml-programming.de/packages/documentation/findlib/).
> 
> Currently, findlib runs only on Unix (including cygwin), but this is mainly
> because I have no native Windows installation on which I could test it.

Maybe the Consortium should just get you a few Windows machines? I'm
totally serious about that; once this hole is filled we can seriously talk 
about world domination. Err, benevolent world domination, of course. 

> Especially, I have removed all shell scripts (it is now purely programmed in
> OCaml), and it is only a question of fixing details. (And of writing a
> replacement for "configure".)
> 
[...snip...]

> >A better approach might be to ape Python and the SML Basis Library by providing a
> >generic "OS" module which abstracts at least Unix/Win/Mac away. I would
> >prefer this, since I feel silly using Unix.<blah> on a Windows box :-).
>
> "Unix" is only a name for an API (and not for a technology or a family of OSs),

Really that API is "Posix", right? Yeah, I know, a small matter of spelling.

> and it is clearly MS's fault not to be Unix-compliant (other operating systems
> originating in different worlds are, e.g. MVS includes a Unix API). But that's
> politics... (MS had some times ago a Posix mode in NT but nobody used it, so I
> think there are no real technical reasons.)

My memory is that NT's Posix mode was not truly Posix compliant, but it's
been a while and I could be wrong. 

> Being more abstract has also disadvantages because you don't know which system
> calls are done for one abstract call. 

I was thinking of "OS" as providing a high level view of a fairly generic
OS with a hierarchical file system. I agree that a Posix module with Posix
semantics is also required, as is a WindowsNT/2000/XP module (or whatever
they decide to call it). Or to sound hip, a thick binding to generic OS
features and numerous thin bindings which are mostly just stubs for the
system calls.

-- Brian




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Caml-list] standard regex package
  2001-08-23 19:51             ` Gerd Stolpmann
  2001-08-23 21:12               ` Brian Rogoff
@ 2001-08-23 21:27               ` Benjamin C. Pierce
  2001-08-23 21:49                 ` Gerd Stolpmann
  1 sibling, 1 reply; 40+ messages in thread
From: Benjamin C. Pierce @ 2001-08-23 21:27 UTC (permalink / raw)
  To: info; +Cc: Brian Rogoff, caml-list

> For those who don't know: You can
> install almost every 3rd party Perl package by simply doing
> 
> perl Makefile.PL
> make
> make test
> make install
> 
> It is simple to do, and that's an important aspect of the success of Perl (a
> language which is nothing without CPAN).

Having just spent 90 minutes last weekend trying to get a PERL package
installed and working, I can say with confidence that PERL's standard
installation procedure, while slick, leaves one big thing to be desired:
following dependencies.  The problem with the "CPAN way" is that it leads
to 10,000 people writing cool little packages, all of which depend on ten
other cool little packages written by somebody else, etc., etc.
Following all these dependency chains manually by trying to install one
package, failing, grepping around in CPAN for the ones it depends on,
downloading them, trying to install, failing, ... is a pretty boring way
to spend a morning.

I really wish that I'd been able to say to some tool, "I want to use
module X; please go off to CPAN and find, download, and install me the
current versions of X and all the modules it transitively depends on."  I
know that it would be reallyreally hard to design a framework that would
always do the right thing, but if it did the right thing 99% of the time
and gave me a type error in 99% of the cases where it did not do the
right thing, it would be fantastic (and I believe both of these numbers
would be rather easy to achieve by low-tech means).
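
A low-tech version of "find X and everything it transitively depends on"
can be sketched as a depth-first walk over a dependency table. The
package index below and the names install_order, Unknown_package and
Cyclic_dependency are invented for illustration; this is not how CPAN or
findlib actually work, and it ignores versions entirely.

```ocaml
(* Hypothetical package index: name -> direct dependencies. *)
let index = [
  "x", ["y"; "z"];
  "y", ["z"];
  "z", [];
]

exception Unknown_package of string
exception Cyclic_dependency of string

(* Return [pkg] and everything it transitively depends on, ordered so that
   every package appears after all of its dependencies. *)
let install_order index pkg =
  (* [done_] holds finished packages, most recent first;
     [in_progress] holds the packages on the current path (cycle check). *)
  let rec visit (done_, in_progress) p =
    if List.mem p done_ then done_
    else if List.mem p in_progress then raise (Cyclic_dependency p)
    else
      let deps =
        match List.assoc_opt p index with
        | Some ds -> ds
        | None -> raise (Unknown_package p)
      in
      let done_ =
        List.fold_left (fun acc d -> visit (acc, p :: in_progress) d) done_ deps
      in
      p :: done_
  in
  List.rev (visit ([], []) pkg)
```

With this table, install_order index "x" yields ["z"; "y"; "x"], i.e. the
order in which the packages would have to be built. An unknown name or a
dependency cycle raises an exception instead of silently doing the wrong
thing.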

    Benjamin



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Caml-list] standard regex package
  2001-08-23 21:27               ` Benjamin C. Pierce
@ 2001-08-23 21:49                 ` Gerd Stolpmann
  2001-08-23 22:11                   ` Miles Egan
  2001-08-24  9:23                   ` [Caml-list] standard regex package Sven
  0 siblings, 2 replies; 40+ messages in thread
From: Gerd Stolpmann @ 2001-08-23 21:49 UTC (permalink / raw)
  To: bcpierce, Benjamin C. Pierce; +Cc: Brian Rogoff, caml-list

On Thu, 23 Aug 2001, Benjamin C. Pierce wrote:
>> For those who don't know: You can
>> install almost every 3rd party Perl package by simply doing
>> 
>> perl Makefile.PL
>> make
>> make test
>> make install
>> 
>> It is simple to do, and that's an important aspect of the success of Perl (a
>> language which is nothing without CPAN).
>
>Having just spent 90 minutes last weekend trying to get a PERL package
>installed and working, I can say with confidence that PERL's standard
>installation procedure, while slick, leaves one big thing to be desired:
>following dependencies.  The problem with the "CPAN way" is that it leads
>to 10,000 people writing cool little packages, all of which depend on ten
>other cool little packages written by somebody else, etc., etc.
>Following all these dependency chains manually by trying to install one
>package, failing, grepping around in CPAN for the ones it depends on,
>downloading them, trying to install, failing, ... is a pretty boring way
>to spend a morning.
>
>I really wish that I'd been able to say to some tool, "I want to use
>module X; please go off to CPAN and find, download, and install me the
>current versions of X and all the modules it transitively depends on."  I
>know that it would be reallyreally hard to design a framework that would
>always do the right thing, but if it did the right thing 99% of the time
>and gave me a type error in 99% of the cases where it did not do the
>right thing, it would be fantastic (and I believe both of these numbers
>would be rather easy to achieve by low-tech means).

The CPAN module does it; see perldoc CPAN.

For O'Caml the corresponding functionality would be much harder because
of possible version conflicts. Say you have packages A and B already installed,
and A depends on B. Furthermore, you want to install package C, but package C
requires the new version B'. For Perl, this is less dramatic than it seems to
be because everything is dynamic, and the packages can themselves cope with
version conflicts. For Caml, a clever automatic installer must decide:

- Does the already installed package A work with the new version B'? If yes, 
  is it necessary to recompile package A?

- If package A must be renewed, too: Is there a version A' that works with B'?

- If there are other packages that depend on A: Is it necessary to
  recompile/relink them?

Because these questions are difficult, findlib does not include versioned
dependencies (but it includes versions as such, and dependencies as such). So
it still requires an intelligent operator who helps find the right versions.

One feature that should be added is automatic recompilation of dependent
packages (from known sources).
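
As a rough illustration of the last point, finding the packages that
might need recompiling after B changes amounts to walking the dependency
table backwards. The table and the names direct_dependents and
to_recompile are hypothetical; this is a sketch, not findlib code, and it
does not decide whether a rebuild is really necessary.

```ocaml
(* Hypothetical installed set: package -> direct dependencies. *)
let installed = [
  "A", ["B"];
  "B", [];
  "D", ["A"];
  "E", [];
]

(* Packages that directly depend on [pkg]. *)
let direct_dependents installed pkg =
  List.filter_map
    (fun (name, deps) -> if List.mem pkg deps then Some name else None)
    installed

(* All packages that transitively depend on [pkg], in discovery order.
   These are the candidates for recompilation when [pkg] is replaced. *)
let to_recompile installed pkg =
  let rec go seen = function
    | [] -> List.rev seen
    | p :: rest ->
      if List.mem p seen then go seen rest
      else go (p :: seen) (rest @ direct_dependents installed p)
  in
  go [] (direct_dependents installed pkg)
```

Here to_recompile installed "B" gives ["A"; "D"]: A depends on B, and D
depends on A. The result is only a candidate set in discovery order; a
real tool would still have to sort it into a valid build order and check
that compatible versions exist.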

-- 
----------------------------------------------------------------------------
Gerd Stolpmann      Telefon: +49 6151 997705 (privat)
Viktoriastr. 45             
64293 Darmstadt     EMail:   gerd@gerd-stolpmann.de
Germany                     
----------------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Caml-list] standard regex package
  2001-08-23 21:49                 ` Gerd Stolpmann
@ 2001-08-23 22:11                   ` Miles Egan
  2001-08-23 23:55                     ` Gerd Stolpmann
  2001-08-24  9:23                   ` [Caml-list] standard regex package Sven
  1 sibling, 1 reply; 40+ messages in thread
From: Miles Egan @ 2001-08-23 22:11 UTC (permalink / raw)
  To: caml-list

On Thu, Aug 23, 2001 at 11:49:50PM +0200, Gerd Stolpmann wrote:
> Because these questions are difficult, findlib does not include versioned
> dependencies (but it includes versions as such, and dependencies as such). So
> it requires still an intelligent operator that helps finding the right versions.

One danger in developing such a system is that you'll wind up duplicating the
rather extensive functionality of existing package management systems.  RPM and
DEB both handle these kinds of dependencies and are fairly complex systems for
it.  CPAN has its shortcomings, but it also works surprisingly well most of the
time.  I think you should at least consider taking a "worse-is-better" approach
and build something that works and leave the delicate dependency management to
the distribution packagers.

-- 
miles

"We in the past evade X, where X is something which we believe to be a
lion, through the act of running." - swiftrain@geocities.com


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Caml-list] standard regex package
  2001-08-23 22:11                   ` Miles Egan
@ 2001-08-23 23:55                     ` Gerd Stolpmann
  2001-08-24  9:03                       ` Claudio Sacerdoti Coen
  2001-08-24  9:26                       ` Sven
  0 siblings, 2 replies; 40+ messages in thread
From: Gerd Stolpmann @ 2001-08-23 23:55 UTC (permalink / raw)
  To: Miles Egan, caml-list

On Fri, 24 Aug 2001, Miles Egan wrote:
>On Thu, Aug 23, 2001 at 11:49:50PM +0200, Gerd Stolpmann wrote:
>> Because these questions are difficult, findlib does not include versioned
>> dependencies (but it includes versions as such, and dependencies as such). So
>> it requires still an intelligent operator that helps finding the right versions.
>
>One danger in developing such a system is that you'll wind up duplicating the
>rather extensive functionality of existing package management systems.  RPM and
>DEB both handle these kinds of dependencies and are fairly complex systems for
>it.  CPAN has its shortcomings, but it also works surprisingly well most of the
>time.  I think you should at least consider taking a "worse-is-better" approach
>and build something that works and leave the delicate dependency management to
>the distribution packagers.

As far as I know, RPM/DEB focus on binary installations, and source packages
exist only to conveniently make binary packages. This means: Someone has
already reviewed the package and decided which versions to take.

For a CPAN-like system, the primary goal is to simplify installations from
source. This is far more complicated because typically nobody has checked which
package versions (reliably) work together. As pointed out in my last mail,
there is no definite algorithm, and at most it would be possible to find some
heuristics working in many cases.

But this doesn't mean that CPAN isn't possible for Caml. It only means: Don't
begin with a fully automatic installation routine, and lower your goals. It
would already be great if we had a standard package format and a standard
procedure to install packages, even if we had to call it manually (at the
beginning).

Gerd
-- 
----------------------------------------------------------------------------
Gerd Stolpmann      Telefon: +49 6151 997705 (privat)
Viktoriastr. 45             
64293 Darmstadt     EMail:   gerd@gerd-stolpmann.de
Germany                     
----------------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Caml-list] standard regex package
  2001-08-23 23:55                     ` Gerd Stolpmann
@ 2001-08-24  9:03                       ` Claudio Sacerdoti Coen
  2001-08-24  9:26                       ` Sven
  1 sibling, 0 replies; 40+ messages in thread
From: Claudio Sacerdoti Coen @ 2001-08-24  9:03 UTC (permalink / raw)
  To: caml-list

 By the way, someone mentioned the debian effort to package OCaml stuff.
 Well, Gerd has recently modified findlib in such a way that it is now
 almost trivial to make a debian package of a findlib-ed one. (And there
 are already many of them available.) I think that the same thing is
 true also for other packaging systems (e.g. RPM).

 Hence, if findlib was widely adopted, there would be no conflict at
 all between it and the debian-ized stuff.

 					Cheers,
					 C.S.C.

-- 
----------------------------------------------------------------
Real name: Claudio Sacerdoti Coen
PhD Student in Computer Science at University of Bologna
E-mail: sacerdot@cs.unibo.it
http://caristudenti.cs.unibo.it/~sacerdot
----------------------------------------------------------------


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Caml-list] standard regex package
  2001-08-23 18:08         ` [Caml-list] standard regex package Miles Egan
  2001-08-23 19:28           ` Brian Rogoff
@ 2001-08-24  9:13           ` Xavier Leroy
  2001-08-24 10:16             ` Markus Mottl
  2001-08-24 16:49             ` Miles Egan
  1 sibling, 2 replies; 40+ messages in thread
From: Xavier Leroy @ 2001-08-24  9:13 UTC (permalink / raw)
  To: Miles Egan; +Cc: caml-list

We all agree that the Str regexp library has problems, both on the
interface side (too much global state) and on the implementation side
(fails mysteriously on complex regexps).

The last time this topic came up on this list, I said that we aren't
opposed to putting PCRE in the OCaml distribution (provided Markus agrees
with that, of course).  BUT: in the name of backward compatibility, we
must have an Str-compatible interface to this library (same functions
and same regexp syntax as in Str), in addition to the native PCRE
interface.  I think it can be done, but the replies I got to this
request were of the form "I don't have time to do this".

Also: the PCRE interface is quite heavyweight, with a zillion options
whose purposes are not always clear to me.  This can be a bit frightening
and will need a lot of carefully worded documentation to explain that
most of these options are useless 99% of the time :-)  This is not a
criticism of Markus' work, more a criticism of Perl's
and PCRE's "creeping featuritism" syndrome.

So: any taker for writing this Str-compatibility layer?

- Xavier Leroy


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Caml-list] dynamic loading and OS interface
  2001-08-23 21:06             ` RE : " Lionel Fourquaux
@ 2001-08-24  9:23               ` Xavier Leroy
  0 siblings, 0 replies; 40+ messages in thread
From: Xavier Leroy @ 2001-08-24  9:23 UTC (permalink / raw)
  To: Lionel Fourquaux; +Cc: caml-list

> What is the current status of dynamic loading? I've seen a lot
> of implementations for Unix, and it shouldn't be too difficult to
> port them to Win32 (I tried it once), although it requires some design
> choices. Are there plans to add it in the near future?

I'm working on it right now (when not replying to the numerous
messages on this list :-).  If you're curious, you can look at the
"dynamic_loading" branch on the OCaml development sources
(camlcvs.inria.fr).  I'll post more information about it once I'm
happy with the result, to get some feedback on the design.

> > A better approach might be to ape Python and the SML Basis Library by
> > providing a generic "OS" module which abstracts at least Unix/Win/Mac
> > away. I would prefer this, since I feel silly using Unix.<blah> on a
> > Windows box :-).

In a way, that's how the OCaml library is structured already:
the OS-independent stuff is in Pervasives and Sys, and corresponds
roughly to what ANSI C offers; Unix is the OS-dependent stuff,
corresponding roughly to Posix + BSD sockets.

But of course users want all functions to be available in an
OS-independent manner, for easy porting of their programs, and that's
how we ended up with an emulation of the Unix module under Windows.  
Still, there are limits to what an OS-independent interface can
provide if you want to remain truly portable.
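
For what it's worth, here is a small sketch of what staying inside the
OS-independent part of the library looks like. It uses only Sys and
Filename; the function names describe_platform and config_path and the
file name "app.conf" are made up for the example.

```ocaml
(* Uses only OS-independent standard-library modules (Sys, Filename). *)

(* Sys.os_type is "Unix", "Win32" or "Cygwin". *)
let describe_platform () =
  match Sys.os_type with
  | "Unix" -> "Unix-like"
  | "Win32" -> "native Windows"
  | "Cygwin" -> "Cygwin"
  | other -> other

(* Build a path with the platform's separator instead of hard-coding "/". *)
let config_path dir = Filename.concat dir "app.conf"

let () =
  Printf.printf "running on %s\n" (describe_platform ());
  let p = config_path (Filename.get_temp_dir_name ()) in
  Printf.printf "%s exists: %b\n" p (Sys.file_exists p)
```

Anything beyond this level (permissions, user ids, sockets, locking)
is where the Unix module, or its Windows emulation, has to come in.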

- Xavier Leroy


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Caml-list] standard regex package
  2001-08-23 21:49                 ` Gerd Stolpmann
  2001-08-23 22:11                   ` Miles Egan
@ 2001-08-24  9:23                   ` Sven
  2001-08-27 15:54                     ` Ian Zimmerman
  1 sibling, 1 reply; 40+ messages in thread
From: Sven @ 2001-08-24  9:23 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: bcpierce, Benjamin C. Pierce, Brian Rogoff, caml-list

On Thu, Aug 23, 2001 at 11:49:50PM +0200, Gerd Stolpmann wrote:
> On Thu, 23 Aug 2001, Benjamin C. Pierce wrote:
> >> For those who don't know: You can
> >> install almost every 3rd party Perl package by simply doing
> >> 
> >> perl Makefile.pl
> >> make
> >> make test
> >> make install
> >> 
> >> It is simple to do, and that's an important aspect of the success of Perl (a
> >> language which is nothing without CPAN).
> >
> >Having just spent 90 minutes last weekend trying to get a PERL package
> >installed and working, I can say with confidence that PERL's standard
> >installation procedure, while slick, leaves one big thing to be desired:
> >following dependencies.  The problem with the "CPAN way" is that it leads
> >to 10,000 people writing cool little packages, all of which depend on ten
> >other cool little packages written by somebody else, etc., etc.
> >Following all these dependency chains manually by trying to install one
> >package, failing, grepping around in CPAN for the ones it depends on,
> >downloading them, trying to install, failing, ... is a pretty boring way
> >to spend a morning.
> >
> >I really wish that I'd been able to say to some tool, "I want to use
> >module X; please go off to CPAN and find, download, and install me the
> >current versions of X and all the modules it transitively depends on."  I
> >know that it would be reallyreally hard to design a framework that would
> >always do the right thing, but if it did the right thing 99% of the time
> >and gave me a type error in 99% of the cases where it did not do the
> >right thing, it would be fantastic (and I believe both of these numbers
> >would be rather easy to achieve by low-tech means).
> 
> The CPAN module does it; see perldoc CPAN.
> 
> For O'Caml the corresponding functionality would be much harder because
> of possible version conflicts. Say you have packages A and B already installed,
> and A depends on B. Furthermore, you want to install package C, but package C
> requires the new version B'. For Perl, this is less dramatic than it seems to
> be because everything is dynamic, and the packages can themselves cope with
> version conflicts. For Caml, a clever automatic installer must decide:
> 
> - Does the already installed package A work with the new version B'? If yes, 
>   is it necessary to recompile package A?
> 
> - If package A must be renewed, too: Is there a version A' that works with B'?
> 
> - If there are other packages that depend on A: Is it necessary to
>   recompile/relink them?
> 
> Because these questions are difficult, findlib does not include versioned
> dependencies (but it includes versions as such, and dependencies as such). So
> it still requires an intelligent operator who helps find the right versions.
> 
> One feature that should be added is automatic recompilation of dependent
> packages (from known sources).

Also, a way to export the dependencies in a string format, and to turn all
this stuff off in case your OS/distribution/whatever already handles it,
would be nice too.
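
To make the two wishes above concrete (recompiling dependent packages, and exporting dependencies as text), here is a small sketch. The record type package and the functions export_deps and dependents are invented for illustration and are not findlib's data model or API; versions are carried along but deliberately not compared, since versioned dependencies are exactly the hard part discussed above.

```ocaml
(* Hypothetical package description; not findlib's actual data model. *)
type package = {
  name : string;
  version : string;
  requires : string list;  (* names of the packages this one depends on *)
}

(* Export one package's dependencies as a single line of text,
   e.g. as a starting point for a distribution's control file. *)
let export_deps p =
  Printf.sprintf "%s (%s): %s" p.name p.version (String.concat ", " p.requires)

(* Names of all installed packages that depend, directly or transitively,
   on [changed]; these are the candidates for recompilation. *)
let dependents installed changed =
  let rec grow found =
    let is_new p =
      not (List.mem p.name found)
      && List.exists (fun r -> r = changed || List.mem r found) p.requires
    in
    match List.filter is_new installed with
    | [] -> List.rev found
    | more -> grow (List.rev_append (List.map (fun p -> p.name) more) found)
  in
  grow []

let () =
  let installed =
    [ { name = "a"; version = "1.0"; requires = [ "b" ] };
      { name = "d"; version = "0.3"; requires = [ "a" ] };
      { name = "b"; version = "2.1"; requires = [] } ]
  in
  List.iter (fun p -> print_endline (export_deps p)) installed;
  (* Replacing "b" by a new version means revisiting these packages. *)
  List.iter print_endline (dependents installed "b")
```

The search repeats until no new dependent is found, so indirect dependents (here "d", which only requires "a") are picked up as well.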

Friendly,

Sven Luther

* Re: [Caml-list] standard regex package
  2001-08-23 23:55                     ` Gerd Stolpmann
  2001-08-24  9:03                       ` Claudio Sacerdoti Coen
@ 2001-08-24  9:26                       ` Sven
  2001-08-27 15:46                         ` [Caml-list] Package dependencies [Was: standard regex package] Ian Zimmerman
  1 sibling, 1 reply; 40+ messages in thread
From: Sven @ 2001-08-24  9:26 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Miles Egan, caml-list

On Fri, Aug 24, 2001 at 01:55:25AM +0200, Gerd Stolpmann wrote:
> On Fri, 24 Aug 2001, Miles Egan wrote:
> >On Thu, Aug 23, 2001 at 11:49:50PM +0200, Gerd Stolpmann wrote:
> >> Because these questions are difficult, findlib does not include versioned
> >> dependencies (but it includes versions as such, and dependencies as such). So
> >> it still requires an intelligent operator who helps find the right versions.
> >
> >One danger in developing such a system is that you'll wind up duplicating the
> >rather extensive functionality of existing package management systems.  RPM and
> >DEB both handle these kinds of dependencies and are fairly complex systems for
> >it.  CPAN has its shortcomings, but it also works surprisingly well most of the
> >time.  I think you should at least consider taking a "worse-is-better" approach
> >and build something that works and leave the delicate dependency management to
> >the distribution packagers.
> 
> As far as I know, RPM/DEB focus on binary installations, and source packages
> exist only to conveniently make binary packages. This means: Someone already
> has reviewed the package and decided which versions to take.

Debian packages include source dependencies, and you are right, it is the work
of the package maintainer to provide those, aided by the numerous bug reports
we get from the porters if the build dependencies are bad.

That said, normally each package has listed in its INSTALL/README the
dependencies, though in an informal way.

Maybe a more formal kind of dependency listing would be nice, one shared by
all OCaml-related sources, which could then be copied into the
corresponding Debian/RPM control files.

Friendly,

Sven Luther

* Re: [Caml-list] standard regex package
  2001-08-24  9:13           ` Xavier Leroy
@ 2001-08-24 10:16             ` Markus Mottl
  2001-08-24 16:49             ` Miles Egan
  1 sibling, 0 replies; 40+ messages in thread
From: Markus Mottl @ 2001-08-24 10:16 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: Miles Egan, caml-list

On Fri, 24 Aug 2001, Xavier Leroy wrote:
> The last time this topic came up on this list, I said that we aren't
> opposed to put PCRE in the OCaml distribution (provided Markus agrees
> with that, of course).

No objection on my side. That's why I have LGPLed it.

> BUT: in the name of backward compatibility, we must have an
> Str-compatible interface to this library (same functions and same
> regexp syntax as in Str), in addition to the native PCRE interface.

As was mentioned in our last discussion on this topic, backwards
compatibility would require writing a stateful interface around the
PCRE, conversion functions for regular expressions and compatible
implementations of the other functions. Is this really necessary? Why
not just keep the old Str-module and deprecate its use? Of course, if
the strange behaviour of Str wrt. large regexps is severe, somebody would
have to do it if debugging Richard Stallman's code is not an option... ;)
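
To show what "stateful interface" means here, the following is a minimal sketch, not the real Str or PCRE code. The function stateless_match is a toy stand-in (it only matches a literal at a given position) for a matcher that returns its result instead of storing it; string_match and matched_string mimic the shape of Str.string_match and Str.matched_string by keeping the last result in a global reference. Clearing the state on a failed match and raising Not_found are choices of this sketch, not claims about Str.

```ocaml
(* Toy stand-in for a stateless matcher: returns the (start, stop)
   offsets of the match instead of recording them anywhere.
   It only matches the literal [literal] at position [pos]. *)
let stateless_match literal s pos =
  let n = String.length literal in
  if pos >= 0 && pos + n <= String.length s && String.sub s pos n = literal
  then Some (pos, pos + n)
  else None

(* Str-style global state: offsets of the most recent match attempt. *)
let last_match : (int * int) option ref = ref None

(* Same shape as Str.string_match: returns a bool, records the result. *)
let string_match literal s pos =
  last_match := stateless_match literal s pos;
  !last_match <> None

(* Same shape as Str.matched_string: reads the recorded offsets.
   Raising Not_found when nothing matched is this sketch's choice. *)
let matched_string s =
  match !last_match with
  | Some (start, stop) -> String.sub s start (stop - start)
  | None -> raise Not_found

let () =
  if string_match "ab" "xxab" 2 then
    print_endline ("matched: " ^ matched_string "xxab")
```

Because the state is global, any later call to string_match overwrites it, so results have to be extracted before the next match is attempted.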

> I think it can be done, but the replies I got to this request were of
> the form "I don't have time to do this".

Ahem, well, as far as I am concerned, this is unfortunately the case right
now. I really need to get on with my actual project (a machine learning system).

What about the many new heroes on this list? This would be a good exercise!
:-)

> Also: the PCRE interface is quite heavyweight, with a zillion options
> whose purpose are not always clear to me.  This can be a bit frightening
> and will need a lot of carefully worded documentation to explain that
> most of these options are useless 99% of the time :-)  This is not a
> criticism towards Markus' work, more like a criticism towards Perl's
> and PCRE's "creeping featuritism" syndrome.

I agree. I made it rather heavyweight so that hardly anybody could argue
that Perl or PCRE supports features they need which this library lacks,
thus easing the change to OCaml. I was also practicing writing C interfaces
at the time, so I thought I'd implement all PCRE functions for practice.

I would certainly not have any objections against making the library more
lightweight: probably many functions could be removed without hesitation
(e.g. information on patterns). It may also be worthwhile to reconsider
the way labels and optional arguments are used, though the latter only
look evil in the interface but are extremely convenient to use once one
grasps the scheme behind them, which is invariant throughout the library.

Regards,
Markus Mottl

-- 
Markus Mottl                                             markus@oefai.at
Austrian Research Institute
for Artificial Intelligence                  http://www.oefai.at/~markus

* Re: [Caml-list] standard regex package
  2001-08-24  9:13           ` Xavier Leroy
  2001-08-24 10:16             ` Markus Mottl
@ 2001-08-24 16:49             ` Miles Egan
  1 sibling, 0 replies; 40+ messages in thread
From: Miles Egan @ 2001-08-24 16:49 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list

On Fri, Aug 24, 2001 at 11:13:31AM +0200, Xavier Leroy wrote:
> The last time this topic came up on this list, I said that we aren't
> opposed to put PCRE in the OCaml distribution (provided Markus agrees
> with that, of course).  BUT: in the name of backward compatibility, we
> must have an Str-compatible interface to this library (same functions
> and same regexp syntax as in Str), in addition to the native PCRE
> interface.  I think it can be done, but the replies I got to this
> request were of the form "I don't have time to do this".

I'd be happy to do it, but I'm afraid it's a bit beyond my abilities.  What do
you think of Markus' suggestion of simply keeping the current str module and
adding a new pcre module?

-- 
miles

"We in the past evade X, where X is something which we believe to be a
lion, through the act of running." - swiftrain@geocities.com

* Re: [Caml-list] standard regex package
  2001-08-23 19:28           ` Brian Rogoff
                               ` (2 preceding siblings ...)
  2001-08-23 21:06             ` RE : " Lionel Fourquaux
@ 2001-08-27 15:16             ` Ian Zimmerman
  2001-08-27 15:35               ` Brian Rogoff
  3 siblings, 1 reply; 40+ messages in thread
From: Ian Zimmerman @ 2001-08-27 15:16 UTC (permalink / raw)
  To: OCAML


Brian> A better approach might be to ape Python and the SML Basis
Brian> Library by providing a generic "OS" module which abstracts at
Brian> least Unix/Win/Mac away. I would prefer this, since I feel
Brian> silly using Unix.<blah> on a Windows box :-).

Surely you _don't_ mean the LCD of Unix/Win/Mac, and throwing out all
the APIs which are specific to Unix but _so_ darn useful?  Why do you
think I am a GNU/Linux user in the first place?

-- 
Ian Zimmerman, Oakland, California, U.S.A.
The easiest way to win an argument: ridicule your opponent's basic
assumptions by stating their negation and postfixing it with ", right?"
GPG pub key: 433BA087 9C0F 194F 203A 63F7 B1B8 6E5A 8CA3 27DB 433B A087

* Re: [Caml-list] standard regex package
  2001-08-27 15:16             ` [Caml-list] standard regex package Ian Zimmerman
@ 2001-08-27 15:35               ` Brian Rogoff
  0 siblings, 0 replies; 40+ messages in thread
From: Brian Rogoff @ 2001-08-27 15:35 UTC (permalink / raw)
  To: caml-list

On 27 Aug 2001, Ian Zimmerman wrote:
> Brian> A better approach might be to ape Python and the SML Basis
> Brian> Library by providing a generic "OS" module which abstracts at
> Brian> least Unix/Win/Mac away. I would prefer this, since I feel
> Brian> silly using Unix.<blah> on a Windows box :-).
> 
> Surely you _don't_ mean the LCD of Unix/Win/Mac, 

That's exactly what I mean. 

> and throwing out all the APIs which are specific to Unix but _so_ darn
> useful?  

Please read my reply to Gerd. I'm not suggesting throwing away anything,
in fact I suggest adding stuff. In case it wasn't clear, I meant to have a 
high level "LCD" module for generic OS stuff (call it GenerOS, or VirtuOS 
for virtual OS :), *and* have Unix, Windows, Mac, BeOS, Amiga (OK,
now I'm just being silly :) modules for those APIs. If you really want to 
get extreme there could be multiple Unix modules for each Unix variation, 
and multiple Windows and Mac modules for each variation of those OSes, but 
I think OS, Posix, Windows, Mac, etc should be sufficient. 
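
One way such a layering could look in OCaml is sketched below: a small "lowest common denominator" signature with one implementation per OS family, chosen at run time. The names GENERIC_OS, Posix_os and Windows_os and the three members of the signature are invented for this sketch; the full per-OS APIs (Unix, Windows, ...) would live in separate, richer modules alongside it.

```ocaml
(* Hypothetical "lowest common denominator" interface; all names here
   are invented for illustration. *)
module type GENERIC_OS = sig
  val name : string
  val path_separator : char               (* separates entries in PATH *)
  val join : string -> string -> string   (* directory + file name *)
end

module Posix_os : GENERIC_OS = struct
  let name = "Posix"
  let path_separator = ':'
  let join dir file = dir ^ "/" ^ file
end

module Windows_os : GENERIC_OS = struct
  let name = "Windows"
  let path_separator = ';'
  let join dir file = dir ^ "\\" ^ file
end

(* Pick the implementation for the host at run time. *)
let host : (module GENERIC_OS) =
  if Sys.os_type = "Win32" then (module Windows_os : GENERIC_OS)
  else (module Posix_os : GENERIC_OS)

let () =
  let module Os = (val host) in
  Printf.printf "%s: %s\n" Os.name (Os.join "lib" "ocaml")
```

Code written against GENERIC_OS stays portable, while a program that needs more can still open the OS-specific module directly.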

> Why do you think I am a GNU/Linux user in the first place?

Honestly, I haven't spent much time pondering the matter. :-)

-- Brian



* [Caml-list] Package dependencies [Was: standard regex package]
  2001-08-24  9:26                       ` Sven
@ 2001-08-27 15:46                         ` Ian Zimmerman
  2001-08-27 20:50                           ` Gerd Stolpmann
  0 siblings, 1 reply; 40+ messages in thread
From: Ian Zimmerman @ 2001-08-27 15:46 UTC (permalink / raw)
  To: OCAML


Miles> One danger in developing such a system is that you'll wind up duplicating the
Miles> rather extensive functionality of existing package management systems.  RPM and
Miles> DEB both handle these kinds of dependencies and are fairly complex systems for
Miles> it.  CPAN has its shortcomings, but it also works surprisingly well most of the
Miles> time.  I think you should at least consider taking a "worse-is-better" approach
Miles> and build something that works and leave the delicate dependency management to
Miles> the distribution packagers.

Gerd> As far as I know, RPM/DEB focus on binary installations, and source packages
Gerd> exist only to conveniently make binary packages. This means: Someone already
Gerd> has reviewed the package and decided which versions to take.

Sven> Debian packages include source dependencies, and you are right, it is the work
Sven> of the package maintainer to provide those, aided by the numerous bug reports
Sven> we get from the porters if the build dependencies are bad.

Sven> That said, normally each package has listed in its INSTALL/README the
Sven> dependencies, though in an informal way.

Sven> Maybe a kind of more formal dependency listing would be nice, which would be
Sven> shared by all ocaml related sources, and may then be filled in the
Sven> corresponding debian/rpm control files.

The good thing about CPAN the module, and really the whole Perl build
and install scheme, when it works, is precisely that it eases
installation of tarballs [1] _which are not (yet) available_ in the
native OS binary form; and it does this in a way that doesn't conflict
with the OS installation.  That is, the distinction it makes between
INSTALLPRIVLIB and INSTALLSITELIB.

I believe this point is really fundamental to the Perl scheme, and is
really what people mean when they use the term "CPAN" in this thread.
To behave like that, yes we do need formal dependencies in the
tarballs, completely independent of the dependencies an OS packager
might add at the time of creating a binary package, and exploited in a
uniform way by the tarball installation process (even if it is only
for a check target, as Perl does it).

I think the present findlib already does this, when configured
properly; let's keep this functionality whichever way Ocaml
build/install is eventually implemented.


[1] I intentionally use the term "tarball" here to emphasise the
distinction from "packages" which are things the OS maintainer (not
the software author) makes.


-- 
Ian Zimmerman, Oakland, California, U.S.A.
The easiest way to win an argument: ridicule your opponent's basic
assumptions by stating their negation and postfixing it with ", right?"
GPG pub key: 433BA087 9C0F 194F 203A 63F7 B1B8 6E5A 8CA3 27DB 433B A087

* Re: [Caml-list] standard regex package
  2001-08-24  9:23                   ` [Caml-list] standard regex package Sven
@ 2001-08-27 15:54                     ` Ian Zimmerman
  2001-08-30  8:41                       ` Sven
  0 siblings, 1 reply; 40+ messages in thread
From: Ian Zimmerman @ 2001-08-27 15:54 UTC (permalink / raw)
  To: OCAML


Sven> be able to turn all this stuff off in the case your
Sven> OS/distribution/whatever already handles this would be nice too.

Why?

The Perl install routine works just fine alongside the Debian one.
When I'm installing a Perl module from source, the dependencies it is
looking for can come either from other modules I have installed the
same way, or from Debian packages.  It doesn't care, and that's how it
should be.

Once again, the thing to remember is: the dependencies we're talking
about in this thread are _for installation from sources_, when the
module hasn't yet been packaged in the native OS form.  The initial
packaging is a special case of such source installation.

-- 
Ian Zimmerman, Oakland, California, U.S.A.
The easiest way to win an argument: ridicule your opponent's basic
assumptions by stating their negation and postfixing it with ", right?"
GPG pub key: 433BA087 9C0F 194F 203A 63F7 B1B8 6E5A 8CA3 27DB 433B A087

* Re: [Caml-list] Package dependencies [Was: standard regex package]
  2001-08-27 15:46                         ` [Caml-list] Package dependencies [Was: standard regex package] Ian Zimmerman
@ 2001-08-27 20:50                           ` Gerd Stolpmann
  0 siblings, 0 replies; 40+ messages in thread
From: Gerd Stolpmann @ 2001-08-27 20:50 UTC (permalink / raw)
  To: Ian Zimmerman, OCAML

On Mon, 27 Aug 2001, Ian Zimmerman wrote:
>Miles> One danger in developing such a system is that you'll wind up duplicating the
>Miles> rather extensive functionality of existing package management systems.  RPM and
>Miles> DEB both handle these kinds of dependencies and are fairly complex systems for
>Miles> it.  CPAN has its shortcomings, but it also works surprisingly well most of the
>Miles> time.  I think you should at least consider taking a "worse-is-better" approach
>Miles> and build something that works and leave the delicate dependency management to
>Miles> the distribution packagers.
>
>Gerd> As far as I know, RPM/DEB focus on binary installations, and source packages
>Gerd> exist only to conveniently make binary packages. This means: Someone already
>Gerd> has reviewed the package and decided which versions to take.
>
>Sven> Debian packages include source dependencies, and you are right, it is the work
>Sven> of the package maintainer to provide those, aided by the numerous bug reports
>Sven> we get from the porters if the build dependencies are bad.
>
>Sven> That said, normally each package has listed in its INSTALL/README the
>Sven> dependencies, though in an informal way.
>
>Sven> Maybe a kind of more formal dependency listing would be nice, which would be
>Sven> shared by all ocaml related sources, and may then be filled in the
>Sven> corresponding debian/rpm control files.
>
>The good thing about CPAN the module, and really the whole Perl build
>and install scheme, when it works, is precisely that it eases
>installation of tarballs [1] _which are not (yet) available_ in the
>native OS binary form; and it does this in a way that doesn't conflict
>with the OS installation.  That is, the distinction it makes between
>INSTALLPRIVLIB and INSTALLSITELIB.
>
>I believe this point is really fundamental to the Perl scheme, and is
>really what people mean when they use the term "CPAN" in this thread.
>To behave like that, yes we do need formal dependencies in the
>tarballs, completely independent of the dependencies an OS packager
>might add at the time of creating a binary package, and exploited in a
>uniform way by the tarball installation process (even if it is only
>for a check target, as Perl does it).

Yes, there is a difference between (1) the dependencies on the level of a
programming environment and (2) the dependencies on the system level. For
example, the binary may use a system library, and this dependency is important
for (2) but not for (1) because it's outside the scope of the programming
environment. For a CPAN-like installer only (1) is of interest, but of
course, knowing (1) is very helpful to describe (2).

>I think the present findlib already does this, when configured
>properly; let's keep this functionality whichever way Ocaml
>build/install is eventually implemented.

findlib supports as many installation locations as you want, so you can easily
implement the difference between INSTALLPRIVLIB and INSTALLSITELIB. It is
intended as a tool to manage add-ons to the core O'Caml installation. At the
beginning, it supported only one directory (=INSTALLSITELIB), but after the
Debian people persuaded me, several directories are now allowed. Debian needs
this because there is one location where the OS packager installs and one
location where the user adds packages (seems to be a common requirement).
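
The idea of several installation locations can be pictured as a search path that is tried in order. The following is only a sketch of that idea, not findlib's implementation: find_package and the example directories are hypothetical, and the only findlib convention relied on is that a package directory contains a file named META. The existence check is a parameter so the lookup can be tried without touching a real file system.

```ocaml
(* Look a package up in several installation directories; the first
   directory that contains <pkg>/META wins. [exists] defaults to
   Sys.file_exists but can be replaced for experiments.
   This lookup is an illustrative assumption, not findlib's code. *)
let find_package ?(exists = Sys.file_exists) search_path pkg =
  let candidate dir = Filename.concat (Filename.concat dir pkg) "META" in
  List.find_opt (fun dir -> exists (candidate dir)) search_path

let () =
  (* Hypothetical locations: one for the OS packager, one for the user. *)
  let path = [ "/usr/lib/ocaml"; "/usr/local/lib/ocaml/site-lib" ] in
  match find_package path "pcre" with
  | Some dir -> print_endline ("pcre found in " ^ dir)
  | None -> print_endline "pcre is not installed"
```

Putting the OS packager's directory and the user's directory in one list keeps both kinds of installation visible without either overwriting the other.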

Gerd
-- 
----------------------------------------------------------------------------
Gerd Stolpmann      Telefon: +49 6151 997705 (privat)
Viktoriastr. 45             
64293 Darmstadt     EMail:   gerd@gerd-stolpmann.de
Germany                     
----------------------------------------------------------------------------

* Re: [Caml-list] standard regex package
  2001-08-27 15:54                     ` Ian Zimmerman
@ 2001-08-30  8:41                       ` Sven
  0 siblings, 0 replies; 40+ messages in thread
From: Sven @ 2001-08-30  8:41 UTC (permalink / raw)
  To: Ian Zimmerman; +Cc: OCAML

On Mon, Aug 27, 2001 at 08:54:48AM -0700, Ian Zimmerman wrote:
> 
> Sven> be able to turn all this stuff off in the case your
> Sven> OS/distribution/whatever already handles this would be nice too.
> 
> Why?
> 
> The Perl install routine works just fine alongside the Debian one.
> When I'm installing a Perl module from source, the dependencies it is
> looking for can come either from other modules I have installed the
> same way, or from Debian packages.  It doesn't care, and that's how it
> should be.

Well, but the Debian packager of the Perl stuff, although I haven't looked at
it in detail, will have to do whatever is needed to disable those dependencies
in the build system and implement the Debian ones instead. The same goes for
RPMs and others.

So what I, as a Debian packager of various OCaml stuff, was asking for was
an _easy_ way of disabling the built-in dependency stuff and using the
OS's dependencies in its place, and maybe a compatible dependency format, so
that it can simply be copied into the package's control file.

That said, I don't know what kind of dependencies you are speaking about:
whether they will be only between OCaml stuff, or also cover outside libraries
needed by your packages (like gtk+ for lablgtk/mlgtk, for example), and whether
you want to maintain versioned dependencies on these.

> Once again, the thing to remember is: the dependencies we're talking
> about in this thread are _for installation from sources_, when the
> module hasn't yet been packaged in the native OS form.  The initial
> packaging is a special case of such source installation.

Well, but don't forget about apt-get source -b package_name, which, if your
apt is new enough, will happily download the source package from the Debian
repository as well as all the needed source dependencies, and rebuild the
package on your system. This, although Debian-specific, is something akin to
what you want, isn't it?

Friendly,

Sven Luther

* Re: [Caml-list] standard regex package
@ 2001-08-24 18:46 Arturo Borquez
  0 siblings, 0 replies; 40+ messages in thread
From: Arturo Borquez @ 2001-08-24 18:46 UTC (permalink / raw)
  To: miles; +Cc: caml-list

On Fri, 24 August 2001, Miles Egan wrote:

> 
> I'd be happy to do it, but I'm afraid it's a bit beyond my abilities.  What do
> you think of Markus' suggestion of simply keeping the current str module and
> adding a new pcre module?
> 
> -- 
> miles
> 
I believe that the best choice is what Markus proposed,
that is, to put PCRE in a separate module and deprecate
Str (or not do any further development on Str).

Arturo.



end of thread, other threads:[~2001-08-30  8:40 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-08-22 18:53 [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc neale-caml
2001-08-22 19:18 ` Alain Frisch
2001-08-22 20:41   ` Neale Pickett
2001-08-23 10:21     ` Frank Atanassow
2001-08-23 16:06       ` Neale Pickett
2001-08-23 16:25         ` Alain Frisch
2001-08-23 18:14           ` Neale Pickett
2001-08-22 20:23 ` Markus Mottl
2001-08-22 20:31   ` Miles Egan
2001-08-22 20:52     ` Michael Leary
2001-08-23  5:36       ` Jeremy Fincher
2001-08-22 22:06     ` Nicolas George
2001-08-23  7:08       ` [Caml-list] PCRE as standard (Was: Str.string_match raising Invalid_argument...) Florian Hars
2001-08-23 17:31       ` [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc Brian Rogoff
2001-08-23 18:08         ` [Caml-list] standard regex package Miles Egan
2001-08-23 19:28           ` Brian Rogoff
2001-08-23 19:49             ` Miles Egan
2001-08-23 19:51             ` Gerd Stolpmann
2001-08-23 21:12               ` Brian Rogoff
2001-08-23 21:27               ` Benjamin C. Pierce
2001-08-23 21:49                 ` Gerd Stolpmann
2001-08-23 22:11                   ` Miles Egan
2001-08-23 23:55                     ` Gerd Stolpmann
2001-08-24  9:03                       ` Claudio Sacerdoti Coen
2001-08-24  9:26                       ` Sven
2001-08-27 15:46                         ` [Caml-list] Package dependencies [Was: standard regex package] Ian Zimmerman
2001-08-27 20:50                           ` Gerd Stolpmann
2001-08-24  9:23                   ` [Caml-list] standard regex package Sven
2001-08-27 15:54                     ` Ian Zimmerman
2001-08-30  8:41                       ` Sven
2001-08-23 21:06             ` RE : " Lionel Fourquaux
2001-08-24  9:23               ` [Caml-list] dynamic loading and OS interface Xavier Leroy
2001-08-27 15:16             ` [Caml-list] standard regex package Ian Zimmerman
2001-08-27 15:35               ` Brian Rogoff
2001-08-24  9:13           ` Xavier Leroy
2001-08-24 10:16             ` Markus Mottl
2001-08-24 16:49             ` Miles Egan
     [not found]   ` <w533d6j1vxn.fsf@woozle.org>
     [not found]     ` <20010823112653.A7085@chopin.ai.univie.ac.at>
     [not found]       ` <w5366be7fd0.fsf_-_@woozle.org>
2001-08-23 20:01         ` [Caml-list] Re: [OFF-LIST] Str.string_match raising Invalid_argument "String.sub" in gc Markus Mottl
2001-08-23 20:31           ` Patrick M Doane
2001-08-24 18:46 [Caml-list] standard regex package Arturo Borquez
