caml-list - the Caml user's mailing list
* [Caml-list] unix.chop_extension
@ 2004-05-24 20:04 skaller
  2004-05-24 22:01 ` skaller
                   ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: skaller @ 2004-05-24 20:04 UTC (permalink / raw)
  To: caml-list

# Filename.chop_extension "x.y/z";;
- : string = "x"

Oh come on. This is correct according to the specs,
but no one would believe this function is chopping
off the extension here :)
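
A minimal sketch of the difference, assuming Unix-style paths.
naive_chop mirrors the rightmost-dot rule that produces the result
above; chop_extension_in_basename is a hypothetical name (not part of
Filename) for a variant that only accepts a dot inside the last path
component.

```ocaml
(* Rightmost-dot rule applied to the whole path: "x.y/z" becomes "x". *)
let naive_chop name =
  try String.sub name 0 (String.rindex name '.')
  with Not_found -> invalid_arg "naive_chop"

(* Hypothetical alternative: the dot must lie strictly inside the
   basename, so a dot in a directory name, or a leading dot as in
   ".profile", is not treated as the start of an extension. *)
let chop_extension_in_basename name =
  let base = Filename.basename name in
  match String.rindex_opt base '.' with
  | Some i when i > 0 && Filename.check_suffix name base ->
    let ext_len = String.length base - i in
    String.sub name 0 (String.length name - ext_len)
  | _ -> invalid_arg "chop_extension_in_basename"
```

With this variant "a.b/c.d" gives "a.b/c", while "x.y/z" is rejected
with Invalid_argument instead of silently losing the directory part.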

Also 

let y = Filename.concat (Filename.dirname x) (Filename.basename x) in
assert (x = y)

is not guaranteed, only that x,y are 'equivalent'.
What does that mean? In particular can I assume:

is_implicit x = is_implicit y
is_relative x = is_relative y

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net



-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* Re: [Caml-list] unix.chop_extension
  2004-05-24 20:04 [Caml-list] unix.chop_extension skaller
@ 2004-05-24 22:01 ` skaller
  2004-05-25  8:46 ` Alex Baretta
  2004-05-26  9:05 ` Xavier Leroy
  2 siblings, 0 replies; 37+ messages in thread
From: skaller @ 2004-05-24 22:01 UTC (permalink / raw)
  To: caml-list

On Tue, 2004-05-25 at 06:04, skaller wrote:
> # Filename.chop_extension "x.y/z";;
> - : string = "x"
> 
> Oh come on. This is correct according to the specs,
> but no one would believe this function is chopping
> off the extension here :)
> 
> Also 
> 
> let y = Filename.concat (Filename.dirname x) (Filename.basename x) in
> assert (x = y)
> 
> is not guaranteed, only that x,y are 'equivalent'.
> What does that mean? In particular can I assume:
> 
> is_implicit x = is_implicit y
> is_relative x = is_relative y

Oh dear .. this information is lost: 

# Filename.concat (Filename.dirname  "x") (Filename.basename "x");;
- : string = "./x"
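
A small sketch of the same round trip that prints both predicates side
by side. The helper name rebuild is made up for illustration. On a Unix
build I would expect is_relative to agree for x and y, but is_implicit
to differ, since "./x" starts with an explicit reference to the current
directory.

```ocaml
(* Rebuild a path from its two halves, as in the round trip above. *)
let rebuild x = Filename.concat (Filename.dirname x) (Filename.basename x)

let () =
  let x = "x" in
  let y = rebuild x in
  Printf.printf "%s -> %s\n" x y;
  Printf.printf "is_relative: %b / %b\n"
    (Filename.is_relative x) (Filename.is_relative y);
  Printf.printf "is_implicit: %b / %b\n"
    (Filename.is_implicit x) (Filename.is_implicit y)
```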

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net




* Re: [Caml-list] unix.chop_extension
  2004-05-24 20:04 [Caml-list] unix.chop_extension skaller
  2004-05-24 22:01 ` skaller
@ 2004-05-25  8:46 ` Alex Baretta
  2004-05-25  9:35   ` skaller
  2004-05-27  8:15   ` YANG Shouxun
  2004-05-26  9:05 ` Xavier Leroy
  2 siblings, 2 replies; 37+ messages in thread
From: Alex Baretta @ 2004-05-25  8:46 UTC (permalink / raw)
  To: skaller, Ocaml

skaller wrote:
> # Filename.chop_extension "x.y/z";;
> - : string = "x"


This is a terrible consequence of not having (* functional *) support 
for regexps in the language or standard library. The Filename library 
uses the very weak functions of the String library to find the rightmost 
dot in the filename (* or path *), which is obviously correct only under 
a very stringent precondition, which is not the most general possible 
precondition for this function.

The Str module has a very clever implementation, but it cannot be used 
systematically in Ocaml because it is not *functional*. I'm sure that if 
we had a functional version of Str, the author of the Filename module 
would rewrite it using regular expressions instead of the rindex function.

***

There's one extra thing I'd like to point out. We have the 
chop_extension function. Why in the world is there no find_extension 
function?

find_extension "foo.bar" --> "bar"

Alex


* Re: [Caml-list] unix.chop_extension
  2004-05-25  8:46 ` Alex Baretta
@ 2004-05-25  9:35   ` skaller
  2004-05-25  9:46     ` Alain Frisch
                       ` (2 more replies)
  2004-05-27  8:15   ` YANG Shouxun
  1 sibling, 3 replies; 37+ messages in thread
From: skaller @ 2004-05-25  9:35 UTC (permalink / raw)
  To: Alex Baretta; +Cc: Ocaml

On Tue, 2004-05-25 at 18:46, Alex Baretta wrote:
> skaller wrote:
> > # Filename.chop_extension "x.y/z";;
> > - : string = "x"
> 
> 
> This is a terrible consequence of not having (* functional *) support 
> for regexps in the language or standard library. 

Which gets back to the Cathedral/Bazaar problem:
there are plenty of people including me with
reentrant Ocaml-only FFAU licenced solutions.
We could probably even agree on a spec, implementation,
and Ocaml LGPL+X licence .. but could we get that solution
into the standard distro?

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net




* Re: [Caml-list] unix.chop_extension
  2004-05-25  9:35   ` skaller
@ 2004-05-25  9:46     ` Alain Frisch
  2004-05-25 10:47       ` skaller
  2004-05-25 13:37     ` [Caml-list] unix.chop_extension John Goerzen
  2004-05-25 19:17     ` Richard Jones
  2 siblings, 1 reply; 37+ messages in thread
From: Alain Frisch @ 2004-05-25  9:46 UTC (permalink / raw)
  To: skaller; +Cc: Ocaml

On 25 May 2004, skaller wrote:

> Which gets back to the Cathedral/Bazaar problem:

No, again?  You really want all of us to scream.

> but could we get that solution
> into the standard distro?

You could easily get that into the GODI distribution. What are the
advantages of the standard distro over the GODI distro ?

-- Alain


* Re: [Caml-list] unix.chop_extension
  2004-05-25  9:46     ` Alain Frisch
@ 2004-05-25 10:47       ` skaller
  2004-05-25 11:51         ` sejourne kevin
  2004-05-25 14:06         ` [Caml-list] Re: AAP (was: unix.chop_extension) Christophe TROESTLER
  0 siblings, 2 replies; 37+ messages in thread
From: skaller @ 2004-05-25 10:47 UTC (permalink / raw)
  To: Alain Frisch; +Cc: Ocaml

On Tue, 2004-05-25 at 19:46, Alain Frisch wrote:
> On 25 May 2004, skaller wrote:

> You could easily get that into the GODI distribution. What are the
> advantages of the standard distro over the GODI distro ?

1. The standard distro is 'authoritative'.

2. The standard distro is a simple tarball,
you don't need to be online 

3. The standard distro is built by a team
working with some support from a National
Government: INRIA has a charter, lawyers,
funding, etc etc.

4. CVS access to INRIA's Ocaml is available for
bleeding edge developers.

No intent to be critical, just trying to answer
the question.

GODI seems quite complex. At the moment I don't feel
confident relying on it myself, let alone expecting
my clients to do so. That may change as it matures,
but it does seem a bit Unix-centric.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net




* Re: [Caml-list] unix.chop_extension
  2004-05-25 10:47       ` skaller
@ 2004-05-25 11:51         ` sejourne kevin
  2004-05-26 11:18           ` Florian Hars
  2004-05-25 14:06         ` [Caml-list] Re: AAP (was: unix.chop_extension) Christophe TROESTLER
  1 sibling, 1 reply; 37+ messages in thread
From: sejourne kevin @ 2004-05-25 11:51 UTC (permalink / raw)
  To: caml-list

--- skaller <skaller@users.sourceforge.net> wrote:
> On Tue, 2004-05-25 at 19:46, Alain Frisch wrote:
> > On 25 May 2004, skaller wrote:
> 
> > You could easily get that into the GODI distribution. What are the
> > advantages of the standard distro over the GODI distro ?
> 
> 1. The standard distro is 'authoritative'.
> 
> 2. The standard distro is a simple tarball,
> you don't need to be online 
> 
> 3. The standard distro is built by a team
> working with some support from a National
> Government: INRIA has a charter, lawyers,
> funding, etc etc.
> 
> 4. CVS access to INRIA's Ocaml is available for
> bleeding edge developers.
> 
> No intent to be critical, just trying to answer
> the question.
> 
> GODI seems quite complex. At the moment I don't feel
> confident relying on it myself, let alone expecting
> my clients to do so. That may change as it matures,
> but it does seem a bit Unix-centric.
Stooooooooooooooooooooooooooop!!!!! troooooll!!!!!
If you don't like it, Don't use it!

Kevin


	

	
		

* Re: [Caml-list] unix.chop_extension
  2004-05-25  9:35   ` skaller
  2004-05-25  9:46     ` Alain Frisch
@ 2004-05-25 13:37     ` John Goerzen
  2004-05-25 19:17     ` Richard Jones
  2 siblings, 0 replies; 37+ messages in thread
From: John Goerzen @ 2004-05-25 13:37 UTC (permalink / raw)
  To: skaller; +Cc: Alex Baretta, Ocaml

On Tue, May 25, 2004 at 07:35:55PM +1000, skaller wrote:
> On Tue, 2004-05-25 at 18:46, Alex Baretta wrote:
> > skaller wrote:
> > > # Filename.chop_extension "x.y/z";;
> > > - : string = "x"
> > 
> > 
> > This is a terrible consequence of not having (* functional *) support 
> > for regexps in the language or standard library. 
> 
> Which gets back to the Cathedral/Bazaar problem:
> there are plenty of people including me with
> reentrant Ocaml-only FFAU licenced solutions.
> We could probably even agree on a spec, implementation,
> and Ocaml LGPL+X licence .. but could we get that solution
> into the standard distro?

Perhaps you could get it into Extlib or, if it uses C, into Annexlib?

-- John


* [Caml-list] Re: AAP (was: unix.chop_extension)
  2004-05-25 10:47       ` skaller
  2004-05-25 11:51         ` sejourne kevin
@ 2004-05-25 14:06         ` Christophe TROESTLER
  1 sibling, 0 replies; 37+ messages in thread
From: Christophe TROESTLER @ 2004-05-25 14:06 UTC (permalink / raw)
  To: skaller; +Cc: Alain.Frisch, caml-list

On 25 May 2004, skaller <skaller@users.sourceforge.net> wrote:
> 
> GODI seems quite complex. At the moment I don't feel confident
> relying on it myself, let alone expecting my clients to do so.
> That may change as it matures, but it does seem a bit
> Unix-centric.

Maybe AAP (http://www.a-a-p.org/) -- or similar tools -- should be
investigated?  It works on Unix, Win32 and MacOSX and seems to have
many things people on this list wanted (build including from CVS,
distribute packages...).

My 2¢,
ChriS


* Re: [Caml-list] unix.chop_extension
  2004-05-25  9:35   ` skaller
  2004-05-25  9:46     ` Alain Frisch
  2004-05-25 13:37     ` [Caml-list] unix.chop_extension John Goerzen
@ 2004-05-25 19:17     ` Richard Jones
  2 siblings, 0 replies; 37+ messages in thread
From: Richard Jones @ 2004-05-25 19:17 UTC (permalink / raw)
  Cc: Ocaml

On Tue, May 25, 2004 at 07:35:55PM +1000, skaller wrote:
> Which gets back to the Cathedral/Bazaar problem:
> there are plenty of people including me with
> reentrant Ocaml-only FFAU licenced solutions.
> We could probably even agree on a spec, implementation,
> and Ocaml LGPL+X licence .. but could we get that solution
> into the standard distro?

Get it into ExtLib!

No, but honestly I today rolled out ExtLib support across our servers.
I've been slowly rewriting my personal "missing library" to use ExtLib
over the past few weeks and today was rollout day.  ExtLib is very
good, not as complete as it needs to be, but still a useful addition
to the standard library.

Rich.

-- 
Richard Jones. http://www.annexia.org/ http://www.j-london.com/
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
"One serious obstacle to the adoption of good programming languages is
the notion that everything has to be sacrificed for speed. In computer
languages as in life, speed kills." -- Mike Vanier


* Re: [Caml-list] unix.chop_extension
  2004-05-24 20:04 [Caml-list] unix.chop_extension skaller
  2004-05-24 22:01 ` skaller
  2004-05-25  8:46 ` Alex Baretta
@ 2004-05-26  9:05 ` Xavier Leroy
  2004-05-26  9:35   ` Luca Pascali
                     ` (3 more replies)
  2 siblings, 4 replies; 37+ messages in thread
From: Xavier Leroy @ 2004-05-26  9:05 UTC (permalink / raw)
  To: skaller; +Cc: caml-list

John Skaller writes:

> # Filename.chop_extension "x.y/z";;
> - : string = "x"
> 
> Oh come on. This is correct according to the specs,
> but no one would believe this function is chopping
> off the extension here :)

Care to submit a bug report for that?

> Also 
> let y = Filename.concat (Filename.dirname x) (Filename.basename x) in
> assert (x = y)
> is not guaranteed, only that x,y are 'equivalent'.
> What does that mean?

That both names (x and y) refer to the same file.  You access the same
file if you open one or the other.

Alex Baretta writes:

> This is a terrible consequence of not having (* functional *) support 
> for regexps in the language or standard library. [...]
> The Str module has a very clever implementation, but it cannot be used 
> systematically in Ocaml because it is not *functional*. I'm sure that if 
> we had a functional version of Str, the author of the Filename module 
> would rewrite it using regular expressions instead of the rindex function.

You're jumping to conclusions here.  Str is an external library, not
part of the standard library (stdlib).  The way the OCaml distribution
is set up, stdlib code can only refer to other stdlib modules; the
compilers and tools can use stdlib and their own modules; and other
libraries such as Str are outside the stdlib/compiler subset.  This
stratification is important because of bootstrapping issues.  It's
always a good idea to minimize the bootstrapping base, which currently
consists of stdlib, ocamlc and ocamllex.

So, it's true that parts of stdlib and the compilers would benefit
from regexps (see e.g. asmcomp/asmpackager.ml), but the reason for not
using regexps has nothing to do with the imperative interface of Str.

Concerning the Str interface, I received several complaints about it,
to which I replied "why don't you propose an alternate functional API?",
to which I never got any reply.  So, if you have ideas, go ahead.  If
you can get some peer reviewing on your design, that would be much better.
Good APIs aren't easy to design (witness the lengthy discussion on I/O
on this list), so it's unlikely that you can find a good one all by
yourself in 30 minutes.

> There's one extra thing I'd like to point out. We have the 
> chop_extension function. Why in the world is there no find_extension 
> function?
> find_extension "foo.bar" --> "bar"

Why in the world would that be generally useful?  Remember, we don't
shoot for completeness (it's unattainable anyway), just for
usefulness.

- Xavier Leroy


* Re: [Caml-list] unix.chop_extension
  2004-05-26  9:05 ` Xavier Leroy
@ 2004-05-26  9:35   ` Luca Pascali
  2004-05-26  9:56   ` Remi Vanicat
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Luca Pascali @ 2004-05-26  9:35 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: skaller, caml-list

Xavier Leroy wrote:

[CUT]

>>There's one extra thing I'd like to point out. We have the 
>>chop_extension function. Why in the world is there no find_extension 
>>function?
>>find_extension "foo.bar" --> "bar"
>>    
>>
>
>Why in the world would that be generally useful?  Remember, we don't
>shoot for completeness (it's unattainable anyway), just for
>usefulness.
>
>- Xavier Leroy
>
>
>  
>
IMHO, this is more a matter of usefulness than of 
completeness.

I mean that there are systems (like Windows) that use only the extension 
of the file to identify how to execute or how to read the file itself, 
and anyway the extension is useful to classify a file for any kind of 
application (such as compilers, network applications, graphic editing, 
and so on).
So a developer has to write his/her own function to extract the extension 
from a filename every time he/she needs it. It's not a big deal, but ...

My opinion.

Luca



* Re: [Caml-list] unix.chop_extension
  2004-05-26  9:05 ` Xavier Leroy
  2004-05-26  9:35   ` Luca Pascali
@ 2004-05-26  9:56   ` Remi Vanicat
  2004-05-26 10:34   ` skaller
  2004-05-26 11:21   ` Alex Baretta
  3 siblings, 0 replies; 37+ messages in thread
From: Remi Vanicat @ 2004-05-26  9:56 UTC (permalink / raw)
  To: caml-list

Xavier Leroy <xavier.leroy@inria.fr> writes:

[...]


>> There's one extra thing I'd like to point out. We have the 
>> chop_extension function. Why in the world is there no find_extension 
>> function?
>> find_extension "foo.bar" --> "bar"
>
> Why in the world would that be generally useful?  Remember, we don't
> shoot for completeness (it's unattainable anyway), just for
> usefulness.

Well, it is useful when one wants to use a different parser (or more
generally a different "file reader") for different file types (gif,
jpg, ...), or to launch a different helper program for different types
of file. (Well, I agree that using a "magic number" is a better
way to do it, but using the extension is easier.)

-- 
Rémi Vanicat


* Re: [Caml-list] unix.chop_extension
  2004-05-26  9:05 ` Xavier Leroy
  2004-05-26  9:35   ` Luca Pascali
  2004-05-26  9:56   ` Remi Vanicat
@ 2004-05-26 10:34   ` skaller
  2004-05-26 13:27     ` Damien Doligez
  2004-05-26 11:21   ` Alex Baretta
  3 siblings, 1 reply; 37+ messages in thread
From: skaller @ 2004-05-26 10:34 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list

On Wed, 2004-05-26 at 19:05, Xavier Leroy wrote:
> John Skaller writes:
> 
> > # Filename.chop_extension "x.y/z";;
> > - : string = "x"
> > 
> > Oh come on. This is correct according to the specs,
> > but no one would believe this function is chopping
> > off the extension here :)
> 
> Care to submit a bug report for that?

Well it isn't a bug .. it actually does what the 
documentation says it does. It isn't clear whether changing
the semantics to a more 'sensible' interpretation
would break work-arounds etc., so perhaps a 'feature request'?

Does anyone have code that would break if the chop_extension
function semantics were changed?

> > Also 
> > let y = Filename.concat (Filename.dirname x) (Filename.basename x) in
> > assert (x = y)
> > is not guaranteed, only that x,y are 'equivalent'.
> > What does that mean?
> 
> That both names (x and y) refer to the same file.  You access the same
> file if you open one or the other.

What if neither denotes a file?
For example:

#include <stdio.h>

Woops. stdio.h doesn't denote a file. Neither does ./stdio.h.
Neither does 'foobar'. But either all are equivalent
(in which case I'd argue Filename is useless) or Filename
doesn't have defined semantics (in which case its utility
is limited and depends on the actual filesystem present).

The actual implementation is useful however, at least
on Unix: I don't want that leading ./ but I can remove
it and I know when to. However, the code that does that
isn't platform independent, and even on Unix the documentation
doesn't guarantee it: I'm not sure dirname "" = "."
rather than say "" or perhaps ./. :)
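
Roughly what I mean by removing it, as a sketch only. The names
cur_prefix, strip_cur_prefix and rebuild_keeping_implicit are made up
here; the prefix is built from Filename.current_dir_name and
Filename.dir_sep rather than hard-coded, but I have only thought this
through for Unix-style paths, and the empty string remains a corner
case.

```ocaml
(* "./" on Unix, assembled from the module's own constants. *)
let cur_prefix = Filename.current_dir_name ^ Filename.dir_sep

(* Drop one leading "./" if it is present and something follows it. *)
let strip_cur_prefix path =
  let n = String.length cur_prefix in
  if String.length path > n && String.sub path 0 n = cur_prefix
  then String.sub path n (String.length path - n)
  else path

(* Round trip through dirname/basename that keeps an implicit name
   implicit: the prefix is only stripped when the input had none. *)
let rebuild_keeping_implicit x =
  let y = Filename.concat (Filename.dirname x) (Filename.basename x) in
  if Filename.is_implicit x then strip_cur_prefix y else y
```

The point of testing is_implicit on the input is that "./x" must stay
"./x": the prefix is removed only where the round trip introduced it.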

> Concerning the Str interface, I received several complaints about it,
> to which I replied "why don't you propose an alternate functional API?",
> to which I never got any reply.  So, if you have ideas, go ahead.  

Part of the reason may be that people don't feel encouraged that
a proposal would make it into the core distribution.

Designing a good interface can be quite difficult and a lot
of work, particularly because one really needs to have multiple
people consider it, and also actually implement it.

In the past I spent years of my life and thousands of dollars
doing this where there was actually a full formal process for
making proposals, and I myself had considerable political power.

If INRIA people would offer just a little encouragement and direction
in this area it might go a long way. Some comments on the content
of Extlib would be nice .. that project quite specifically
*intends* to be incorporated into the main distro (and I'd sure
like to see a variable length array make it ..)

> If
> you can get some peer reviewing on your design, that would be much better.
> Good APIs aren't easy to design (witness the lengthy discussion on I/O
> on this list), so it's unlikely that you can find a good one all by
> yourself in 30 minutes.

Total agreement!

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net




* Re: [Caml-list] unix.chop_extension
  2004-05-25 11:51         ` sejourne kevin
@ 2004-05-26 11:18           ` Florian Hars
  0 siblings, 0 replies; 37+ messages in thread
From: Florian Hars @ 2004-05-26 11:18 UTC (permalink / raw)
  To: sejourne kevin; +Cc: caml-list

sejourne kevin wrote:
>> On Tue, 2004-05-25 at 19:46, Alain Frisch wrote:
>>> What are the advantages of the standard distro over the GODI
>>> distro ?
 >>
>> GODI seems quite complex, 
> 
> Stooooooooooooooooooooooooooop!!!!! troooooll!!!!!
> If you don't like it, Don't use it!

That was *exactly* the point John Skaller was trying to make. Thank you for 
forcefully supporting his argument.

Yours, Florian Hars.


* Re: [Caml-list] unix.chop_extension
  2004-05-26  9:05 ` Xavier Leroy
                     ` (2 preceding siblings ...)
  2004-05-26 10:34   ` skaller
@ 2004-05-26 11:21   ` Alex Baretta
  2004-05-26 16:43     ` Richard Jones
  3 siblings, 1 reply; 37+ messages in thread
From: Alex Baretta @ 2004-05-26 11:21 UTC (permalink / raw)
  To: Xavier Leroy, Ocaml, pascky

Xavier Leroy wrote:

> Concerning the Str interface, I received several complaints about it,
> to which I replied "why don't you propose an alternate functional API?",
> to which I never got any reply.  So, if you have ideas, go ahead.  If
> you can get some peer reviewing on your design, that would be much better.
> Good APIs aren't easy to design (witness the lengthy discussion on I/O
> on this list), so it's unlikely that you can find a good one all by
> yourself in 30 minutes.

Actually, I'm a happy user of Str, but I find the absence in Ocaml of a 
functional "canonical" regexp feature striking.


>>There's one extra thing I'd like to point out. We have the 
>>chop_extension function. Why in the world is there no find_extension 
>>function?
>>find_extension "foo.bar" --> "bar"
> 
> 
> Why in the world would that be generally useful?  Remember, we don't
> shoot for completeness (it's unattainable anyway), just for
> usefulness.

find_extension would be no more and no less general than chop_extension. 
It's actually the dual of the latter, in a way. Again, it's no big deal, 
but it seems to me that the lack of a find_extension function in 
Filename is an involuntary omission rather than a well thought out 
design decision.

***

Are the following two pieces of code not dual of each other? Is this not 
a good enough reason to include the former?

let file_extension name =
   try
     let index = String.rindex name '.' + 1 in
     let ext_len = String.length name - index  in
       String.sub name index ext_len
   with Not_found ->
     invalid_arg "Xcaml.file_extension"

let chop_extension name =
   try
     String.sub name 0 (String.rindex name '.')
   with Not_found ->
     invalid_arg "Filename.chop_extension"

Duality is a property I consider of paramount importance in a formal 
model. Ocaml is a modeling language for algorithms. Duality is central 
to Ocaml, in my opinion. For this very same reason I once asked why such 
a crucial operator as ++ had not been included in Pervasives.
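
To make the duality concrete, here is a hedged sketch in which both
functions are projections of a single split. All three names
(split_extension, chop_extension_opt, find_extension_opt) are
hypothetical and not part of Filename. Unlike the two snippets above,
this version looks for the dot in the basename only and returns an
option instead of raising.

```ocaml
(* Hypothetical helper: split a name at the last dot of its basename.
   Returns None when the basename has no dot after its first character. *)
let split_extension name =
  let base = Filename.basename name in
  match String.rindex_opt base '.' with
  | Some i when i > 0 && Filename.check_suffix name base ->
    let ext_len = String.length base - i - 1 in
    let stem = String.sub name 0 (String.length name - ext_len - 1) in
    let ext = String.sub base (i + 1) ext_len in
    Some (stem, ext)
  | _ -> None

(* The two "dual" functions are the two projections of the split. *)
let chop_extension_opt name = Option.map fst (split_extension name)
let find_extension_opt name = Option.map snd (split_extension name)
```

Whenever the split succeeds, stem ^ "." ^ ext gives back the original
name, which is the property I would want any such pair to satisfy.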

Alex


* Re: [Caml-list] unix.chop_extension
  2004-05-26 10:34   ` skaller
@ 2004-05-26 13:27     ` Damien Doligez
  2004-05-26 15:50       ` skaller
  0 siblings, 1 reply; 37+ messages in thread
From: Damien Doligez @ 2004-05-26 13:27 UTC (permalink / raw)
  To: caml-list

On May 26, 2004, at 12:34, skaller wrote:

> For example:
>
> #include <stdio.h>
>
> Woops. stdio.h doesn't denote a file. Neither does ./stdio.h.

What ???  "stdio.h" does denote a file, and "./stdio.h" does
denote the same file.  That file may or may not exist, but if
you open it with either name, you will access the same file,
whether it exists or not.

> Neither does 'foobar'.

"foobar" also denotes a file.

>  But either all are equivalent

No, they are not equivalent.  Opening "foobar" does not access
the same file as opening "stdio.h".
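
A sketch of that reading of "equivalent", assuming a Unix-like system
and a writable temporary directory. The names same_file_by_io and demo
are made up for illustration. The file does not exist when the two
names are formed; it is created through one name and read back through
the other.

```ocaml
(* Write through one name and read back through the other; if both
   names denote the same file, the contents must agree. *)
let same_file_by_io name1 name2 =
  let oc = open_out name1 in
  output_string oc "marker";
  close_out oc;
  let ic = open_in name2 in
  let line = input_line ic in
  close_in ic;
  line = "marker"

let demo () =
  let cwd = Sys.getcwd () in
  (* Reserve a fresh name, then delete the file so that the name
     denotes a file that does not exist yet. *)
  let tmp = Filename.temp_file "equiv" ".txt" in
  Sys.remove tmp;
  Sys.chdir (Filename.dirname tmp);
  let x = Filename.basename tmp in
  let y = Filename.concat (Filename.dirname x) (Filename.basename x) in
  let existed_before = Sys.file_exists x in
  let same = same_file_by_io x y in
  Sys.remove x;
  Sys.chdir cwd;
  (x, y, existed_before, same)

let () =
  let x, y, existed_before, same = demo () in
  Printf.printf "%s vs %s: existed before = %b, same file = %b\n"
    x y existed_before same
```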

-- Damien


* Re: [Caml-list] unix.chop_extension
  2004-05-26 13:27     ` Damien Doligez
@ 2004-05-26 15:50       ` skaller
  2004-05-26 16:04         ` Damien Doligez
  0 siblings, 1 reply; 37+ messages in thread
From: skaller @ 2004-05-26 15:50 UTC (permalink / raw)
  To: Damien Doligez; +Cc: caml-list

On Wed, 2004-05-26 at 23:27, Damien Doligez wrote:
> On May 26, 2004, at 12:34, skaller wrote:
> 
> > For example:
> >
> > #include <stdio.h>
> >
> > Woops. stdio.h doesn't denote a file. Neither does ./stdio.h.
> 
> What ???  "stdio.h" does denote a file, and "./stdio.h" does
> denote the same file.  That file may or may not exist,

This is mathematically an ill formed statement.

You cannot say P(x), when x doesn't exist,
for a predicate P. That could lead to a contradiction.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net




* Re: [Caml-list] unix.chop_extension
  2004-05-26 15:50       ` skaller
@ 2004-05-26 16:04         ` Damien Doligez
  2004-05-27  4:33           ` skaller
  0 siblings, 1 reply; 37+ messages in thread
From: Damien Doligez @ 2004-05-26 16:04 UTC (permalink / raw)
  To: caml-list

On May 26, 2004, at 17:50, skaller wrote:

> This is mathematically an ill formed statement.
>
> You cannot say P(x), when x doesn't exist,
> for a predicate P. That could lead to a contradiction.

And yet... You can open a file that doesn't exist.

Obviously, the word "exist" is not used in filesystem semantics
with the meaning that you are implying.

-- Damien


* Re: [Caml-list] unix.chop_extension
  2004-05-26 11:21   ` Alex Baretta
@ 2004-05-26 16:43     ` Richard Jones
  2004-05-27  4:48       ` skaller
                         ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Richard Jones @ 2004-05-26 16:43 UTC (permalink / raw)
  Cc: Ocaml

On Wed, May 26, 2004 at 01:21:01PM +0200, Alex Baretta wrote:
> Actually, I'm a happy user of Str, but I find the absence in Ocaml of a 
> functional "canonical" regexp feature striking.

I'm fascinated to know what this "functional" API would look like.  I
use Pcre and it doesn't appear to have any global internal state
AFAICS ...

Rich.

-- 
Richard Jones. http://www.annexia.org/ http://www.j-london.com/
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
MOD_CAML lets you run type-safe Objective CAML programs inside the Apache
webserver. http://www.merjis.com/developers/mod_caml/



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-26 16:04         ` Damien Doligez
@ 2004-05-27  4:33           ` skaller
  2004-05-27  4:56             ` John Goerzen
  2004-05-28 16:44             ` Damien Doligez
  0 siblings, 2 replies; 37+ messages in thread
From: skaller @ 2004-05-27  4:33 UTC (permalink / raw)
  To: Damien Doligez; +Cc: caml-list

On Thu, 2004-05-27 at 02:04, Damien Doligez wrote:
> On May 26, 2004, at 17:50, skaller wrote:
> 
> > This is mathematically an ill formed statement.
> >
> > You cannot say P(x), when x doesn't exist,
> > for a predicate P. That could lead to a contradiction.
> 
> And yet... You can open a file that doesn't exist.

No you can't. You can call the open function
with a string for any string. For some strings
a file will be opened. For other strings
no file is opened, you get an error.

If I call open on 'fred' 'joe' and 'max' on my system
I will get an error in all three cases because I
do not have any 'fred' 'joe' and 'max' files.

So based on the behaviour of 'open' for those strings,
what can we say about the semantics of the Filename
module which constructed those strings? 

I'm sure you'd agree nothing can be said: either
Xavier's 'equivalent' definition applies and the
strings are equivalent because open has the same
behaviour for all three, or his definition does
not apply. In BOTH cases his definition is
inadequate, because clearly we'd agree these strings
are not in any way equivalent -- certainly
IF certain files existed such that the three
opens all worked, we'd find a non-equivalence.

Opening a file also depends on permissions,
and symlinks ...

So my conclusion is that Xavier's definition is a bad one
precisely because it does depend on the underlying
filesystem .. whereas Filename module itself is filesystem
independent.

So my belief is that Filename semantics ought to be
specified directly in terms of the strings manipulated.
Even though the *intent* may be to gain a particular
behaviour opening files.

In particular, concat "" x can generate

"./" ^ x

sometimes? Certainly dirname "x" can generate ".".
I found that surprising. I actually expected:

assert (x = concat (dirname x) (basename x))

and wrote code that depended on this isomorphism.
Belatedly I find this assertion doesn't hold.
I'm surprised. However I'm not claiming the
behaviour is wrong, but that it isn't clearly
specified what actually happens in the terms
needed to make the Filename module as useful
as it should be.
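
The surprise described above can be reproduced with a small sketch.
The helper name roundtrip is made up for illustration, and the
results mentioned in the comments assume a Unix-style Filename
implementation, where the current directory is written ".".

```ocaml
(* Hypothetical helper: split a path and glue it back together. *)
let roundtrip x =
  Filename.concat (Filename.dirname x) (Filename.basename x)

let () =
  (* On Unix-style systems, dirname "x" is ".", so the round trip
     is expected to turn "x" into "./x" and leave "a/b" unchanged. *)
  List.iter
    (fun x -> Printf.printf "%-6s -> %s\n" x (roundtrip x))
    [ "x"; "a/b"; "/a/b" ];
  (* The implicit/explicit distinction is not preserved either. *)
  Printf.printf "is_implicit: %b -> %b\n"
    (Filename.is_implicit "x")
    (Filename.is_implicit (roundtrip "x"))
```

So the two names may well denote the same file, yet they are not
equal as strings, and is_implicit gives different answers for them.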

In particular 'equivalent files' definition is
of no use to me, because pathname components
almost never refer to files.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net





^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-26 16:43     ` Richard Jones
@ 2004-05-27  4:48       ` skaller
  2004-05-27  7:46         ` Markus Mottl
  2004-05-27 17:29       ` brogoff
  2004-05-28 11:23       ` Alex Baretta
  2 siblings, 1 reply; 37+ messages in thread
From: skaller @ 2004-05-27  4:48 UTC (permalink / raw)
  To: Richard Jones; +Cc: Ocaml

On Thu, 2004-05-27 at 02:43, Richard Jones wrote:
> On Wed, May 26, 2004 at 01:21:01PM +0200, Alex Baretta wrote:
> > Actually, I'm a happy user of Str, but I find the absence in Ocaml of a 
> > functional "canonical" regexp feature striking.
> 
> I'm fascinated to know what this "functional" API would look like.  I
> use Pcre and it doesn't appear to have any global internal state
> AFAICS ...

AFAIK: The C Pcre that it wraps does use global variables 
and so while the interface appears re-entrant
it isn't.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net





^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-27  4:33           ` skaller
@ 2004-05-27  4:56             ` John Goerzen
  2004-05-28 16:44             ` Damien Doligez
  1 sibling, 0 replies; 37+ messages in thread
From: John Goerzen @ 2004-05-27  4:56 UTC (permalink / raw)
  To: skaller; +Cc: Damien Doligez, caml-list

On Thu, May 27, 2004 at 02:33:54PM +1000, skaller wrote:
> On Thu, 2004-05-27 at 02:04, Damien Doligez wrote:
> > On May 26, 2004, at 17:50, skaller wrote:
> > 
> > > This is mathematically an ill formed statement.
> > >
> > > You cannot say P(x), when x doesn't exist,
> > > for a predicate P. That could lead to a contradiction.
> > 
> > And yet... You can open a file that doesn't exist.
> 
> No you can't. You can call the open function
> with a string for any string. For some strings
> a file will be opened. For other strings
> no file is opened, you get an error.

Well.  You can have an open file that doesn't exist but you can't open a
file that doesn't exist.

If a file is unlinked after it's been opened, it is no longer accessible
from any name but its blocks are not reclaimed until after every process
that has it open has closed it.  Thus the open fd remains valid.

(these semantics may differ on non-POSIX platforms)
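
A minimal sketch of that behaviour, assuming a POSIX system (on
other platforms removing an open file may simply fail). The function
name read_after_unlink is made up for this example.

```ocaml
(* Write a line, open the file for reading, remove its name, and
   then read through the still-open channel. *)
let read_after_unlink () =
  let path = Filename.temp_file "unlink_demo" ".txt" in
  let oc = open_out path in
  output_string oc "hello\n";
  close_out oc;
  let ic = open_in path in
  Sys.remove path;                        (* the name is gone ... *)
  let name_gone = not (Sys.file_exists path) in
  let line = input_line ic in             (* ... the channel still reads *)
  close_in ic;
  (name_gone, line)

let () =
  let name_gone, line = read_after_unlink () in
  Printf.printf "name gone: %b, read: %s\n" name_gone line
```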

-- John



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-27  4:48       ` skaller
@ 2004-05-27  7:46         ` Markus Mottl
  2004-05-27  9:33           ` skaller
  0 siblings, 1 reply; 37+ messages in thread
From: Markus Mottl @ 2004-05-27  7:46 UTC (permalink / raw)
  To: skaller; +Cc: Richard Jones, Ocaml

On Thu, 27 May 2004, skaller wrote:
> AFAIK: The C Pcre that it wraps does use global variables 
> and so while the interface appears re-entrant
> it isn't.

To prevent people from getting a false impression of PCRE as interfaced
to OCaml: it _is_ safe.  Calling PCRE-functions from threads or executing
two different regular expressions in an intermittent way should never
crash your program or lead to unexpected results.

The use of global variables by the C-library does not necessarily imply
that the program is unsafe.  It all depends on their use, and in this
case the global variables (e.g. pcre_callout) are initialized exactly
once at startup time, i.e. before the user can access any functions.

Libraries implemented in imperative style may still have reentrant
interfaces.

-- 
Markus Mottl          http://www.oefai.at/~markus          markus@oefai.at



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-25  8:46 ` Alex Baretta
  2004-05-25  9:35   ` skaller
@ 2004-05-27  8:15   ` YANG Shouxun
  2004-05-27  9:47     ` skaller
  1 sibling, 1 reply; 37+ messages in thread
From: YANG Shouxun @ 2004-05-27  8:15 UTC (permalink / raw)
  To: caml-list

On Tuesday 25 May 2004 16:46, Alex Baretta wrote:
> skaller wrote:
> > # Filename.chop_extension "x.y/z";;
> > - : string = "x"
>
> This is a terrible consequence of not having (* functional *) support
> for regexps in the language or standard library. The Filename library
> uses the very weak functions of the String library to find the rightmost
> dot in the filename (* or path *), which is obviously correct only under
> a very stringent precondition, which is not the most general possible
> precondition for this function.

It seems to me that regexp is not necessary to fix this specific problem. 
Hopefully the following patch will get rid of it.

--8<--
--- filename.ml~        2003-12-17 20:23:26.000000000 +0800
+++ filename.ml 2004-05-27 16:11:08.000000000 +0800
@@ -207,10 +207,11 @@
   if n < 0 then invalid_arg "Filename.chop_suffix" else String.sub name 0 n

 let chop_extension name =
-  try
-    String.sub name 0 (String.rindex name '.')
-  with Not_found ->
-    invalid_arg "Filename.chop_extension"
+  let bn = Filename.basename name in
+    try
+      String.sub bn 0 (String.rindex bn '.')
+    with Not_found ->
+      invalid_arg "Filename.chop_extension"

 external open_desc: string -> open_flag list -> int -> int = "sys_open"
 external close_desc: int -> unit = "sys_close"
--8<--



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-27  7:46         ` Markus Mottl
@ 2004-05-27  9:33           ` skaller
  0 siblings, 0 replies; 37+ messages in thread
From: skaller @ 2004-05-27  9:33 UTC (permalink / raw)
  To: Markus Mottl; +Cc: Richard Jones, Ocaml

On Thu, 2004-05-27 at 17:46, Markus Mottl wrote:
> On Thu, 27 May 2004, skaller wrote:
> > AFAIK: The C Pcre that it wraps does use global variables 
> > and so while the interface appears re-entrant
> > it isn't.
> 
> To prevent people from getting a false impression of PCRE as interfaced
> to OCaml: it _is_ safe. 

You are using the word 'safe' imprecisely.

Reentrant is specific. So is thread-safe.
Re-entrant + no shared data structures implies thread-safe.
Thread safe doesn't imply re-entrant. In particular,
re-entrant is a stronger condition in the sense that
it implies all callbacks including asynchronous invocations
from signal handlers will work: thread safe code may
deadlock here.

> The use of global variables by the C-library does not necessarily imply
> that the program is unsafe.  It all depends on their use, and in this
> case the global variables (e.g. pcre_callout) are initialized exactly
> once at startup time, i.e. before the user can access any functions.

If this is so, the globals are merely constants, and the code 
will be re-entrant after the initialisation is complete.

However my examination of the C Pcre at one time
showed this was NOT the case. Instead, certain 
options were stored in global variables on every
call, and that implies the code isn't re-entrant
and cannot ever be made so.

I may have been wrong, and perhaps I was right but
Pcre has changed: I last looked some years ago
whilst working on Vyper .. 

As Markus comments, use of imperative style is
not the issue here. A function can easily
modify variables and be re-entrant provided
the variables are (rooted) on the stack.
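
To make the distinction concrete, here is an illustrative sketch.
It says nothing about how PCRE is actually written: join_global and
join_local are invented names, and the "option stored in a global on
every call" is just a separator character.

```ocaml
(* Not re-entrant: the option is stored in a global on every call. *)
let current_sep = ref ','

let join_global sep f items =
  current_sep := sep;
  let parts = List.map f items in   (* f may call join_global again *)
  String.concat (String.make 1 !current_sep) parts

(* Re-entrant: the option lives in the function's own arguments. *)
let join_local sep f items =
  let parts = List.map f items in
  String.concat (String.make 1 sep) parts

let () =
  let rows = [ [ 1; 2 ]; [ 3; 4 ] ] in
  (* The nested call overwrites the global used by the outer call. *)
  print_endline
    (join_global ';' (fun r -> join_global ',' string_of_int r) rows);
  print_endline
    (join_local ';' (fun r -> join_local ',' string_of_int r) rows)
```

The callback re-enters the function before the outer call has
finished with its option, which is exactly the situation a signal
handler or nested callback can create.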

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net





^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-27  8:15   ` YANG Shouxun
@ 2004-05-27  9:47     ` skaller
  0 siblings, 0 replies; 37+ messages in thread
From: skaller @ 2004-05-27  9:47 UTC (permalink / raw)
  To: YANG Shouxun; +Cc: caml-list

On Thu, 2004-05-27 at 18:15, YANG Shouxun wrote:

> +  let bn = Filename.basename name in
> +    try
> +      String.sub bn 0 (String.rindex bn '.')
> +    with Not_found ->
> +      invalid_arg "Filename.chop_extension"

But this code returns only the basename with extension chopped!

You actually need this:

let xchop_extension f =
  concat (dirname f) (chop_extension (basename f))

but the problem with that is it doesn't just
chop the extension .. it also adds a './'
to the start of the filename if it didn't
have a directory prefix.
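
A sketch of that effect, with the names qualified so it compiles
without opening Filename. The "./" result assumes a Unix-style
system; elsewhere the prefix would be whatever
Filename.current_dir_name and the local separator produce.

```ocaml
let xchop_extension f =
  Filename.concat (Filename.dirname f)
    (Filename.chop_extension (Filename.basename f))

let () =
  (* A path with a directory part keeps its shape ... *)
  print_endline (xchop_extension "a/b.c");
  (* ... but a bare filename gains a current-directory prefix. *)
  print_endline (xchop_extension "foo.ml")
```

For "x.y/z" this version raises Invalid_argument, since the basename
"z" has no extension to chop.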

So I actually wrote that and then stripped
any leading ./ off .. which is not generally
useful, because one might wish to preserve
that info.

It can be fixed by "even worse hackery" calling
is_implicit and is_relative before and after
the concatenation, and adjusting the result
to preserve those properties.

The right way to fix this is to rindex
for "/" first, and limit the scan for "."
by the result.
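
Roughly like this. It is only a sketch, not the standard library's
implementation: the name chop_extension_after_slash is made up, '/'
is assumed to be the only directory separator, and dot-files such as
".bashrc" get no special treatment.

```ocaml
let chop_extension_after_slash name =
  (* Index just past the last '/', or 0 if there is none. *)
  let start =
    match String.rindex_opt name '/' with
    | Some i -> i + 1
    | None -> 0
  in
  (* Only accept a '.' that lies inside the last path component. *)
  match String.rindex_opt name '.' with
  | Some dot when dot >= start -> String.sub name 0 dot
  | _ -> invalid_arg "chop_extension_after_slash"

let () =
  List.iter
    (fun p -> print_endline (chop_extension_after_slash p))
    [ "foo.ml"; "a/b.c"; "x.y/z.ml" ]
```

Because the directory part is never split off and re-joined, no "./"
is added and the implicit/relative status of the name is unchanged.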

BTW: many Win98 apps including IE get this totally wrong :(
Grrrr.... It scans for the FIRST "." and discards
the rest of the filename after the second one.
When I save 'x.y.jpg' I actually save 'x.y'
and the filetype is lost.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net





^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-26 16:43     ` Richard Jones
  2004-05-27  4:48       ` skaller
@ 2004-05-27 17:29       ` brogoff
  2004-05-28 12:00         ` Keith Wansbrough
  2004-05-28 11:23       ` Alex Baretta
  2 siblings, 1 reply; 37+ messages in thread
From: brogoff @ 2004-05-27 17:29 UTC (permalink / raw)
  To: Ocaml

On Wed, 26 May 2004, Richard Jones wrote:
> On Wed, May 26, 2004 at 01:21:01PM +0200, Alex Baretta wrote:
> > Actually, I'm a happy user of Str, but I find the absence in Ocaml of a
> > functional "canonical" regexp feature striking.
>
> I'm fascinated to know what this "functional" API would look like.  I
> use Pcre and it doesn't appear to have any global internal state
> AFAICS ...

As you note, the interface to Pcre, unlike Str, appears stateless.

If you examine some functional data structure libraries, like Martin Erwig's
FGL, or Gerard Huet's ZEN, you notice that they use the idea of "location +
context" (similar to the C++ STL, where the locations are iterators)  to
navigate the structure. For string processing, the context is the entire string,
and the location is the pair of endpoints of the current string. So it seems
that the right way to a functional API for string processing would be in terms
of substrings, { base : string; left : int; right : int } or just
string * int * int, which is what is done in the SML Basis. The idea extends
naturally to regexp scanning, and it's easy to see how to modify the Str
interface to use substrings, eliminating Not_found exceptions by considering
an empty substring to be a failed match.

Of course, substrings make sense when the underlying string is array like, not
in FPLs like Haskell which use lists of chars as strings, but I consider that an
egregious error of Haskell.
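
A minimal sketch of such a substring type. The record fields follow
the description above; the half-open convention [left, right) and
the function names (of_string, span and so on) are assumptions made
for this example, not an existing library.

```ocaml
(* A view into base covering indices left .. right-1. *)
type substring = { base : string; left : int; right : int }

let of_string s = { base = s; left = 0; right = String.length s }
let to_string ss = String.sub ss.base ss.left (ss.right - ss.left)
let is_empty ss = ss.left >= ss.right

(* Split into the longest prefix satisfying p and the remainder.
   No characters are copied; both results share the same base. *)
let span p ss =
  let rec go i =
    if i < ss.right && p ss.base.[i] then go (i + 1) else i
  in
  let stop = go ss.left in
  ({ ss with right = stop }, { ss with left = stop })

let is_digit c = c >= '0' && c <= '9'

let () =
  let digits, rest = span is_digit (of_string "123abc") in
  Printf.printf "%s | %s\n" (to_string digits) (to_string rest)
```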

-- Brian



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-26 16:43     ` Richard Jones
  2004-05-27  4:48       ` skaller
  2004-05-27 17:29       ` brogoff
@ 2004-05-28 11:23       ` Alex Baretta
  2 siblings, 0 replies; 37+ messages in thread
From: Alex Baretta @ 2004-05-28 11:23 UTC (permalink / raw)
  To: Richard Jones, caml >> Ocaml

Richard Jones wrote:
> On Wed, May 26, 2004 at 01:21:01PM +0200, Alex Baretta wrote:
> 
>>Actually, I'm a happy user of Str, but I find the absence in Ocaml of a 
>>functional "canonical" regexp feature striking.
> 
> 
> I'm fascinated to know what this "functional" API would look like.  I
> use Pcre and it doesn't appear to have any global internal state
> AFAICS ...
> 
> Rich.
> 

I simply don't know PCRE well enough to discuss it. I'm not arguing in 
favor or against any specific API. I just observed that it is striking 
that the support available in the core ocaml distribution has a 
procedural API rather than a functional one. This does not imply that 
Str is any worse or better than PCRE.

Alex



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-27 17:29       ` brogoff
@ 2004-05-28 12:00         ` Keith Wansbrough
  2004-05-28 16:43           ` brogoff
  0 siblings, 1 reply; 37+ messages in thread
From: Keith Wansbrough @ 2004-05-28 12:00 UTC (permalink / raw)
  To: brogoff; +Cc: caml-list

> interface to use substrings, eliminating Not_found exceptions by considering
> an empty substring to be a failed match.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It's not clear that this is always the right thing to do... \([0-9]*\)
can succeed but return an empty substring.

--KW 8-)



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-28 12:00         ` Keith Wansbrough
@ 2004-05-28 16:43           ` brogoff
  2004-05-28 17:49             ` Marcin 'Qrczak' Kowalczyk
  0 siblings, 1 reply; 37+ messages in thread
From: brogoff @ 2004-05-28 16:43 UTC (permalink / raw)
  To: Keith Wansbrough; +Cc: caml-list

On Fri, 28 May 2004, Keith Wansbrough wrote:
> > interface to use substrings, eliminating Not_found exceptions by considering
> > an empty substring to be a failed match.
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> It's not clear that this is always the right thing to do... \([0-9]*\)
> can succeed but return an empty substring.

Point taken. There's still a lot of "out of band" values that could be used to
represent failure, such as substrings with negative indices. And of course,
exceptions are perfectly fine for ML, though their functionalness is
arguable, and a Clean (or Haskell 2, where they'll hopefully fix this!)
substring library wouldn't have them.

-- Brian



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-27  4:33           ` skaller
  2004-05-27  4:56             ` John Goerzen
@ 2004-05-28 16:44             ` Damien Doligez
  2004-05-28 19:34               ` skaller
  1 sibling, 1 reply; 37+ messages in thread
From: Damien Doligez @ 2004-05-28 16:44 UTC (permalink / raw)
  To: caml-list

On May 27, 2004, at 06:33, skaller wrote:

> If I call open on 'fred' 'joe' and 'max' on my system
> I will get an error in all three cases because I
> do not have any 'fred' 'joe' and 'max' files.

Try this:

     open_out "fred";;

> In particular 'equivalent files' definition is
> of no use to me, because pathname components
> almost never refer to files.

Please read the documentation.  It doesn't talk about equivalent
files, but about equivalent names.  Two names are equivalent if
they refer to the same file.

-- Damien



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-28 16:43           ` brogoff
@ 2004-05-28 17:49             ` Marcin 'Qrczak' Kowalczyk
  0 siblings, 0 replies; 37+ messages in thread
From: Marcin 'Qrczak' Kowalczyk @ 2004-05-28 17:49 UTC (permalink / raw)
  To: brogoff; +Cc: Keith Wansbrough, caml-list

On Fri, 28-05-2004, at 09:43 -0700, brogoff wrote:

> Point taken. There's still a lot of "out of band" values that could be used to
> represent failure, such as substrings with negative indices.

I don't think this is a good idea, precisely because it's an "in band"
encoding. Match failure should be indicated either by None instead of
Some whatever, or by a predicate on the whole match object, or with an
exception - but not with a special substring.
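
A sketch of the None/Some approach, using hand-written matchers for
[0-9]* and [0-9]+ rather than a real regexp engine. All names are
invented for illustration, and pos is assumed to be a valid index
between 0 and the string length.

```ocaml
let is_digit c = c >= '0' && c <= '9'

(* End index of the run of digits starting at pos. *)
let digits_end s pos =
  let rec go i =
    if i < String.length s && is_digit s.[i] then go (i + 1) else i
  in
  go pos

(* Like [0-9]*: always matches, possibly the empty range. *)
let match_digits_star s pos = Some (pos, digits_end s pos)

(* Like [0-9]+: fails (None) when there is no digit at pos. *)
let match_digits_plus s pos =
  let stop = digits_end s pos in
  if stop > pos then Some (pos, stop) else None

let () =
  let show = function
    | Some (l, r) -> Printf.sprintf "Some (%d, %d)" l r
    | None -> "None"
  in
  print_endline (show (match_digits_star "abc" 0));
  print_endline (show (match_digits_plus "abc" 0))
```

A successful empty match and a failed match are then different
values, which a special "empty" or "negative" substring cannot
express without overloading the result.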

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-28 16:44             ` Damien Doligez
@ 2004-05-28 19:34               ` skaller
  2004-05-29  8:37                 ` Damien Doligez
  0 siblings, 1 reply; 37+ messages in thread
From: skaller @ 2004-05-28 19:34 UTC (permalink / raw)
  To: Damien Doligez; +Cc: caml-list

On Sat, 2004-05-29 at 02:44, Damien Doligez wrote:
> On May 27, 2004, at 06:33, skaller wrote:
> 
> > If I call open on 'fred' 'joe' and 'max' on my system
> > I will get an error in all three cases because I
> > do not have any 'fred' 'joe' and 'max' files.
> 
> Try this:
> 
>      open_out "fred";;

Ah, point taken! I was reading files, didn't
even consider open_out :)
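
The difference is easy to check with a sketch like the following.
The function name open_creates is made up, and a temporary file name
is used instead of "fred" so nothing in the current directory is
touched.

```ocaml
let open_creates () =
  (* Get a fresh name, then remove the file so the name is unused. *)
  let path = Filename.temp_file "open_demo" ".txt" in
  Sys.remove path;
  let before = Sys.file_exists path in
  (* Opening for reading fails on a name with no file behind it. *)
  let read_fails =
    match open_in path with
    | ic -> close_in ic; false
    | exception Sys_error _ -> true
  in
  (* Opening for writing succeeds and creates the file. *)
  let oc = open_out path in
  close_out oc;
  let after = Sys.file_exists path in
  Sys.remove path;
  (before, read_fails, after)

let () =
  let before, read_fails, after = open_creates () in
  Printf.printf "exists before: %b, open_in fails: %b, exists after: %b\n"
    before read_fails after
```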

> > In particular 'equivalent files' definition is
> > of no use to me, because pathname components
> > almost never refer to files.
> 
> Please read the documentation.  

I did. You didn't follow the full discussion.

> It doesn't talk about equivalent
> files, but about equivalent names.  

yeah, and I had no idea what that meant.
So I asked and Xavier replied:

> Two names are equivalent if
> they refer to the same file.

and that characterisation is what I have been disputing.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net





^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-28 19:34               ` skaller
@ 2004-05-29  8:37                 ` Damien Doligez
  2004-05-29 10:01                   ` skaller
  0 siblings, 1 reply; 37+ messages in thread
From: Damien Doligez @ 2004-05-29  8:37 UTC (permalink / raw)
  To: caml-list

On May 28, 2004, at 21:34, skaller wrote:

>> Two names are equivalent if
>> they refer to the same file.
>
> and that characterisation is what I have been disputing.

More precisely, two names are equivalent if all file-system
operations give the same result and side effects when called
with either name.

-- Damien



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-29  8:37                 ` Damien Doligez
@ 2004-05-29 10:01                   ` skaller
  2004-05-29 16:02                     ` David Brown
  0 siblings, 1 reply; 37+ messages in thread
From: skaller @ 2004-05-29 10:01 UTC (permalink / raw)
  To: Damien Doligez; +Cc: caml-list

On Sat, 2004-05-29 at 18:37, Damien Doligez wrote:

> More precisely, two names are equivalent if all file-system
> operations give the same result and side effects when called
> with either name.

Right: this is a much more complete definition.
However it is still wrong :)

For example: I have no write permission
on the file system, so I can't open_out any files.

It also isn't entirely clear how you would actually
*measure* 'same result' in all cases -- normally
this is obvious, but there will be nasty cases:
the problem is that it is the nasty cases that
are at issue here. opening x and ./x isn't
the same thing on Unix due to permissions distinctions?

My experience on ISO C++ committee suggests this:

The kind of definition you're giving is considered
*motivation*. We want to consider two filenames
equivalent 'roughly in the sense of the operational
behaviour on an actual file system'.

But whilst it provides motivation, such a definition
can't be used as a normative specification. It just
isn't abstract enough or independent enough of
vagaries of some actual file system.

So instead we're forced to define the semantics
entirely in terms of the actual string operations,
and instead of promising behaviour on an existing
file system, state it as an intended consequence
of the specification.

EXAMPLE: chop_extension doesn't do what I expect.
But there is no problem with it actually meeting
its specification. There is no bug in it. 
However, the specification does not satisfy the
motivation so Xavier might actually change it.
I would argue the specification here is of the
correct kind, and if the function is changed,
the reasoning will be that the specification
doesn't match user expectations .. not that there
is a problem with deciding what the function actually
does, or whether the implementation is correct.

I know this sounds pedantic. However the definition
in terms of strings is actually more useful: for
example I can use the functions to manipulate
pathname components such that I have strings
which are never intended to correspond to filenames.
Of course, I will usually intend that *eventually*
I'm constructing filenames.

But even that isn't always the case: in flxcc program
I have to synthesise a module name from a filename.
The module name is closely related to the filename,
but isn't allowed to have non-identifier characters inside.

I translate  weird characters such as '.' '/' etc to '_'.
For this purpose clearly:

stdio.h and ./stdio.h

are not equivalent. I've spent the last two days dealing
with this issue, which arose simply because I didn't
consider that concat (dirname x) (basename x) might
not produce a filename equal to x: clearly the
equivalence of names isn't enough for me here:
I need equality. I have had to fix this by throwing
out leading './' ..
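
As a sketch of why equality matters here (this is not the flxcc
code; both function names are hypothetical, and the character
mapping is an assumed stand-in for the real translation):

```ocaml
(* Replace every non-identifier character with '_'. *)
let module_name_of_path path =
  String.map
    (function
      | ('a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '_') as c -> c
      | _ -> '_')
    path

(* Drop one leading "./", if present. *)
let strip_leading_dot_slash path =
  let n = String.length path in
  if n >= 2 && String.sub path 0 2 = "./" then String.sub path 2 (n - 2)
  else path

let () =
  print_endline (module_name_of_path "stdio.h");
  print_endline (module_name_of_path "./stdio.h");
  print_endline (module_name_of_path (strip_leading_dot_slash "./stdio.h"))
```

Two names that a file system would treat alike produce different
module names unless the leading "./" is removed first.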

I also have no idea if, for example dirname might
include a trailing '/' or not. Again, it matters,
because I need to do more processing than what Filename
module supports: I'm left with the choice of
(1) "do the whole thing yourself" or (2) "guess at what
strings Filename module actually produces" and
write some implementation dependent code.. :(

I'm doing (2) at the moment but will probably
move to (3) use Sylvain Le Gall's fileutils module.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net





^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Caml-list] unix.chop_extension
  2004-05-29 10:01                   ` skaller
@ 2004-05-29 16:02                     ` David Brown
  0 siblings, 0 replies; 37+ messages in thread
From: David Brown @ 2004-05-29 16:02 UTC (permalink / raw)
  To: skaller; +Cc: Damien Doligez, caml-list

On Sat, May 29, 2004 at 08:01:10PM +1000, skaller wrote:

> It also isn't entirely clear how you would actually
> *measure* 'same result' in all cases -- normally
> this is obvious, but there will be nasty cases:
> the problem is that it is the nasty cases that
> are at issue here. opening x and ./x isn't
> the same thing on Unix due to permissions distinctions?

What nasty case are you thinking of?  Opening both "x" and "./x" will
require search (x) permission on the current directory, and will either
both work, or both fail in the same situations.

However, that is about the only "equivalent" path that is really
equivalent.  "foo/../x" and "x" may not be equivalent from a permission
perspective (or not at all, say if 'foo' is a symlink).

A good description might be: there are many paths that can possibly
refer to the same file.  A string manipulation library routine can
certainly come up with pathnames that are potentially equivalent.  To
determine real equivalence, something like 'realpath' should be used.
Of course, according to the Linux manpage for realpath:

  "Never use this function.  It is broken by design since it is
  impossible to determine a suitable size for the output buffer."

It wouldn't be very hard to implement realpath in ocaml, just using the
Unix module.  It could even be done safely :-)

Dave



^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2004-05-29 16:03 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-05-24 20:04 [Caml-list] unix.chop_extension skaller
2004-05-24 22:01 ` skaller
2004-05-25  8:46 ` Alex Baretta
2004-05-25  9:35   ` skaller
2004-05-25  9:46     ` Alain Frisch
2004-05-25 10:47       ` skaller
2004-05-25 11:51         ` sejourne kevin
2004-05-26 11:18           ` Florian Hars
2004-05-25 14:06         ` [Caml-list] Re: AAP (was: unix.chop_extension) Christophe TROESTLER
2004-05-25 13:37     ` [Caml-list] unix.chop_extension John Goerzen
2004-05-25 19:17     ` Richard Jones
2004-05-27  8:15   ` YANG Shouxun
2004-05-27  9:47     ` skaller
2004-05-26  9:05 ` Xavier Leroy
2004-05-26  9:35   ` Luca Pascali
2004-05-26  9:56   ` Remi Vanicat
2004-05-26 10:34   ` skaller
2004-05-26 13:27     ` Damien Doligez
2004-05-26 15:50       ` skaller
2004-05-26 16:04         ` Damien Doligez
2004-05-27  4:33           ` skaller
2004-05-27  4:56             ` John Goerzen
2004-05-28 16:44             ` Damien Doligez
2004-05-28 19:34               ` skaller
2004-05-29  8:37                 ` Damien Doligez
2004-05-29 10:01                   ` skaller
2004-05-29 16:02                     ` David Brown
2004-05-26 11:21   ` Alex Baretta
2004-05-26 16:43     ` Richard Jones
2004-05-27  4:48       ` skaller
2004-05-27  7:46         ` Markus Mottl
2004-05-27  9:33           ` skaller
2004-05-27 17:29       ` brogoff
2004-05-28 12:00         ` Keith Wansbrough
2004-05-28 16:43           ` brogoff
2004-05-28 17:49             ` Marcin 'Qrczak' Kowalczyk
2004-05-28 11:23       ` Alex Baretta
