* Re: [Caml-list] unix.chop_extension
2004-05-26 9:05 ` Xavier Leroy
@ 2004-05-26 9:35 ` Luca Pascali
2004-05-26 9:56 ` Remi Vanicat
` (2 subsequent siblings)
3 siblings, 0 replies; 37+ messages in thread
From: Luca Pascali @ 2004-05-26 9:35 UTC (permalink / raw)
To: Xavier Leroy; +Cc: skaller, caml-list
Xavier Leroy wrote:
[CUT]
>>There's one extra thing I'd like to point out. We have the
>>chop_extension function. Why in the world is there no find_extension
>>function?
>>find_extension "foo.bar" --> "bar"
>>
>>
>
>Why in the world would that be generally useful? Remember, we don't
>shoot for completeness (it's unattainable anyway), just for
>usefulness.
>
>- Xavier Leroy
>
>
>
>
IMHO, this is more a matter of usefulness than of
completeness.
I mean that there are systems (like Windows) that use only the extension
of the file to decide how to execute or how to read the file itself,
and in any case the extension is useful for classifying a file in any kind of
application (such as compilers, network applications, graphics editors,
and so on).
So a developer has to write their own function to extract the extension
from a filename every time they need one. It's not a big deal, but ...
My opinion.
Luca
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
* Re: [Caml-list] unix.chop_extension
2004-05-26 9:05 ` Xavier Leroy
2004-05-26 9:35 ` Luca Pascali
@ 2004-05-26 9:56 ` Remi Vanicat
2004-05-26 10:34 ` skaller
2004-05-26 11:21 ` Alex Baretta
3 siblings, 0 replies; 37+ messages in thread
From: Remi Vanicat @ 2004-05-26 9:56 UTC (permalink / raw)
To: caml-list
Xavier Leroy <xavier.leroy@inria.fr> writes:
[...]
>> There's one extra thing I'd like to point out. We have the
>> chop_extension function. Why in the world is there no find_extension
>> function?
>> find_extension "foo.bar" --> "bar"
>
> Why in the world would that be generally useful? Remember, we don't
> shoot for completeness (it's unattainable anyway), just for
> usefulness.
Well, it is useful when one wants to use a different parser (or, more
generally, a different "file reader") for different types of file (gif,
jpg, ...), or to launch a different helper program for each type of
file. (Well, I agree that using a "magic number" is a better
way to do it, but using the extension is easier.)
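
A minimal sketch of the kind of dispatch described above. It assumes a hypothetical find_extension that returns an option (no such function is confirmed by this thread), OCaml 4.08 or later for Option.map, and a made-up reader type used only for illustration. Like the naive versions discussed in this thread, it does not treat directory separators specially.

```ocaml
(* Hypothetical helper: NOT part of the Filename module discussed here.
   Returns the text after the last '.', or None if there is no '.'. *)
let find_extension name =
  match String.rindex_opt name '.' with
  | Some i -> Some (String.sub name (i + 1) (String.length name - i - 1))
  | None -> None

(* Illustrative "file readers"; the constructors are invented for this sketch. *)
type reader = Gif_reader | Jpeg_reader | Unknown_reader

(* Pick a reader from the extension, ignoring case. *)
let reader_for name =
  match Option.map String.lowercase_ascii (find_extension name) with
  | Some "gif" -> Gif_reader
  | Some ("jpg" | "jpeg") -> Jpeg_reader
  | _ -> Unknown_reader
```

A magic-number check would replace reader_for with a function that inspects the first bytes of the file instead of its name.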
--
Rémi Vanicat
* Re: [Caml-list] unix.chop_extension
2004-05-26 9:05 ` Xavier Leroy
2004-05-26 9:35 ` Luca Pascali
2004-05-26 9:56 ` Remi Vanicat
@ 2004-05-26 10:34 ` skaller
2004-05-26 13:27 ` Damien Doligez
2004-05-26 11:21 ` Alex Baretta
3 siblings, 1 reply; 37+ messages in thread
From: skaller @ 2004-05-26 10:34 UTC (permalink / raw)
To: Xavier Leroy; +Cc: caml-list
On Wed, 2004-05-26 at 19:05, Xavier Leroy wrote:
> John Skaller writes:
>
> > # Filename.chop_extension "x.y/z";;
> > - : string = "x"
> >
> > Oh come on. This is correct according to the specs,
> > but no one would believe this function is chopping
> > off the extension here :)
>
> Care to submit a bug report for that?
Well it isn't a bug .. it actually does what the
documentation says it does. It isn't clear whether changing
the semantics to a more 'sensible' interpretation
would break work-arounds etc.. so perhaps a 'feature request'?
Does anyone have code that would break if the chop_extension
function semantics were changed?
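
As a sketch of what a more 'sensible' interpretation could look like: only a '.' that occurs after the last '/' counts as an extension separator. The function name is hypothetical, the code assumes Unix-style '/' separators and OCaml 4.05 or later (for String.rindex_opt), and the choice to reject a leading dot in the basename is an assumption of this sketch, not something proposed in the thread.

```ocaml
(* Hypothetical alternative to chop_extension: look for the extension only
   inside the last path component. Assumes '/' is the directory separator. *)
let chop_extension_in_basename name =
  (* Index of the first character of the last path component. *)
  let base_start =
    match String.rindex_opt name '/' with
    | Some i -> i + 1
    | None -> 0
  in
  match String.rindex_opt name '.' with
  (* The '.' must lie strictly inside the basename; a basename that merely
     starts with '.' (such as ".profile") is treated as having no extension. *)
  | Some i when i > base_start -> String.sub name 0 i
  | _ -> invalid_arg "chop_extension_in_basename"
```

With this definition "x.y/z" is rejected with Invalid_argument instead of yielding "x", which is exactly the kind of behavioural change that could break existing work-arounds.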
> > Also
> > let y = (concat (dirname x) (basename x)) in
> > assert (x = y)
> > is not guaranteed, only that x,y are 'equivalent'.
> > What does that mean?
>
> That both names (x and y) refer to the same file. You access the same
> file if you open one or the other.
What if neither denotes a file?
For example:
#include <stdio.h>
Woops. stdio.h doesn't denote a file. Neither does ./stdio.h.
Neither does 'foobar'. But either all are equivalent
(in which case I'd argue Filename is useless) or Filename
doesn't have defined semantics (in which case its utility
is limited and depends on the actual filesystem present).
The actual implementation is useful however, at least
on Unix: I don't want that leading ./ but I can remove
it and I know when to.. however the code that does that
isn't platform independent, and even on Unix the documentation
doesn't guarantee it: I'm not sure dirname "" = "."
rather than say "" or perhaps ./. :)
> Concerning the Str interface, I received several complaints about it,
> to which I replied "why don't you propose an alternate functional API?",
> to which I never got any reply. So, if you have ideas, go ahead.
Part of the reason may be that people don't feel encouraged that
a proposal would make it into the core distribution.
Designing a good interface can be quite difficult and a lot
of work, particularly because one really needs to have multiple
people consider it, and also actually implement it.
In the past I spent years of my life and thousands of dollars
doing this where there was actually a full formal process for
making proposals, and I myself had considerable political power.
If INRIA people would offer just a little encouragement and direction
in this area it might go a long way. Some comments on the content
of Extlib would be nice .. that project quite specifically
*intends* to be incorporated into the main distro (and I'd sure
like to see a variable length array make it ..)
> If
> you can get some peer reviewing on your design, that would be much better.
> Good APIs aren't easy to design (witness the lengthy discussion on I/O
> on this list), so it's unlikely that you can find a good one all by
> yourself in 30 minutes.
Total agreement!
--
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net
* Re: [Caml-list] unix.chop_extension
2004-05-26 10:34 ` skaller
@ 2004-05-26 13:27 ` Damien Doligez
2004-05-26 15:50 ` skaller
0 siblings, 1 reply; 37+ messages in thread
From: Damien Doligez @ 2004-05-26 13:27 UTC (permalink / raw)
To: caml-list
On May 26, 2004, at 12:34, skaller wrote:
> For example:
>
> #include <stdio.h>
>
> Woops. stdio.h doesn't denote a file. Neither does ./stdio.h.
What ??? "stdio.h" does denote a file, and "./stdio.h" does
denote the same file. That file may or may not exist, but if
you open it with either name, you will access the same file,
whether it exists or not.
> Neither does 'foobar'.
"foobar" also denotes a file.
> But either all are equivalent
No, they are not equivalent. Opening "foobar" does not access
the same file as opening "stdio.h".
-- Damien
* Re: [Caml-list] unix.chop_extension
2004-05-26 13:27 ` Damien Doligez
@ 2004-05-26 15:50 ` skaller
2004-05-26 16:04 ` Damien Doligez
0 siblings, 1 reply; 37+ messages in thread
From: skaller @ 2004-05-26 15:50 UTC (permalink / raw)
To: Damien Doligez; +Cc: caml-list
On Wed, 2004-05-26 at 23:27, Damien Doligez wrote:
> On May 26, 2004, at 12:34, skaller wrote:
>
> > For example:
> >
> > #include <stdio.h>
> >
> > Woops. stdio.h doesn't denote a file. Neither does ./stdio.h.
>
> What ??? "stdio.h" does denote a file, and "./stdio.h" does
> denote the same file. That file may or may not exist,
This is mathematically an ill formed statement.
You cannot say P(x), when x doesn't exist,
for a predicate P. That could lead to a contradiction.
--
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net
* Re: [Caml-list] unix.chop_extension
2004-05-26 15:50 ` skaller
@ 2004-05-26 16:04 ` Damien Doligez
2004-05-27 4:33 ` skaller
0 siblings, 1 reply; 37+ messages in thread
From: Damien Doligez @ 2004-05-26 16:04 UTC (permalink / raw)
To: caml-list
On May 26, 2004, at 17:50, skaller wrote:
> This is mathematically an ill formed statement.
>
> You cannot say P(x), when x doesn't exist,
> for a predicate P. That could lead to a contradiction.
And yet... You can open a file that doesn't exist.
Obviously, the word "exist" is not used in filesystem semantics
with the meaning that you are implying.
-- Damien
* Re: [Caml-list] unix.chop_extension
2004-05-26 16:04 ` Damien Doligez
@ 2004-05-27 4:33 ` skaller
2004-05-27 4:56 ` John Goerzen
2004-05-28 16:44 ` Damien Doligez
0 siblings, 2 replies; 37+ messages in thread
From: skaller @ 2004-05-27 4:33 UTC (permalink / raw)
To: Damien Doligez; +Cc: caml-list
On Thu, 2004-05-27 at 02:04, Damien Doligez wrote:
> On May 26, 2004, at 17:50, skaller wrote:
>
> > This is mathematically an ill formed statement.
> >
> > You cannot say P(x), when x doesn't exist,
> > for a predicate P. That could lead to a contradiction.
>
> And yet... You can open a file that doesn't exist.
No you can't. You can call the open function
with a string for any string. For some strings
a file will be opened. For other strings
no file is opened, you get an error.
If I call open on 'fred' 'joe' and 'max' on my system
I will get an error in all three cases because I
do not have any 'fred' 'joe' and 'max' files.
So based on the behaviour of 'open' for those strings,
what can we say about the semantics of the Filename
module which constructed those strings?
I'm sure you'd agree nothing can be said: either
Xavier's 'equivalent' definition applies and the
strings are equivalent because open has the same
behaviour for all three, or his definition does
not apply. In BOTH cases his definition is
inadequate because clearly we'd agree these strings
are not in any way equivalent -- certainly
IF certain files existed such that the three
open's all worked, we'd find a non-equivalence.
Opening a file also depends on permissions,
and symlinks ...
So my conclusion is that Xavier's definition is a bad one
precisely because it does depend on the underlying
filesystem .. whereas Filename module itself is filesystem
independent.
So my belief is that Filename semantics ought to be
specified directly in terms of the strings manipulated.
Even though the *intent* may be to gain a particular
behaviour opening files.
In particular, concat "" x can generate
"./" ^ x
sometimes? Certainly dirname "x" can generate ".".
I found that surprising. I actually expected:
assert (x = concat (dirname x) (basename x))
and wrote code that depended on this isomorphism.
Belatedly I find this assertion doesn't hold.
I'm surprised. However I'm not claiming the
behaviour is wrong, but that it isn't clearly
specified what actually happens in the terms
needed to make the Filename module as useful
as it should be.
In particular 'equivalent files' definition is
of no use to me, because pathname components
almost never refer to files.
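
The failed round trip can be observed directly with a small sketch. The helper name round_trip is made up for illustration, and the behaviour noted in the comments is what the Unix implementation of Filename produces; other platforms use different separators.

```ocaml
(* Rebuild a name from its directory part and its base part. *)
let round_trip x = Filename.concat (Filename.dirname x) (Filename.basename x)

(* On Unix, a name with an explicit directory part survives the round trip,
   but a bare name comes back with a leading "./" because dirname yields ".". *)
let () =
  List.iter
    (fun x -> Printf.printf "%S -> %S (equal: %b)\n" x (round_trip x) (round_trip x = x))
    [ "sys/types.h"; "./stdio.h"; "stdio.h" ]
```

So the result is an equivalent name in the file-opening sense, but not an equal string, which matters as soon as the string is used for anything other than opening a file.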
--
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net
* Re: [Caml-list] unix.chop_extension
2004-05-27 4:33 ` skaller
@ 2004-05-27 4:56 ` John Goerzen
2004-05-28 16:44 ` Damien Doligez
1 sibling, 0 replies; 37+ messages in thread
From: John Goerzen @ 2004-05-27 4:56 UTC (permalink / raw)
To: skaller; +Cc: Damien Doligez, caml-list
On Thu, May 27, 2004 at 02:33:54PM +1000, skaller wrote:
> On Thu, 2004-05-27 at 02:04, Damien Doligez wrote:
> > On May 26, 2004, at 17:50, skaller wrote:
> >
> > > This is mathematically an ill formed statement.
> > >
> > > You cannot say P(x), when x doesn't exist,
> > > for a predicate P. That could lead to a contradiction.
> >
> > And yet... You can open a file that doesn't exist.
>
> No you can't. You can call the open function
> with a string for any string. For some strings
> a file will be opened. For other strings
> no file is opened, you get an error.
Well. You can have an open file that doesn't exist but you can't open a
file that doesn't exist.
If a file is unlinked after it's been opened, it is no longer accessible
from any name but its blocks are not reclaimed until after every process
that has it open has closed it. Thus the open fd remains valid.
(these semantics may differ on non-POSIX platforms)
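
A small sketch of that behaviour using only the standard library, assuming a POSIX system (on Windows, removing an open file is expected to fail instead). The function name is hypothetical.

```ocaml
(* Create a temporary file, open it for reading, remove its name, and then
   read from the still-open channel. Returns (name is gone, line read). *)
let read_after_unlink () =
  let path = Filename.temp_file "unlink_demo" ".txt" in
  let oc = open_out path in
  output_string oc "still here\n";
  close_out oc;
  let ic = open_in path in
  Sys.remove path;                      (* the name no longer exists ... *)
  let gone = not (Sys.file_exists path) in
  let line = input_line ic in           (* ... but the open file is still readable *)
  close_in ic;
  (gone, line)
```

On a POSIX system this should return (true, "still here"): the file has no name any more, yet the open channel still reaches its contents.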
-- John
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [Caml-list] unix.chop_extension
2004-05-27 4:33 ` skaller
2004-05-27 4:56 ` John Goerzen
@ 2004-05-28 16:44 ` Damien Doligez
2004-05-28 19:34 ` skaller
1 sibling, 1 reply; 37+ messages in thread
From: Damien Doligez @ 2004-05-28 16:44 UTC (permalink / raw)
To: caml-list
On May 27, 2004, at 06:33, skaller wrote:
> If I call open on 'fred' 'joe' and 'max' on my system
> I will get an error in all three cases because I
> do not have any 'fred' 'joe' and 'max' files.
Try this:
open_out "fred";;
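
To spell the point out as a sketch: opening a name that refers to no existing file fails for reading but succeeds for writing, because open_out creates the file. The helper name is hypothetical, and a temporary name stands in for "fred" so that nothing is left behind in the current directory; the match ... with exception form assumes OCaml 4.02 or later.

```ocaml
(* Returns (open_in failed, open_out created the file). *)
let read_then_create () =
  let path = Filename.temp_file "fred" "" in
  Sys.remove path;                       (* now the name refers to no existing file *)
  let read_fails =
    match open_in path with
    | ic -> close_in ic; false
    | exception Sys_error _ -> true      (* reading a missing file is an error *)
  in
  let oc = open_out path in              (* writing creates the file *)
  close_out oc;
  let created = Sys.file_exists path in
  Sys.remove path;
  (read_fails, created)
```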
> In particular 'equivalent files' definition is
> of no use to me, because pathname components
> almost never refer to files.
Please read the documentation. It doesn't talk about equivalent
files, but about equivalent names. Two names are equivalent if
they refer to the same file.
-- Damien
* Re: [Caml-list] unix.chop_extension
2004-05-28 16:44 ` Damien Doligez
@ 2004-05-28 19:34 ` skaller
2004-05-29 8:37 ` Damien Doligez
0 siblings, 1 reply; 37+ messages in thread
From: skaller @ 2004-05-28 19:34 UTC (permalink / raw)
To: Damien Doligez; +Cc: caml-list
On Sat, 2004-05-29 at 02:44, Damien Doligez wrote:
> On May 27, 2004, at 06:33, skaller wrote:
>
> > If I call open on 'fred' 'joe' and 'max' on my system
> > I will get an error in all three cases because I
> > do not have any 'fred' 'joe' and 'max' files.
>
> Try this:
>
> open_out "fred";;
Ah, point taken! I was reading files, didn't
even consider open_out :)
> > In particular 'equivalent files' definition is
> > of no use to me, because pathname components
> > almost never refer to files.
>
> Please read the documentation.
I did. You didn't follow the full discussion.
> It doesn't talk about equivalent
> files, but about equivalent names.
yeah, and I had no idea what that meant.
So I asked and Xavier replied:
> Two names are equivalent if
> they refer to the same file.
and that characterisation is what I have been disputing.
--
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net
* Re: [Caml-list] unix.chop_extension
2004-05-28 19:34 ` skaller
@ 2004-05-29 8:37 ` Damien Doligez
2004-05-29 10:01 ` skaller
0 siblings, 1 reply; 37+ messages in thread
From: Damien Doligez @ 2004-05-29 8:37 UTC (permalink / raw)
To: caml-list
On May 28, 2004, at 21:34, skaller wrote:
>> Two names are equivalent if
>> they refer to the same file.
>
> and that characterisation is what I have been disputing.
More precisely, two names are equivalent if all file-system
operations give the same result and side effects when called
with either name.
-- Damien
* Re: [Caml-list] unix.chop_extension
2004-05-29 8:37 ` Damien Doligez
@ 2004-05-29 10:01 ` skaller
2004-05-29 16:02 ` David Brown
0 siblings, 1 reply; 37+ messages in thread
From: skaller @ 2004-05-29 10:01 UTC (permalink / raw)
To: Damien Doligez; +Cc: caml-list
On Sat, 2004-05-29 at 18:37, Damien Doligez wrote:
> More precisely, two names are equivalent if all file-system
> operations give the same result and side effects when called
> with either name.
Right: this is a much more complete definition.
However it is still wrong :)
For example: I have no write permission
on the file system, so I can't open_out any files.
It also isn't entirely clear how you would actually
*measure* 'same result' in all cases -- normally
this is obvious, but there will be nasty cases:
the problem is that it is the nasty cases that
are at issue here. opening x and ./x isn't
the same thing on Unix due to permissions distinctions?
My experience on ISO C++ committee suggests this:
The kind of definition you're giving is considered
*motivation*. We want to consider two filenames
equivalent 'roughly in the sense of the operational
behaviour on an actual file system'.
But whilst it provides motivation, such a definition
can't be used as a normative specification. It just
isn't abstract enough or independent enough of
vagaries of some actual file system.
So instead we're forced to define the semantics
entirely in terms of the actual string operations,
and instead of promising behaviour on an existing
file system, state it as an intended consequence
of the specification.
EXAMPLE: chop_extension doesn't do what I expect.
But there is no problem with it actually meeting
its specification. There is no bug in it.
However, the specification does not satisfy the
motivation so Xavier might actually change it.
I would argue the specification here is of the
correct kind, and if the function is changed,
the reasoning will be that the specification
doesn't match user expectations .. not that there
is a problem with deciding what the function actually
does, or whether the implementation is correct.
I know this sounds pedantic. However the definition
in terms of strings is actually more useful: for
example I can use the functions to manipulate
pathname components such that I have strings
which are never intended to correspond to filenames.
Of course, I will usually intend that *eventually*
I'm constructing filenames.
But even that isn't always the case: in flxcc program
I have to synthesise a module name from a filename.
The module name is closely related to the filename,
but isn't allowed to have non-identifier characters inside.
I translate weird characters such as '.' '/' etc to '_'.
For this purpose clearly:
stdio.h and ./stdio.h
are not equivalent. I've spent the last two days dealing
with this issue, which arose simply because I didn't
consider that concat (dirname x) (basename x) might
not produce a filename equal to x: clearly the
equivalence of names isn't enough for me here:
I need equality. I have had to fix this by throwing
out leading './' ..
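
A rough sketch of the workaround described here, not the actual flxcc code: strip any leading "./" and then map every non-identifier character to '_'. Both function names are hypothetical, and the set of characters allowed in an identifier is an assumption of this sketch.

```ocaml
(* Remove every leading "./" from a path; assumes Unix-style separators. *)
let rec strip_dot_slash s =
  let n = String.length s in
  if n >= 2 && s.[0] = '.' && s.[1] = '/'
  then strip_dot_slash (String.sub s 2 (n - 2))
  else s

(* Characters assumed to be legal inside a module name. *)
let is_ident_char c =
  match c with
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '_' -> true
  | _ -> false

(* Synthesise a module name: normalise the prefix, then replace odd characters. *)
let module_name_of_path path =
  String.map (fun c -> if is_ident_char c then c else '_') (strip_dot_slash path)
```

Without the strip_dot_slash step, "stdio.h" and "./stdio.h" would map to "stdio_h" and "__stdio_h" respectively, which is why equivalence of names is not enough for this use.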
I also have no idea if, for example dirname might
include a trailing '/' or not. Again, it matters,
because I need to do more processing than what Filename
module supports: I'm left with the choice of
(1) "do the whole thing yourself" or (2) "guess at what
strings Filename module actually produces" and
write some implementation dependent code.. :(
I'm doing (2) at the moment but will probably
move to (3) use Sylvain Le Gall's fileutils module.
--
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net
* Re: [Caml-list] unix.chop_extension
2004-05-29 10:01 ` skaller
@ 2004-05-29 16:02 ` David Brown
0 siblings, 0 replies; 37+ messages in thread
From: David Brown @ 2004-05-29 16:02 UTC (permalink / raw)
To: skaller; +Cc: Damien Doligez, caml-list
On Sat, May 29, 2004 at 08:01:10PM +1000, skaller wrote:
> It also isn't entirely clear how you would actually
> *measure* 'same result' in all cases -- normally
> this is obvious, but there will be nasty cases:
> the problem is that it is the nasty cases that
> are at issue here. opening x and ./x isn't
> the same thing on Unix due to permissions distinctions?
What nasty case are you thinking of? Opening both "x" and "./x" will
require search (x) permission on the current directory, and will either
both work, or both fail in the same situations.
However, that is about the only "equivalent" path that is really
equivalent. "foo/../x" and "x" may not be equivalent from a permission
perspective (or not at all, say if 'foo' is a symlink).
A good description might be: there are many paths that can possibly
refer to the same file. A string manipulation library routine can
certainly come up with pathnames that are potentially equivalent. To
determine real equivalence, something like 'realpath' should be used.
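
As an illustration of the purely lexical side of this, here is a sketch of a string-only normaliser. It removes "." components and repeated '/' but deliberately leaves ".." alone, since collapsing "foo/.." is only valid when 'foo' is not a symlink. The function name is hypothetical, the code assumes Unix-style paths and OCaml 4.04 or later (for String.split_on_char), and it never touches the file system, so it can only produce potentially equivalent names.

```ocaml
(* Lexical normalisation: drop "." and empty components, keep "..". *)
let normalize_lexically path =
  let absolute = String.length path > 0 && path.[0] = '/' in
  let parts =
    String.split_on_char '/' path
    |> List.filter (fun c -> c <> "" && c <> ".")
  in
  let body = String.concat "/" parts in
  if absolute then "/" ^ body       (* keep the leading '/' of absolute paths *)
  else if body = "" then "."        (* nothing left: the current directory *)
  else body
```

Two names with the same normal form are good candidates for equivalence; names with different normal forms, such as "foo/../x" and "x", still need a check against the real file system.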
Of course, according to the Linux manpage for realpath:
"Never use this function. It is broken by design since it is
impossible to determine a suitable size for the output buffer."
It wouldn't be very hard to implement realpath in ocaml, just using the
Unix module. It could even be done safely :-)
Dave
* Re: [Caml-list] unix.chop_extension
2004-05-26 9:05 ` Xavier Leroy
` (2 preceding siblings ...)
2004-05-26 10:34 ` skaller
@ 2004-05-26 11:21 ` Alex Baretta
2004-05-26 16:43 ` Richard Jones
3 siblings, 1 reply; 37+ messages in thread
From: Alex Baretta @ 2004-05-26 11:21 UTC (permalink / raw)
To: Xavier Leroy, Ocaml, pascky
Xavier Leroy wrote:
> Concerning the Str interface, I received several complaints about it,
> to which I replied "why don't you propose an alternate functional API?",
> to which I never got any reply. So, if you have ideas, go ahead. If
> you can get some peer reviewing on your design, that would be much better.
> Good APIs aren't easy to design (witness the lengthy discussion on I/O
> on this list), so it's unlikely that you can find a good one all by
> yourself in 30 minutes.
Actually, I'm a happy user of Str, but I find the absence in Ocaml of a
functional "canonical" regexp feature striking.
>>There's one extra thing I'd like to point out. We have the
>>chop_extension function. Why in the world is there no find_extension
>>function?
>>find_extension "foo.bar" --> "bar"
>
>
> Why in the world would that be generally useful? Remember, we don't
> shoot for completeness (it's unattainable anyway), just for
> usefulness.
find_extension would be no more and no less general than chop_extension.
It's actually the dual of the latter, in a way. Again, it's no big deal,
but it seems to me that the lack of a find_extension function in
Filename is an involuntary omission rather than a well thought out
design decision.
***
Are the following two pieces of code not dual of each other? Is this not
a good enough reason to include the former?
let file_extension name =
  try
    let index = String.rindex name '.' + 1 in
    let ext_len = String.length name - index in
    String.sub name index ext_len
  with Not_found ->
    invalid_arg "Xcaml.file_extension"

let chop_extension name =
  try
    String.sub name 0 (String.rindex name '.')
  with Not_found ->
    invalid_arg "Filename.chop_extension"
Duality is a property I consider of paramount importance in a formal
model. Ocaml is a modeling language for algorithms. Duality is central
to Ocaml, in my opinion. For this very same reason I once asked why such
a crucial operator as ++ had not been included in Pervasives.
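
One way to make the duality claim concrete is to state it as a checkable property: the chopped name, a dot, and the extension should recombine into the original name. The sketch below repeats the two definitions from this message so that it is self-contained; the property name recombines is made up for illustration.

```ocaml
(* The two functions as given in this message. *)
let file_extension name =
  try
    let index = String.rindex name '.' + 1 in
    let ext_len = String.length name - index in
    String.sub name index ext_len
  with Not_found ->
    invalid_arg "Xcaml.file_extension"

let chop_extension name =
  try
    String.sub name 0 (String.rindex name '.')
  with Not_found ->
    invalid_arg "Filename.chop_extension"

(* Duality as a property: the two halves, joined by ".", give back the input.
   Only meaningful for names that contain a '.'. *)
let recombines name =
  chop_extension name ^ "." ^ file_extension name = name
```

The property holds even for an awkward input such as "x.y/z", because both functions split at the same '.'; whether that split is the right one is a separate question.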
Alex
* Re: [Caml-list] unix.chop_extension
2004-05-26 11:21 ` Alex Baretta
@ 2004-05-26 16:43 ` Richard Jones
2004-05-27 4:48 ` skaller
` (2 more replies)
0 siblings, 3 replies; 37+ messages in thread
From: Richard Jones @ 2004-05-26 16:43 UTC (permalink / raw)
Cc: Ocaml
On Wed, May 26, 2004 at 01:21:01PM +0200, Alex Baretta wrote:
> Actually, I'm a happy user of Str, but I find the absence in Ocaml of a
> functional "canonical" regexp feature striking.
I'm fascinated to know what this "functional" API would look like. I
use Pcre and it doesn't appear to have any global internal state
AFAICS ...
Rich.
--
Richard Jones. http://www.annexia.org/ http://www.j-london.com/
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
MOD_CAML lets you run type-safe Objective CAML programs inside the Apache
webserver. http://www.merjis.com/developers/mod_caml/
* Re: [Caml-list] unix.chop_extension
2004-05-26 16:43 ` Richard Jones
@ 2004-05-27 4:48 ` skaller
2004-05-27 7:46 ` Markus Mottl
2004-05-27 17:29 ` brogoff
2004-05-28 11:23 ` Alex Baretta
2 siblings, 1 reply; 37+ messages in thread
From: skaller @ 2004-05-27 4:48 UTC (permalink / raw)
To: Richard Jones; +Cc: Ocaml
On Thu, 2004-05-27 at 02:43, Richard Jones wrote:
> On Wed, May 26, 2004 at 01:21:01PM +0200, Alex Baretta wrote:
> > Actually, I'm a happy user of Str, but I find the absence in Ocaml of a
> > functional "canonical" regexp feature striking.
>
> I'm fascinated to know what this "functional" API would look like. I
> use Pcre and it doesn't appear to have any global internal state
> AFAICS ...
AFAIK: The C Pcre that it wraps does use global variables
and so while the interface appears re-entrant
it isn't.
--
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net
* Re: [Caml-list] unix.chop_extension
2004-05-27 4:48 ` skaller
@ 2004-05-27 7:46 ` Markus Mottl
2004-05-27 9:33 ` skaller
0 siblings, 1 reply; 37+ messages in thread
From: Markus Mottl @ 2004-05-27 7:46 UTC (permalink / raw)
To: skaller; +Cc: Richard Jones, Ocaml
On Thu, 27 May 2004, skaller wrote:
> AFAIK: The C Pcre that it wraps does use global variables
> and so while the interface appears re-entrant
> it isn't.
To prevent people from getting a false impression of PCRE as interfaced
to OCaml: it _is_ safe. Calling PCRE-functions from threads or executing
two different regular expressions in an intermittent way should never
crash your program or lead to unexpected results.
The use of global variables by the C-library does not necessarily imply
that the program is unsafe. It all depends on their use, and in this
case the global variables (e.g. pcre_callout) are initialized exactly
once at startup time, i.e. before the user can access any functions.
Libraries implemented in imperative style may still have reentrant
interfaces.
--
Markus Mottl http://www.oefai.at/~markus markus@oefai.at
* Re: [Caml-list] unix.chop_extension
2004-05-27 7:46 ` Markus Mottl
@ 2004-05-27 9:33 ` skaller
0 siblings, 0 replies; 37+ messages in thread
From: skaller @ 2004-05-27 9:33 UTC (permalink / raw)
To: Markus Mottl; +Cc: Richard Jones, Ocaml
On Thu, 2004-05-27 at 17:46, Markus Mottl wrote:
> On Thu, 27 May 2004, skaller wrote:
> > AFAIK: The C Pcre that it wraps does use global variables
> > and so while the interface appears re-entrant
> > it isn't.
>
> To prevent people from getting a false impression of PCRE as interfaced
> to OCaml: it _is_ safe.
You are using the word 'safe' imprecisely.
Reentrant is specific. So is thread-safe.
Re-entrant + no shared data structures implies thread-safe.
Thread safe doesn't imply re-entrant. In particular,
re-entrant is a stronger condition in the sense that
it implies all callbacks including asynchronous invocations
from signal handlers will work: thread safe code may
deadlock here.
> The use of global variables by the C-library does not necessarily imply
> that the program is unsafe. It all depends on their use, and in this
> case the global variables (e.g. pcre_callout) are initialized exactly
> once at startup time, i.e. before the user can access any functions.
If this is so, the globals are merely constants, and the code
will be re-entrant after the initialisation is complete.
However my examination of the C Pcre at one time
showed this was NOT the case. Instead, certain
options were stored in global variables on every
call, and that implies the code isn't re-entrant
and cannot ever be made so.
I may have been wrong, and perhaps I was right but
Pcre has changed: I last looked some years ago
whilst working on Vyper ..
As Markus comments, use of imperative style is
not the issue here. A function can easily
modify variables and be re-entrant provided
the variables are (rooted) on the stack.
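
A toy sketch of the distinction, not taken from Pcre or Str: one search function records its result in a global, the other mutates only a cell that is local to the call. All names here are hypothetical.

```ocaml
(* Global-state style: the position of the last match lives in a global,
   so a nested or interleaved search overwrites it. *)
let last_match = ref (-1)

let search_global s c =
  (match String.index_opt s c with
   | Some i -> last_match := i
   | None -> last_match := -1);
  !last_match >= 0

let matched_pos () = !last_match

(* Re-entrant style: still imperative, but the mutable cell belongs to
   this call only, so concurrent or nested calls cannot interfere. *)
let search_local s c =
  let pos = ref (-1) in
  String.iteri (fun i ch -> if !pos < 0 && ch = c then pos := i) s;
  if !pos >= 0 then Some !pos else None

(* Two searches in a row, as might happen via a callback: the caller of the
   first search now sees the position found by the second one. *)
let interleaved () =
  let _ = search_global "a.b" '.' in
  let _ = search_global "wxyz.q" '.' in
  matched_pos ()
```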
--
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net
* Re: [Caml-list] unix.chop_extension
2004-05-26 16:43 ` Richard Jones
2004-05-27 4:48 ` skaller
@ 2004-05-27 17:29 ` brogoff
2004-05-28 12:00 ` Keith Wansbrough
2004-05-28 11:23 ` Alex Baretta
2 siblings, 1 reply; 37+ messages in thread
From: brogoff @ 2004-05-27 17:29 UTC (permalink / raw)
To: Ocaml
On Wed, 26 May 2004, Richard Jones wrote:
> On Wed, May 26, 2004 at 01:21:01PM +0200, Alex Baretta wrote:
> > Actually, I'm a happy user of Str, but I find the absence in Ocaml of a
> > functional "canonical" regexp feature striking.
>
> I'm fascinated to know what this "functional" API would look like. I
> use Pcre and it doesn't appear to have any global internal state
> AFAICS ...
As you note, the interface to Pcre, unlike Str, appears stateless.
If you examine some functional data structure libraries, like Martin Erwig's
FGL, or Gerard Huet's ZEN, you notice that they use the idea of "location +
context" (similar to the C++ STL, where the locations are iterators) to
navigate the structure. For string processing, the context is the entire string,
and the location is the pair of endpoints of the current string. So it seems
that the right way to a functional API for string processing would be in terms
of substrings, { base : string; left : int; right : int } or just
string * int * int, which is what is done in the SML Basis. The idea extends
naturally to regexp scanning, and it's easy to see how to modify the Str
interface to use substrings, eliminating Not_found exceptions by considering
an empty substring to be a failed match.
Of course, substrings make sense when the underlying string is array like, not
in FPLs like Haskell which use lists of chars as strings, but I consider that an
egregious error of Haskell.
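
A rough sketch of what such a substring type might look like in OCaml follows. The record fields are the ones named above; the half-open convention (left inclusive, right exclusive) and the function names of_string, to_string, is_empty, take_while and drop_while are my assumptions for illustration, not an existing library.

```ocaml
(* Hypothetical substring type: a view into [base] covering the
   half-open index range [left, right). Nothing is copied until
   [to_string] is called. *)
type substring = { base : string; left : int; right : int }

let of_string s = { base = s; left = 0; right = String.length s }
let to_string ss = String.sub ss.base ss.left (ss.right - ss.left)
let is_empty ss = ss.right <= ss.left

(* Longest run of characters satisfying [p] at the start of [ss]. *)
let take_while p ss =
  let rec go i =
    if i < ss.right && p ss.base.[i] then go (i + 1) else i
  in
  { ss with right = go ss.left }

(* What remains of [ss] after skipping the characters satisfying [p]. *)
let drop_while p ss = { ss with left = (take_while p ss).right }

(* Usage: split a line into a leading number and the rest. *)
let is_digit c = '0' <= c && c <= '9'
let line = of_string "2004 caml-list"
let year = take_while is_digit line
let rest = drop_while is_digit line
```

Every operation returns a new view and leaves its argument untouched, so there is no hidden "last match" state of the kind Str keeps.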
-- Brian
* Re: [Caml-list] unix.chop_extension
2004-05-27 17:29 ` brogoff
@ 2004-05-28 12:00 ` Keith Wansbrough
2004-05-28 16:43 ` brogoff
0 siblings, 1 reply; 37+ messages in thread
From: Keith Wansbrough @ 2004-05-28 12:00 UTC (permalink / raw)
To: brogoff; +Cc: caml-list
> interface to use substrings, eliminating Not_found exceptions by considering
> an empty substring to be a failed match.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It's not clear that this is always the right thing to do... \([0-9]*\)
can succeed but return an empty substring.
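
A small hand-written sketch of the problem, without using Str: match_digits_star and span_is_empty are hypothetical names, and the function only mimics what [0-9]* does when anchored at a given position.

```ocaml
(* Hand-written equivalent of matching [0-9]* anchored at [pos]
   (assumed to satisfy 0 <= pos <= String.length s). Returns the
   half-open span (start, stop) of the match. It never fails: zero
   digits is a perfectly good match. *)
let match_digits_star (s : string) (pos : int) : int * int =
  let n = String.length s in
  let rec go i =
    if i < n && s.[i] >= '0' && s.[i] <= '9' then go (i + 1) else i
  in
  (pos, go pos)

let span_is_empty (start, stop) = start = stop

let on_letters = match_digits_star "abc" 0    (* succeeds, empty span *)
let on_digits = match_digits_star "42abc" 0   (* succeeds, span of "42" *)
```

If an empty span meant "no match", the first result would be indistinguishable from a failure even though the pattern did match.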
--KW 8-)
* Re: [Caml-list] unix.chop_extension
2004-05-28 12:00 ` Keith Wansbrough
@ 2004-05-28 16:43 ` brogoff
2004-05-28 17:49 ` Marcin 'Qrczak' Kowalczyk
0 siblings, 1 reply; 37+ messages in thread
From: brogoff @ 2004-05-28 16:43 UTC (permalink / raw)
To: Keith Wansbrough; +Cc: caml-list
On Fri, 28 May 2004, Keith Wansbrough wrote:
> > interface to use substrings, eliminating Not_found exceptions by considering
> > an empty substring to be a failed match.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> It's not clear that this is always the right thing to do... \([0-9]*\)
> can succeed but return an empty substring.
Point taken. There's still a lot of "out of band" values that could be used to
represent failure, such as substrings with negative indices. And of course,
exceptions are perfectly fine for ML, though their functionalness is
arguable, and a Clean (or Haskell 2, where they'll hopefully fix this!)
substring library wouldn't have them.
-- Brian
* Re: [Caml-list] unix.chop_extension
2004-05-28 16:43 ` brogoff
@ 2004-05-28 17:49 ` Marcin 'Qrczak' Kowalczyk
0 siblings, 0 replies; 37+ messages in thread
From: Marcin 'Qrczak' Kowalczyk @ 2004-05-28 17:49 UTC (permalink / raw)
To: brogoff; +Cc: Keith Wansbrough, caml-list
In the message of Fri, 28-05-2004, at 09:43 -0700, brogoff wrote:
> Point taken. There's still a lot of "out of band" values that could be used to
> represent failure, such as substrings with negative indices.
I don't think this is a good idea, precisely because it's an "in band"
encoding. Match failure should be indicated either by None instead of
Some whatever, or by a predicate on the whole match object, or with an
exception - but not with a special substring.
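
For instance, the None/Some variant might look like the sketch below. The matcher is hand-written and hypothetical (match_version and describe are names made up for this example); it mimics the pattern v\([0-9]*\) anchored at the start of the string and returns the half-open span of the group.

```ocaml
(* Hypothetical matcher for v\([0-9]*\) anchored at the start of [s].
   None means the pattern did not match; Some (a, b) is the half-open
   span of the group, which may legitimately be empty. *)
let match_version (s : string) : (int * int) option =
  let n = String.length s in
  if n = 0 || s.[0] <> 'v' then None
  else begin
    let rec go i =
      if i < n && s.[i] >= '0' && s.[i] <= '9' then go (i + 1) else i
    in
    Some (1, go 1)
  end

(* The three outcomes stay distinct for the caller. *)
let describe (s : string) : string =
  match match_version s with
  | None -> "no match"
  | Some (a, b) when a = b -> "matched, empty group"
  | Some (a, b) -> "matched, group = " ^ String.sub s a (b - a)
```

Here "abc" fails, "v" matches with an empty group, and "v12x" matches with the group "12"; the type forces the caller to handle failure separately from an empty result.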
--
__("< Marcin Kowalczyk
\__/ qrczak@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/
* Re: [Caml-list] unix.chop_extension
2004-05-26 16:43 ` Richard Jones
2004-05-27 4:48 ` skaller
2004-05-27 17:29 ` brogoff
@ 2004-05-28 11:23 ` Alex Baretta
2 siblings, 0 replies; 37+ messages in thread
From: Alex Baretta @ 2004-05-28 11:23 UTC (permalink / raw)
To: Richard Jones, caml >> Ocaml
Richard Jones wrote:
> On Wed, May 26, 2004 at 01:21:01PM +0200, Alex Baretta wrote:
>
>>Actually, I'm a happy user of Str, but I find the absence in Ocaml of a
>>functional "canonical" regexp feature striking.
>
>
> I'm fascinated to know what this "functional" API would look like. I
> use Pcre and it doesn't appear to have any global internal state
> AFAICS ...
>
> Rich.
>
I simply don't know PCRE well enough to discuss it. I'm not arguing in
favor or against any specific API. I just observed that it is striking
that the support available in the core ocaml distribution has a
procedural API rather than a functional one. This does not imply that
Str is any worse or better than PCRE.
Alex