caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Fwd: Re: [Caml-list] The boon of static type checking
@ 2005-02-07  2:24 Jon Harrop
  2005-02-07  2:55 ` skaller
  2005-02-10  2:10 ` String to list to string Juancarlo Añez
  0 siblings, 2 replies; 22+ messages in thread
From: Jon Harrop @ 2005-02-07  2:24 UTC (permalink / raw)
  To: caml-list

On Sunday 06 February 2005 22:26, Radu Grigore wrote:
> By "fork lineages" you mean keeping "old" versions of the map around?

Exactly, "persistence".

> > I would like to think that your stack will be big enough to fit 40
> > recursive calls if you expect your computer to handle trillion-element
> > maps!
>
> In the worst case (most unbalanced) 40 corresponds to a few million
> keys, not trillions. If I haven't made a mistake the minimum number of
> keys in a (OCaml) Map with height h is given by the recurrence:
> T(h)=1+T(h-1)+T(h-3).

I think you are right. Regardless, your computer is unlikely to struggle with 
80 recursive function calls.

> > > Yet another problem is the missing cardinal function.
> >
> > An arbitrary number of functions are "missing". You can't expect INRIA to
> > implement all infinity of them for you! :-)
>
> Considering the fact that the code in set.ml and map.ml looks like it
> is copy&pasted I guess asking for some consistency in the interface is
> not unreasonable. In any case copy&pasting the cardinal function
> wouldn't provide better performance (I think).

Yes, Set's cardinal looks O(n) to me. So you're asking for an equally 
inefficient cardinal function in Map? ;-)

Personally, I'd like to see a core balanced binary tree implementation with
various different set and map (and other) data structures built from it.

> > The STL will probably take <<1ms because the STL's size() is probably
> > O(1) whereas Map's fold is O(n). In contrast, forking lineages is
> > probably O(n) with the STL (requiring a copy) and O(1) with ocaml's Map.
>
> I haven't actually tested with STL but the implementation of size() is
> indeed a simple return. Do you have an example where forking lineages
> is useful?

I often exploit persistence when writing interpreters in a functional style,
to handle bindings in an inner scope. Note that more intelligent people use
an imperative style and the seemingly-quirky Hashtbl for this...

> As I said, I don't doubt there are situations where it is
> useful; just that right now I'm having trouble coming up with a
> realistic example.

I can't think of any other places I've used persistence...

> I am exploring possibilities right now. My only "complaint" is that
> switching from Hashtbl to Map for a dictionary of about 200000 keys
> decreases the performance a LOT! I would have expected a factor of
> 20-30 but it seems to be a lot more: probably due to memory management
> (as Gerd Stolpmann notices) due to the functional nature of Map.

Now that I've had a little more time to think about it, I can think of three
possible causes of this poor performance. The most likely is the O(ln n)
(unnecessary) string comparisons it will be doing (particularly if your
strings are similar "lexicographically"), compared to a single hash. Next is
the time spent in allocate and GC, as someone else said. Third is cache
incoherence, as I said before.

You could profile your code to work out which but you might as well skip this
and just use a more appropriate data structure. Or use Hashtbl and sort, as
Gerd says. If you really want to stick with Map then you could also store a
hash along with each string, so the comparison acts on the hash before it
acts on the string. This should greatly speed up the comparisons (oh, and
write your own comparison function, using String.compare and not
Pervasives.compare) and, I think, the whole program.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.


^ permalink raw reply	[flat|nested] 22+ messages in thread
* RE: [Caml-list] String to list to string
@ 2005-02-10 18:35 Harrison, John R
  2005-02-10 19:28 ` Jon Harrop
  2005-02-10 19:32 ` brogoff
  0 siblings, 2 replies; 22+ messages in thread
From: Harrison, John R @ 2005-02-10 18:35 UTC (permalink / raw)
  To: brogoff, caml-list; +Cc: Harrison, John R

| > Haskell treats strings as lists of chars by default.
|
| Just goes to show you that even really smart people can do some
amazingly
| dumb things.

That's far too categorical; opinions differ on this subject. I'd be
quite
happy with strings as lists of characters, and I consider OCaml's
mutable
strings a far worse design error.

And yes, I use "implode" and "explode" all the time, because lists of
characters are often simple and convenient to deal with in a functional
style, and string manipulations are almost never performance-critical
in the kind of code I write.

John.


^ permalink raw reply	[flat|nested] 22+ messages in thread
* RE: [Caml-list] String to list to string
@ 2005-02-10 19:51 Harrison, John R
  0 siblings, 0 replies; 22+ messages in thread
From: Harrison, John R @ 2005-02-10 19:51 UTC (permalink / raw)
  To: Jon Harrop, caml-list; +Cc: Harrison, John R

| > ...I consider OCaml's mutable strings a far worse design error.
| 
| That's interesting. Why?

Because it destroys persistence, one of the most fundamental advantages
of functional programming. Any function I call with a string argument
can
choose to modify it, and cause side-effects elsewhere.

Even Java has immutable strings, doesn't it? And I notice this feature
of OCaml is fixed in F#. All for very sound reasons, in my opinion.

John.


^ permalink raw reply	[flat|nested] 22+ messages in thread
* RE: [Caml-list] String to list to string
@ 2005-02-10 21:18 Harrison, John R
  0 siblings, 0 replies; 22+ messages in thread
From: Harrison, John R @ 2005-02-10 21:18 UTC (permalink / raw)
  To: brogoff; +Cc: caml-list, Harrison, John R

| My preference for functional string processing is, as I said,
substrings.
| They are convenient, and encapsulate the index manipulations in the
data
| type. I'm almost certain you've seen them, what's your issue?

None whatsoever. I'm perfectly willing to believe that some variant
of character arrays is the best string representation. (Provided
they are immutable!)

But for the relatively trivial string manipulations that I (and
quite possibly many other OCaml users) care about, lists of
characters are very convenient and perfectly adequate. So I don't
feel your hyperbolic dismissals of such a representation as
"dumb" and "ridiculous" are justified. You could say that with
equal force about any use of lists instead of arrays, and I'm
sure there are plenty of programmers who would.

John.


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2005-02-15 13:34 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-02-07  2:24 Fwd: Re: [Caml-list] The boon of static type checking Jon Harrop
2005-02-07  2:55 ` skaller
2005-02-10  2:10 ` String to list to string Juancarlo Añez
2005-02-10  2:27   ` [Caml-list] " William D.Neumann
2005-02-10  3:24     ` Erik de Castro Lopo
2005-02-10  6:31       ` Radu Grigore
2005-02-10  6:52         ` Erik de Castro Lopo
2005-02-10  3:41   ` Jon Harrop
2005-02-15  1:16     ` Aaron Bohannon
2005-02-15 10:33       ` Richard Jones
2005-02-15 13:34         ` Eric C. Cooper
2005-02-10 10:09   ` Richard Jones
2005-02-10 19:19     ` Juancarlo Añez
     [not found]     ` <E1CzJqb-00031c-00@furbychan.cocan.org>
2005-02-10 19:41       ` Richard Jones
2005-02-10 17:58   ` brogoff
2005-02-10 18:35 Harrison, John R
2005-02-10 19:28 ` Jon Harrop
2005-02-11  1:22   ` skaller
2005-02-11  2:05     ` John Prevost
2005-02-10 19:32 ` brogoff
2005-02-10 19:51 Harrison, John R
2005-02-10 21:18 Harrison, John R

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).