caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: David Chase <chase@world.std.com>
To: caml-list@inria.fr
Subject: Re: [Caml-list] [RANT] String representation (was: Strings as arrays or lists...)
Date: Tue, 04 Mar 2003 08:49:52 -0500	[thread overview]
Message-ID: <5.2.0.9.0.20030304080202.03220958@pop.theWorld.com> (raw)
In-Reply-To: <Pine.A41.4.44.0303041312560.4431978-100000@ibm1.cicrp.juss ieu.fr>

At 01:43 PM 3/4/2003 +0100, Diego Olivier Fernandez Pons wrote:
>    Bonjour,
>
>Some of the features you wish are 'not so hard to implement', at least
>if you already have 'conceptually bugged low-level' strings :

I hope you aren't proposing that everyone implement the "String" data
structure that they wish they really had.  Strings are too basic, and
too common, to be left to this.

Consider Java's approach of String and StringBuffer.  My main gripe
(after years of use) is that too many functions should take a StringBuffer
(which should probably be an abstract type, though there is indeed some
loss of efficiency) and return a StringBuffer, rather than simply returning
a String.  From an optimization point-of-view, the most useful thing
for performance would be to find a way to have an abstract type with
non-abstract synchronization.  All StringBuffer ops are synchronized,
and probably should be, but a compiler and idioms can help IF THE
SYNCHRONIZATION IS EXPOSED.  That is, in Java, this code

  StringBuffer sb ...
  synchronized (sb) { // I know that sb is thread-private
     while (...) {
        sb.append(...);
     }
  }

is likely to run faster than it would without the outer synchronization
because the JIT compiler can see that the (non-abstract, exposed, final)
StringBuffer synchronizations are redundant.

On-the-other-hand, multithreaded programming appears to be so hard
for the typical and even good-typical programmer, that I'd be interested
in seeing if there were better ways to do it (say, threads communicating
only through queues).

>> - Arrays need fast random access by numeric index, strings do not
>> need that.
>
>You can easily hava a (log n/k + 1) acces where n is the total size of
>the string and k is the size of each bucket (if you choose a data
>structure with constant buffer size)

That will be noticeably slower.  In the Java compiler that I worked
on, the String operations were specially recognized by the compiler
so that it could infer success of string bounds checks, and that
"minor" optimization was noticeably beneficial.  On the other hand,
with some cleverness and a biggish K, you might be able to get most
of the benefit simply from the "bounds check" (as in, small strings
fit in their first and only bucket, illegal indices necessarily also
overflow off the end of the first bucket).

David Chase


-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


  parent reply	other threads:[~2003-03-04 13:49 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-03-03 18:28 [oliver: Re: [Caml-list] Strings as arrays or lists...] Oliver Bandel
2003-03-03 20:10 ` brogoff
2003-03-03 21:05   ` William Lovas
2003-03-03 21:32     ` Basile STARYNKEVITCH
2003-03-03 22:10     ` [Caml-list] [RANT] String representation (was: Strings as arrays or lists...) Nicolas George
2003-03-04 12:43       ` Diego Olivier Fernandez Pons
2003-03-04 16:14         ` William D. Neumann
2003-03-04 18:38           ` Xavier Leroy
2003-03-04 18:50             ` William D. Neumann
2003-03-04 19:01         ` Nicolas George
     [not found]       ` <Pine.A41.4.44.0303041312560.4431978-100000@ibm1.cicrp.juss ieu.fr>
2003-03-04 13:49         ` David Chase [this message]
2003-03-04  0:20     ` [oliver: Re: [Caml-list] Strings as arrays or lists...] Issac Trotts
2003-03-04  0:24       ` Alain.Frisch
2003-03-04  1:06         ` Issac Trotts
2003-03-04  0:39       ` Olivier Andrieu
2003-03-04  0:39     ` brogoff
2003-03-03 21:40   ` [Caml-list] extensional polymorphism james woodyatt
2003-03-04  1:10     ` brogoff
2003-03-04  2:04       ` james woodyatt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5.2.0.9.0.20030304080202.03220958@pop.theWorld.com \
    --to=chase@world.std.com \
    --cc=caml-list@inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).