caml-list - the Caml user's mailing list
From: Goswin von Brederlow <goswin-v-b@web.de>
To: caml-list@inria.fr
Subject: Re: [Caml-list] Asynchronous IO programming in OCaml
Date: Wed, 27 Oct 2010 15:43:37 +0200
Message-ID: <877hh33g52.fsf@frosties.localdomain>
In-Reply-To: <20101027111835.GA5664@gaia> ("Jeremie Dimino"'s message of "Wed, 27 Oct 2010 13:18:35 +0200")

Jérémie Dimino <jeremie@dimino.org> writes:

> On Wed, Oct 27, 2010 at 11:33:51AM +0200, Goswin von Brederlow wrote:
>> You aren't doing any multithreading. You are creating a thread and
>> waiting for the thread to finish its read before starting a second.
>> There are never ever 2 reads running in parallel. So all you do is add
>> thread creation and destruction for every read to your first example.
>
> Yes, I know that. The idea was just to show the overhead of context
> switches.

But then you tune your benchmark to give you the result you want instead
of benchmarking what normally happens.

You already pay for a context switch when a read (or any other blocking
syscall) returns. Switching to a thread that waits on it costs the same
as switching back to the main thread, at least when the call actually
blocked.

>> You should start multiple threads and let them read from different
>> offsets (use pread) and only once they are all started join them all
>> again.
>
> Sure, but doing this directly in Lwt raises other problems:
>
> - This means prefetching a large part of the file into the program
>   memory.

Yes, totally. But if you want to do work asynchronously then you need to
spend the memory to keep the data for multiple jobs in memory.
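
As a rough sketch of what I mean by starting all the reads before
joining any of them, here it is in C with POSIX threads, since pread is
a C-level call. The names read_job, read_job_run and parallel_read are
made up for this example, and the fixed equal-sized chunks are an
assumption; this is not what Lwt does internally.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* One outstanding read; a hypothetical struct for this sketch. */
struct read_job {
    int     fd;
    off_t   offset;
    size_t  len;
    char   *buf;
    ssize_t result;   /* bytes read, or -1 on error */
};

static void *read_job_run(void *arg)
{
    struct read_job *job = arg;
    /* pread takes an explicit offset, so threads sharing one fd
       do not race on the file position. */
    job->result = pread(job->fd, job->buf, job->len, job->offset);
    return NULL;
}

/* Start njobs (> 0) reads of chunk bytes each, and only then join them.
   buf must hold njobs * chunk bytes.
   Returns the total number of bytes read, or -1 on any failure. */
ssize_t parallel_read(int fd, char *buf, size_t chunk, int njobs)
{
    pthread_t *tids = malloc((size_t)njobs * sizeof *tids);
    struct read_job *jobs = malloc((size_t)njobs * sizeof *jobs);
    ssize_t total = 0;
    int started = 0, failed = 0;

    if (!tids || !jobs) {
        free(tids);
        free(jobs);
        return -1;
    }

    /* Phase 1: start every read; none is waited for yet. */
    for (int i = 0; i < njobs; i++) {
        jobs[i].fd = fd;
        jobs[i].offset = (off_t)i * (off_t)chunk;
        jobs[i].len = chunk;
        jobs[i].buf = buf + (size_t)i * chunk;
        jobs[i].result = -1;
        if (pthread_create(&tids[i], NULL, read_job_run, &jobs[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }

    /* Phase 2: join all started threads and collect the results. */
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        if (jobs[i].result < 0)
            failed = 1;
        else
            total += jobs[i].result;
    }

    free(tids);
    free(jobs);
    return failed ? -1 : total;
}
```

Note that the buffer has to hold all njobs chunks at once, which is
exactly the memory cost discussed above.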

> - The kernel already prefetches files from the disk, so most of the time
>   this is just a memory copy in parallel...

That only works for sequential reads on a very limited number of files
(if more than one at all). If you have 1000+ clients requesting files
from a webserver, for example, they will never be covered by the
kernel's read-ahead.

And don't forget: multi-core systems are becoming more and more
widespread. What is wrong with 2 cores doing memcpy twice as fast?

> - How many threads do we launch in parallel ? This depends on the
>   overhead of context switching between threads, which can not be
>   determined at runtime.
>
> I think that the solution with mincore + mmap is better because it uses
> threads only when really needed.
>
> Jérémie

For reads I have to agree with you there. You can only hide the cost of
a thread when the read blocks. By using mincore first to test whether a
read would block, you avoid context switches where they aren't free.
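
A minimal sketch of such a residency test, assuming Linux/glibc
mincore(2) and a mapping whose base address is page-aligned (as mmap
returns it). The function name range_would_block is made up for this
example. The answer is only a hint: a page can be evicted between the
check and the actual access.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if any page covering [base+off, base+off+len) is not
   resident in memory (so touching it may block on disk I/O),
   0 if all pages are resident, and -1 on error.
   base must be the page-aligned start of a mapping; len must be > 0. */
int range_would_block(void *base, size_t off, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t start = off - off % page;          /* round down to a page */
    size_t span = off + len - start;          /* bytes from start to end */
    size_t npages = (span + page - 1) / page; /* round up to whole pages */
    unsigned char *vec = malloc(npages);
    int blocks = 0;

    if (!vec)
        return -1;
    if (mincore((char *)base + start, span, vec) != 0) {
        free(vec);
        return -1;
    }
    /* The low bit of each vector entry is set if the page is resident. */
    for (size_t i = 0; i < npages; i++) {
        if (!(vec[i] & 1)) {
            blocks = 1;
            break;
        }
    }
    free(vec);
    return blocks;
}
```

A caller would copy straight out of the mapping when this returns 0 and
hand the job to a helper thread only when it returns 1.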

Unfortunately you can't use mmap() for writes (efficiently). Writing to
a mmap()ed file will first read in the old data from disk, then
overwrite the memory, and some time later write it back to disk. There
seems to be no way to tell the kernel to skip reading in a page on first
access because one is going to completely overwrite it anyway.


Do you have a different solution for writes that will avoid threads when
a write won't block? And how do you restart jobs once the data has
actually been committed to disk? (i.e. how do you do fsync()?)

MfG
        Goswin



