Gnus development mailing list
From: Steinar Bang <sb@dod.no>
To: ding@gnus.org
Subject: Re: url-retrieve parallelism
Date: Sun, 19 Dec 2010 09:32:27 +0100
Message-ID: <87r5dew4bo.fsf@dod.no>
In-Reply-To: <m3d3oyh9oh.fsf@quimbies.gnus.org>

>>>>> Lars Magne Ingebrigtsen <larsi@gnus.org>:

> shr (and gnus-html, I guess) fire off a call to `url-retrieve' for every
> <img> it finds.  If a HTML message has 1000 <img>s, then Emacs is going
> to do a DOS of the poor image web server.

> We obviously want to have more than a single `url-retrieve' call going
> at once,

If they are all going to the same server, I think the network-friendly
thing is to pipeline them over a single HTTP connection.  I.e. fire off
all the GET requests for the images without waiting for a response, and
handle the responses as they arrive:
  http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.1.2.2
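
A minimal sketch of what "fire off all GET requests" means on the wire,
in Python rather than Emacs Lisp, and with no network I/O.  The function
name, host and paths are made up for illustration.  The sketch only
builds the bytes a client would write to the one connection before
reading anything back.  Per the RFC section above, the server must send
its responses in the same order the requests were received.

```python
def build_pipelined_requests(host, paths):
    """Return the bytes for one GET per path, concatenated back to back.

    Hypothetical helper: a real client would also need to handle servers
    that close the connection early or do not support pipelining.
    """
    requests = []
    for path in paths:
        requests.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "\r\n"
        )
    # All requests go out in one write, before any response is read.
    return "".join(requests).encode("ascii")


payload = build_pipelined_requests(
    "images.example.org", ["/a.png", "/b.png", "/c.png"]
)
```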

That means that there needs to be something that listens to the incoming
byte stream and identifies what is a response header and routes it to
its handlers.
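
Roughly, the "something that listens" could look like the sketch below.
It is not how url-retrieve, libwww or curl do it.  It assumes every
response carries a Content-Length header (no chunked encoding), and the
function name and sample data are invented for the example.

```python
def split_responses(stream):
    """Split concatenated HTTP responses into (status, headers, body) tuples.

    Assumption: each response has a Content-Length header.  Incomplete
    trailing data is left unparsed.
    """
    responses = []
    pos = 0
    while pos < len(stream):
        head_end = stream.find(b"\r\n\r\n", pos)
        if head_end == -1:
            break  # header not complete yet; a real reader would wait for more bytes
        head = stream[pos:head_end].decode("iso-8859-1")
        status_line, *header_lines = head.split("\r\n")
        headers = {}
        for line in header_lines:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        length = int(headers.get("content-length", "0"))
        body_start = head_end + 4
        body_end = body_start + length
        if body_end > len(stream):
            break  # body not complete yet
        responses.append((status_line, headers, stream[body_start:body_end]))
        pos = body_end  # the next response starts right after this body
    return responses


sample = (
    b"HTTP/1.1 200 OK\r\nContent-Type: image/png\r\nContent-Length: 4\r\n\r\nPNG1"
    b"HTTP/1.1 404 Not Found\r\nContent-Type: text/html\r\nContent-Length: 9\r\n\r\nnot found"
)
parsed = split_responses(sample)
```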

The old w3c libwww did that kind of thing, and curl does as well AFAIK.
I have no idea what (url-retrieve) does.  Hm... it has a callback
argument...?  No meaningful Google matches on "url-retrieve pipeline"
though...

Re: callbacks, the way libwww did it was to let the object representing
the request live in the system, and when a response corresponding to the
request returned, the request header object was linked to the response
header object.  The MIME type of the response was used to select a
handler.  I.e. it wasn't the request that determined what handler should
handle the response.  But the request object provided context, typically
so that the handler would know where to put its handled results.
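
A sketch of that dispatch idea, not libwww's actual API: all the names
(Request, HANDLERS, dispatch) are hypothetical.  Pending requests sit in
a FIFO queue because pipelined responses come back in request order.
The handler is picked from the response's Content-Type, and the request
only supplies the context that says where the result goes.

```python
from collections import deque


class Request:
    """Hypothetical request object that outlives the send."""

    def __init__(self, url, context):
        self.url = url
        self.context = context  # where the handler should put its result


def handle_image(request, body):
    request.context["image"] = body


def handle_html(request, body):
    request.context["error_page"] = body.decode("utf-8", "replace")


# Handlers are keyed by the MIME type of the response, not by the request.
HANDLERS = {"image/png": handle_image, "text/html": handle_html}


def dispatch(pending, headers, body):
    """Pair the next response with the oldest pending request and handle it."""
    request = pending.popleft()
    mime = headers.get("content-type", "").split(";")[0].strip().lower()
    handler = HANDLERS.get(mime)
    if handler is None:
        request.context["unhandled"] = mime
    else:
        handler(request, body)
    return request


pending = deque([Request("/a.png", {}), Request("/b.png", {})])
first = dispatch(pending, {"content-type": "image/png"}, b"PNG1")
second = dispatch(pending, {"content-type": "text/html; charset=utf-8"}, b"not found")
```

Both requests asked for an image, but the second response is an HTML
page, so it ends up with the HTML handler instead of being fed to image
code.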

I think where I'm going with this is that just providing a callback
sounds too simple.  It's OK if you get what you asked for, but not if
what you get back is something other than what you asked for.



