Gnus development mailing list
* Asynchroneous image retrieval in HTML rendering
@ 2011-04-23 22:18 Antoine Levitt
  2011-04-28 20:03 ` Antoine Levitt
  2011-04-29  9:04 ` Julien Danjou
  0 siblings, 2 replies; 47+ messages in thread
From: Antoine Levitt @ 2011-04-23 22:18 UTC (permalink / raw)
  To: ding

Right now, with mm-text-html-renderer set to 'shr or 'gnus-w3m, loading
images on slow servers takes ages, and hangs emacs (for instance, check
out gwene.com.wordpress.terrytao, which has latex code rendered into
lots of small images). 'w3m (with emacs-w3m installed) results in a nice
smooth display, and does not hang emacs.

Is there a setting I missed somewhere? If not, how feasible is it to add
asynchronous image retrieval to shr? Sorry if the subject has already
come up, but I didn't find anything about it.





* Re: Asynchroneous image retrieval in HTML rendering
  2011-04-23 22:18 Asynchroneous image retrieval in HTML rendering Antoine Levitt
@ 2011-04-28 20:03 ` Antoine Levitt
  2011-04-29  9:04 ` Julien Danjou
  1 sibling, 0 replies; 47+ messages in thread
From: Antoine Levitt @ 2011-04-28 20:03 UTC (permalink / raw)
  To: ding

24/04/11 00:18, Antoine Levitt
> Right now, with mm-text-html-renderer set to 'shr or 'gnus-w3m, loading
> images on slow servers takes ages, and hangs emacs (for instance, check
> out gwene.com.wordpress.terrytao, which has latex code rendered into
> lots of small images). 'w3m (with emacs-w3m installed) results in a nice
> smooth display, and does not hang emacs.
>
> Is there a setting I missed somewhere? If not, how feasible is it to add
> asynchronous image retrieval to shr? Sorry if the subject has already
> come up, but I didn't find anything about it.

Alright, so this doesn't seem to generate a lot of enthusiasm. If it's
too hard to implement full asynchronous retrieval, is it possible to
block images declared as being of size 1x1, which are heavily used as
trackers in RSS feeds (for instance, gwene.uk.co.guardian)?  It's a
cheap hack, unlikely to break anything, and it'd save some nasty freezes
on slow connections (the combination of prefetching, lots of images and
slow DNS lookup is particularly annoying, causing long uninterruptible
freezes).
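
For illustration only, here is a minimal sketch of the 1x1 filter
described above. It is written in Python rather than Emacs Lisp and is
not how shr works; the names ImageCollector and images_to_fetch are
hypothetical, and it only looks at the width and height attributes the
HTML declares.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect <img> sources, setting aside images declared as 1x1."""

    def __init__(self):
        super().__init__()
        self.wanted = []   # image URLs worth fetching
        self.skipped = []  # declared 1x1 images, most likely trackers

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if not src:
            return
        declared_size = (attributes.get("width"), attributes.get("height"))
        if declared_size == ("1", "1"):
            self.skipped.append(src)
        else:
            self.wanted.append(src)


def images_to_fetch(html):
    """Return (wanted, skipped) lists of image URLs found in HTML."""
    collector = ImageCollector()
    collector.feed(html)
    collector.close()
    return collector.wanted, collector.skipped
```

An image that declares no size at all is still fetched, so the filter
only ever drops images that announce themselves as a single pixel.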





* Re: Asynchroneous image retrieval in HTML rendering
  2011-04-23 22:18 Asynchroneous image retrieval in HTML rendering Antoine Levitt
  2011-04-28 20:03 ` Antoine Levitt
@ 2011-04-29  9:04 ` Julien Danjou
  2011-04-29  9:18   ` Antoine Levitt
  1 sibling, 1 reply; 47+ messages in thread
From: Julien Danjou @ 2011-04-29  9:04 UTC (permalink / raw)
  To: Antoine Levitt; +Cc: ding

On Sun, Apr 24 2011, Antoine Levitt wrote:

> Right now, with mm-text-html-renderer set to 'shr or 'gnus-w3m, loading
> images on slow servers takes ages, and hangs emacs (for instance, check
> out gwene.com.wordpress.terrytao, which has latex code rendered into
> lots of small images). 'w3m (with emacs-w3m installed) results in a nice
> smooth display, and does not hang emacs.
>
> Is there a setting I missed somewhere? If not, how feasible is it to add
> asynchronous image retrieval to shr? Sorry if the subject has already
> come up, but I didn't find anything about it.

AFAICT, the image downloading is already asynchronous.

-- 
Julien Danjou
❱ http://julien.danjou.info


* Re: Asynchroneous image retrieval in HTML rendering
  2011-04-29  9:04 ` Julien Danjou
@ 2011-04-29  9:18   ` Antoine Levitt
  2011-04-29 10:01     ` Julien Danjou
  0 siblings, 1 reply; 47+ messages in thread
From: Antoine Levitt @ 2011-04-29  9:18 UTC (permalink / raw)
  To: ding

29/04/11 11:04, Julien Danjou
> On Sun, Apr 24 2011, Antoine Levitt wrote:
>
>> Right now, with mm-text-html-renderer set to 'shr or 'gnus-w3m, loading
>> images on slow servers takes ages, and hangs emacs (for instance, check
>> out gwene.com.wordpress.terrytao, which has latex code rendered into
>> lots of small images). 'w3m (with emacs-w3m installed) results in a nice
>> smooth display, and does not hang emacs.
>>
>> Is there a setting I missed somewhere? If not, how feasible is it to add
>> asynchronous image retrieval to shr? Sorry if the subject has already
>> come up, but I didn't find anything about it.
>
> AFAICT, the image downloading is already asynchronous.

Really? I do see freezes that look like they are related to image
downloading. Can you try rendering gwene.com.wordpress.terrytao for
instance? Here, it freezes emacs for a few seconds, even on a fast
connection, and hitting C-g with toggle-debug-on-quit enabled gives a
backtrace ending in url-retrieve.




* Re: Asynchroneous image retrieval in HTML rendering
  2011-04-29  9:18   ` Antoine Levitt
@ 2011-04-29 10:01     ` Julien Danjou
  2011-04-29 20:34       ` Antoine Levitt
  0 siblings, 1 reply; 47+ messages in thread
From: Julien Danjou @ 2011-04-29 10:01 UTC (permalink / raw)
  To: Antoine Levitt; +Cc: ding

On Fri, Apr 29 2011, Antoine Levitt wrote:

> Really? I do see freezes that look like they are related to image
> downloading. Can you try rendering gwene.com.wordpress.terrytao for
> instance? Here, it freezes emacs for a few seconds, even on a fast
> connection, and C-ging it with toggle-debug-on-quit gives a backtrace to
> url-retrieve.

I don't use NNTP, so I can't test, but what I can say is that most of
the freezes I'm experiencing are due to slow/no DNS resolution or
connection. It seems to me that resolving/connecting is blocking Emacs,
but not the data downloading. That's something Lars proposed to tackle
a while back on emacs-devel IIRC, btw.

-- 
Julien Danjou
❱ http://julien.danjou.info


* Re: Asynchroneous image retrieval in HTML rendering
  2011-04-29 10:01     ` Julien Danjou
@ 2011-04-29 20:34       ` Antoine Levitt
  2011-05-01 15:13         ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 47+ messages in thread
From: Antoine Levitt @ 2011-04-29 20:34 UTC (permalink / raw)
  To: ding

29/04/11 12:01, Julien Danjou
> On Fri, Apr 29 2011, Antoine Levitt wrote:
>
>> Really? I do see freezes that look like they are related to image
>> downloading. Can you try rendering gwene.com.wordpress.terrytao for
>> instance? Here, it freezes emacs for a few seconds, even on a fast
>> connection, and C-ging it with toggle-debug-on-quit gives a backtrace to
>> url-retrieve.
>
> I don't use NNTP, so I can't test, but what I can say is that most of
> the freezes I'm experiencing are due to slow/no DNS resolution or
> connection. It seems to me that resolving/connecting is blocking Emacs,
> but not the data downloading. That's something Lars proposed to tackle
> a while back on emacs-devel IIRC, btw.

It seems likely this is a DNS problem, yes.

Alright, I'm going to adopt gnus rule number 1 then: wait for Lars to
fix things :)





* Re: Asynchroneous image retrieval in HTML rendering
  2011-04-29 20:34       ` Antoine Levitt
@ 2011-05-01 15:13         ` Lars Magne Ingebrigtsen
  2011-05-01 15:23           ` Antoine Levitt
  0 siblings, 1 reply; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-01 15:13 UTC (permalink / raw)
  To: ding

Antoine Levitt <antoine.levitt@gmail.com> writes:

> It seems likely this is a DNS problem, yes.
>
> Alright, I'm going to adopt gnus rule number 1 then: wait for Lars to
> fix things :)

:-)

I think the fix here might be to introduce a new variable
`url-asynchronous-dns-lookup' (or something), and then have shr bind it
to make url.el more asynchronous.  The pauses we experience today when
trying to do asynchronous URL retrieval aren't acceptable.  Especially
if a domain is down or semi-down -- reading the articles becomes almost
impossible.

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-01 15:13         ` Lars Magne Ingebrigtsen
@ 2011-05-01 15:23           ` Antoine Levitt
  2011-05-01 15:42             ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 47+ messages in thread
From: Antoine Levitt @ 2011-05-01 15:23 UTC (permalink / raw)
  To: ding

01/05/11 17:13, Lars Magne Ingebrigtsen
> Antoine Levitt <antoine.levitt@gmail.com> writes:
>
>> It seems likely this is a DNS problem, yes.
>>
>> Alright, I'm going to adopt gnus rule number 1 then: wait for Lars to
>> fix things :)
>
> :-)
>
> I think the fix here might be to introduce a new variable
> `url-asynchronous-dns-lookup' (or something), and then have shr bind it
> to make url.el more asynchronous.  The pauses we experience today when
> trying to do asynchronous URL retrieval aren't acceptable.  Especially
> if a domain is down or semi-down -- reading the articles becomes almost
> impossible.

Why is DNS even synchronous in the first place? I'd expect
"asynchronous" to mean that both DNS lookup and data transfer are
non-blocking.





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-01 15:23           ` Antoine Levitt
@ 2011-05-01 15:42             ` Lars Magne Ingebrigtsen
  2011-05-01 16:06               ` Lars Magne Ingebrigtsen
  2011-05-02  7:52               ` Antoine Levitt
  0 siblings, 2 replies; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-01 15:42 UTC (permalink / raw)
  To: ding

Antoine Levitt <antoine.levitt@gmail.com> writes:

>> I think the fix here might be to introduce a new variable
>> `url-asynchronous-dns-lookup' (or something), and then have shr bind it
>> to make url.el more asynchronous.  The pauses we experience today when
>> trying to do asynchronous URL retrieval aren't acceptable.  Especially
>> if a domain is down or semi-down -- reading the articles becomes almost
>> impossible.

Oopsie.  Actually, the image retrieval in Emacs 24 was extremely
synchronous.  Due to changes in `open-network-stream', the :nowait flag
was no longer passed on, so all connections were established
synchronously, even if url.el tried to do it asynchronously.

I've now fixed this and pushed the fix to Emacs 24.

> Why is DNS even synchronous in the first place? I'd expect
> "asynchronous" to mean that both DNS lookup and data transfer are
> non-blocking.

Emacs uses the built-in resolver (from C), and it does it in the main
Emacs thread.  So whenever you resolve something in Emacs, *everything*
in Emacs stops.

I've written the dns.el resolver in Emacs Lisp, but, of course, it's not
as sturdy as the libc one.  (And only works on Linux.)  A better fix
would be to write a C-level resolver that forks its own thread, does the
resolving, and then does a callback.
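
To make the "own thread plus callback" shape concrete, here is a rough
sketch in Python, not C and not Emacs code. The name resolve_async and
the injectable resolver argument are assumptions made for the
illustration; the default resolver is the standard library's blocking
socket.gethostbyname.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# A small pool of worker threads; the blocking lookup runs here
# instead of in the thread that asked for it.
_resolver_pool = ThreadPoolExecutor(max_workers=4)


def resolve_async(host, callback, resolver=socket.gethostbyname):
    """Resolve HOST in a worker thread, then call CALLBACK(host, address, error).

    Exactly one of ADDRESS and ERROR is None.  The callback normally
    runs in the worker thread; a real event loop would have to hand
    the result back to its main thread.
    """

    def work():
        try:
            return resolver(host), None
        except OSError as error:  # socket.gaierror is a subclass
            return None, error

    def done(future):
        address, error = future.result()
        callback(host, address, error)

    future = _resolver_pool.submit(work)
    future.add_done_callback(done)
    return future
```

The caller returns immediately after resolve_async, so a slow or dead
name server only delays the callback for that one host.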

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-01 15:42             ` Lars Magne Ingebrigtsen
@ 2011-05-01 16:06               ` Lars Magne Ingebrigtsen
  2011-05-01 16:34                 ` Adam Sjøgren
  2011-05-01 17:15                 ` Antoine Levitt
  2011-05-02  7:52               ` Antoine Levitt
  1 sibling, 2 replies; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-01 16:06 UTC (permalink / raw)
  To: ding

The other thing about how shr inserts images has to do with the caching.

If shr has already retrieved the images, it will insert them into the
article synchronously.  The idea is that having the text move around a
lot is more annoying than waiting a tenth of a second more to see the
article.

However, if you're reading an article with, say, 500 images, which isn't
unusual in math-ey articles, then inserting all those images takes a
while.

It might, perhaps, make more sense to only insert, say, the first ten
images synchronously, and then switch to asynchronous for the remaining
pictures?  
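
A tiny sketch of that split, in Python for illustration only. The
function insert_images and its insert/defer callables are hypothetical
stand-ins, and the limit of ten is just the number floated above.

```python
def insert_images(images, insert, defer, sync_limit=10):
    """Insert the first SYNC_LIMIT images right away and defer the rest.

    INSERT is called immediately for the leading images; DEFER is
    called for every later one so that it can be scheduled for later.
    """
    for index, image in enumerate(images):
        if index < sync_limit:
            insert(image)
        else:
            defer(image)
```

With 500 cached images this would put ten on screen at once and leave
490 to trickle in afterwards.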

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-01 16:06               ` Lars Magne Ingebrigtsen
@ 2011-05-01 16:34                 ` Adam Sjøgren
  2011-05-01 16:45                   ` Lars Magne Ingebrigtsen
  2011-05-01 17:15                 ` Antoine Levitt
  1 sibling, 1 reply; 47+ messages in thread
From: Adam Sjøgren @ 2011-05-01 16:34 UTC (permalink / raw)
  To: ding

On Sun, 01 May 2011 18:06:00 +0200, Lars wrote:

> It might, perhaps, make more sense to only insert, say, the first ten
> images synchronously, and then switch to asynchronous for the remaining
> pictures?  

Ideally insert the ones that will be visible in the current view
synchronously and the rest asynchronously.

I don't know if that is easy to do, if not, a fixed number sounds like a
good compromise.


  :-),

   Adam

-- 
 "I wanted a computer, not a glorified fruit machine."        Adam Sjøgren
                                                         asjo@koldfront.dk





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-01 16:34                 ` Adam Sjøgren
@ 2011-05-01 16:45                   ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-01 16:45 UTC (permalink / raw)
  To: ding

asjo@koldfront.dk (Adam Sjøgren) writes:

> Ideally insert the ones that will be visible in the current view
> synchronously and the rest asynchronously.
>
> I don't know if that is easy to do, if not, a fixed number sounds like a
> good compromise.

It's rather difficult to do.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-01 16:06               ` Lars Magne Ingebrigtsen
  2011-05-01 16:34                 ` Adam Sjøgren
@ 2011-05-01 17:15                 ` Antoine Levitt
  1 sibling, 0 replies; 47+ messages in thread
From: Antoine Levitt @ 2011-05-01 17:15 UTC (permalink / raw)
  To: ding

01/05/11 18:06, Lars Magne Ingebrigtsen
> The other thing about how shr inserts images has to do with the caching.
>
> If shr has already retrieved the images, it will insert them into the
> article synchronously.  The idea is that having the text move around a
> lot is more annoying than waiting a tenth of a second more to see the
> article.
>
> However, if you're reading an article with, say, 500 images, which isn't
> unusual in math-ey articles, then inserting all those images takes a
> while.

That doesn't seem to be that big an inconvenience: on
gwene.com.wordpress.terrytao (which has pretty long posts with a lot of
images), a fully cached article loads in under a second (perceptible,
but not that annoying). So I'd say the current way of loading cached
images is pretty safe.





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-01 15:42             ` Lars Magne Ingebrigtsen
  2011-05-01 16:06               ` Lars Magne Ingebrigtsen
@ 2011-05-02  7:52               ` Antoine Levitt
  2011-05-02 14:30                 ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 47+ messages in thread
From: Antoine Levitt @ 2011-05-02  7:52 UTC (permalink / raw)
  To: ding

01/05/11 17:42, Lars Magne Ingebrigtsen
> Antoine Levitt <antoine.levitt@gmail.com> writes:
>
>>> I think the fix here might be to introduce a new variable
>>> `url-asynchronous-dns-lookup' (or something), and then have shr bind it
>>> to make url.el more asynchronous.  The pauses we experience today when
>>> trying to do asynchronous URL retrieval aren't acceptable.  Especially
>>> if a domain is down or semi-down -- reading the articles becomes almost
>>> impossible.
>
> Oopsie.  Actually, the image retrieval in Emacs 24 was extremely
> synchronous.  Due to changes in `open-network-stream', the :nowait flag
> was no longer passed on, so all connections were established
> synchronously, even if url.el tried to do it asynchronously.
>
> I've now fixed this and pushed the fix to Emacs 24.

That doesn't change much for my test cases, so I guess the DNS lookup is
indeed the bottleneck.

>
>> Why is DNS even synchronous in the first place? I'd expect
>> "asynchronous" to mean that both DNS lookup and data transfer are
>> non-blocking.
>
> Emacs uses the built-in resolver (from C), and it does it in the main
> Emacs thread.  So whenever you resolve something in Emacs, *everything*
> in Emacs stops.
>
> I've written the dns.el resolver in Emacs Lisp, but, of course, it's not
> as sturdy as the libc one.  (And only works on Linux.)  A better fix
> would be to write a C-level resolver that forks its own thread, does the
> resolving, and then does a callback.

Damn emacs monothread model ;)





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02  7:52               ` Antoine Levitt
@ 2011-05-02 14:30                 ` Lars Magne Ingebrigtsen
  2011-05-02 14:53                   ` Antoine Levitt
                                     ` (3 more replies)
  0 siblings, 4 replies; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-02 14:30 UTC (permalink / raw)
  To: ding

Antoine Levitt <antoine.levitt@gmail.com> writes:

> That doesn't change much for my test cases, so I guess the DNS lookup is
> indeed the bottleneck.

Have you verified that the DNS lookup is really the problem?  Because I
have more theories.  :-) (Or, as scientists call it, "guesses".)

One thing about the async image retrieval is that it's massively
parallel.  If there's 500 images in the article, Gnus will call
`url-retrieve' 500 times in rapid succession, which will then fire off
500 HTTP connections (in parallel, more or less).

I suspect that Emacs doesn't really like having this number of async
socket sentinels going on at the same time, which means that the main
Emacs thread gets no CPU time, which gives us interactive pauses.

(And it's really rude towards the server.  It's basically a one-Emacs
DoS attack.)

So I'm going to rewrite this by adding a new library, called, er,
url-queue.el, which has the same interface as `url-retrieve', but which
manages the parallelism in a more sensible way, by not having more than
(for instance) four connections going at the same time.
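
The general idea of a bounded retrieval queue can be sketched like
this. It is Python for illustration, not the url-queue.el code; the
class RetrievalQueue, its fetch argument, and the callback signature
are all assumptions. The point is only that callers can queue as many
URLs as they like while at most max_parallel fetches run at once.

```python
from concurrent.futures import ThreadPoolExecutor


class RetrievalQueue:
    """Queue URL fetches, running at most MAX_PARALLEL at a time."""

    def __init__(self, fetch, max_parallel=4):
        self._fetch = fetch  # blocking function: url -> body
        # The pool size is what caps the number of concurrent fetches;
        # extra jobs wait in the pool's internal queue.
        self._pool = ThreadPoolExecutor(max_workers=max_parallel)

    def retrieve(self, url, callback):
        """Schedule URL and call CALLBACK(url, body) when it is fetched."""

        def job():
            callback(url, self._fetch(url))

        return self._pool.submit(job)

    def wait(self):
        """Block until every queued fetch has finished."""
        self._pool.shutdown(wait=True)
```

Calling retrieve 500 times returns at once each time; the 500 fetches
then drain four at a time instead of opening 500 connections together.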

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 14:30                 ` Lars Magne Ingebrigtsen
@ 2011-05-02 14:53                   ` Antoine Levitt
  2011-05-02 15:17                     ` Lars Magne Ingebrigtsen
  2011-05-02 17:08                   ` Lars Magne Ingebrigtsen
                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 47+ messages in thread
From: Antoine Levitt @ 2011-05-02 14:53 UTC (permalink / raw)
  To: ding

02/05/11 16:30, Lars Magne Ingebrigtsen
> Antoine Levitt <antoine.levitt@gmail.com> writes:
>
>> That doesn't change much for my test cases, so I guess the DNS lookup is
>> indeed the bottleneck.
>
> Have you verified that the DNS lookup is really the problem?  Because I
> have more theories.  :-) (Or, as scientists call it, "guesses".)

No, just that your recent commit didn't change anything. But the groups
that have trouble, such as gwene.fr.lemonde.blog.vidberg, are also
those whose DNS is slow/buggy.

>
> One thing about the async image retrieval is that it's massively
> parallel.  If there's 500 images in the article, Gnus will call
> `url-retrieve' 500 times in rapid succession, which will then fire off
> 500 HTTP connections (in parallel, more or less).
>
> I suspect that Emacs doesn't really like having this number of async
> socket sentinels going on at the same time, which means that the main
> Emacs thread gets no CPU time, which gives us interactive pauses.

That doesn't seem likely: I also get freezes in groups with just one or
two images. And in this case, doesn't that rather mean that there's
something wrong with the scheduler from emacs/the OS?

>
> (And it's really rude towards the server.  It's basically a one-Emacs
> DoS attack.)

Isn't that already what all browsers do?





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 14:53                   ` Antoine Levitt
@ 2011-05-02 15:17                     ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-02 15:17 UTC (permalink / raw)
  To: ding

Antoine Levitt <antoine.levitt@gmail.com> writes:

> But the groups that have trouble, such as
> gwene.fr.lemonde.blog.vidberg, are also those whose DNS is slow/buggy.

Right.  So getting an async resolver somehow is also necessary.

> That doesn't seem likely: I also get freezes in groups with just one or
> two images. And in this case, doesn't that rather mean that there's
> something wrong with the scheduler from emacs/the OS?

Sure.  :-)

>> (And it's really rude towards the server.  It's basically a one-Emacs
>> DoS attack.)
>
> Isn't that already what all browsers do?

Nope.  They limit the number of concurrent connections to the same
server. 

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 14:30                 ` Lars Magne Ingebrigtsen
  2011-05-02 14:53                   ` Antoine Levitt
@ 2011-05-02 17:08                   ` Lars Magne Ingebrigtsen
  2011-05-02 17:24                     ` Antoine Levitt
  2011-05-02 17:12                   ` Ted Zlatanov
  2011-05-02 21:15                   ` Steinar Bang
  3 siblings, 1 reply; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-02 17:08 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> So I'm going to rewrite this by adding a new library, called, er,
> url-queue.el, which has the same interface as `url-retrieve', but which
> manages the parallelism in a more sensible way, by not having more than
> (for instance) four connections going at the same time.

I've now done this and pushed it to bzr Emacs.  Give it a whirl and see
whether it makes any difference on articles with a lot of images that
are uncached.  

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 14:30                 ` Lars Magne Ingebrigtsen
  2011-05-02 14:53                   ` Antoine Levitt
  2011-05-02 17:08                   ` Lars Magne Ingebrigtsen
@ 2011-05-02 17:12                   ` Ted Zlatanov
  2011-05-02 17:20                     ` Lars Magne Ingebrigtsen
  2011-05-02 21:15                   ` Steinar Bang
  3 siblings, 1 reply; 47+ messages in thread
From: Ted Zlatanov @ 2011-05-02 17:12 UTC (permalink / raw)
  To: ding

On Mon, 02 May 2011 16:30:09 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: 

LMI> Antoine Levitt <antoine.levitt@gmail.com> writes:
>> That doesn't change much for my test cases, so I guess the DNS lookup is
>> indeed the bottleneck.

LMI> Have you verified that the DNS lookup is really the problem?  Because I
LMI> have more theories.  :-) (Or, as scientists call it, "guesses".)

LMI> One thing about the async image retrieval is that it's massively
LMI> parallel.  If there's 500 images in the article, Gnus will call
LMI> `url-retrieve' 500 times in rapid succession, which will then fire off
LMI> 500 HTTP connections (in parallel, more or less).

LMI> I suspect that Emacs doesn't really like having this number of async
LMI> socket sentinels going on at the same time, which means that the main
LMI> Emacs thread gets no CPU time, which gives us interactive pauses.

LMI> (And it's really rude towards the server.  It's basically a one-Emacs
LMI> DoS attack.)

LMI> So I'm going to rewrite this by adding a new library, called, er,
LMI> url-queue.el, which has the same interface as `url-retrieve', but which
LMI> manages the parallelism in a more sensible way, by not having more than
LMI> (for instance) four connections going at the same time.

It may be nice at this point to include libcurl in Emacs.

I would start by testing an article with 500 local images vs. one with
500 remote images.  That will eliminate the network variable.

Ted





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 17:12                   ` Ted Zlatanov
@ 2011-05-02 17:20                     ` Lars Magne Ingebrigtsen
  2011-05-02 21:18                       ` Steinar Bang
  0 siblings, 1 reply; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-02 17:20 UTC (permalink / raw)
  To: ding

Ted Zlatanov <tzz@lifelogs.com> writes:

> It may be nice at this point to include libcurl in Emacs.

Perhaps...  but I have a feeling that it's just as easy to do this at
the Emacs Lisp level as at the C level.

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 17:08                   ` Lars Magne Ingebrigtsen
@ 2011-05-02 17:24                     ` Antoine Levitt
  2011-05-02 17:40                       ` Lars Magne Ingebrigtsen
  2011-05-02 17:53                       ` Lars Magne Ingebrigtsen
  0 siblings, 2 replies; 47+ messages in thread
From: Antoine Levitt @ 2011-05-02 17:24 UTC (permalink / raw)
  To: ding

02/05/11 19:08, Lars Magne Ingebrigtsen
> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
>> So I'm going to rewrite this by adding a new library, called, er,
>> url-queue.el, which has the same interface as `url-retrieve', but which
>> manages the parallelism in a more sensible way, by not having more than
>> (for instance) four connections going at the same time.
>
> I've now done this and pushed it to bzr Emacs.  Give it a whirl and see
> whether it makes any difference on articles with a lot of images that
> are uncached.  

Old code: about 1min10
New code: about 1min30 ;)

That's on gwene.com.wordpress.terrytao, after rm -rf .emacs.d/url/cache

Which I guess means there's a small benefit in parallelizing as much as
possible. But then again, that's an extreme case.





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 17:24                     ` Antoine Levitt
@ 2011-05-02 17:40                       ` Lars Magne Ingebrigtsen
  2011-05-02 21:52                         ` Antoine Levitt
                                           ` (3 more replies)
  2011-05-02 17:53                       ` Lars Magne Ingebrigtsen
  1 sibling, 4 replies; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-02 17:40 UTC (permalink / raw)
  To: ding

Antoine Levitt <antoine.levitt@gmail.com> writes:

> Old code: about 1min10
> New code: about 1min30 ;)

Hey, it does something!

> That's on gwene.com.wordpress.terrytao, after rm -rf .emacs.d/url/cache
>
> Which I guess means there's a small benefit in parallelizing as much as
> possible. But then again, that's an extreme case.

It was supposed to be slower to give a better interactive performance.
:-)  Does it feel snappier now -- that is, can you read page two of one
of his articles without Emacs locking up when you hit SPC?  For me, this
now seems to work adequately, while before I had to wait a longer time.

The next thing to tackle is doing the DNS resolving asynchronously.  I'm
not quite sure where to put that in the chain, though...  Hm...  every
call to `url-queue-retrieve' could fire up an async dns.el call, and
then the callback from that call could trigger (possible) queue
runnings.  And stuff that doesn't resolve at all wouldn't hit url.el at
all.

Yes, I think that would be workable...  but dns.el only works on
Linux-like machines, so it feels kinda yucky to rely on that.

If only somebody could implement an async C-level resolver.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 17:24                     ` Antoine Levitt
  2011-05-02 17:40                       ` Lars Magne Ingebrigtsen
@ 2011-05-02 17:53                       ` Lars Magne Ingebrigtsen
  2011-05-02 18:18                         ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-02 17:53 UTC (permalink / raw)
  To: ding

Antoine Levitt <antoine.levitt@gmail.com> writes:

> That's on gwene.com.wordpress.terrytao, after rm -rf .emacs.d/url/cache

Oh, wow.  I only tried selecting the first message.  If I select an
earlier message, I get the async prefetch stuff going, and the queue
grows to 4k entries.  And Emacs goes very slow, even if there aren't
that many sockets operating.

Hm...

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 17:53                       ` Lars Magne Ingebrigtsen
@ 2011-05-02 18:18                         ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-02 18:18 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Oh, wow.  I only tried selecting the first message.  If I select an
> earlier message, I get the async prefetch stuff going, and the queue
> grows to 4k entries.  And Emacs goes very slow, even if there aren't
> that many sockets operating.

Oops.  I didn't have the url-queue stuff working, since it wasn't
autoloaded.  :-)

So I've now checked in an ";;;###autoload" on the function, and Emacs
responds a bit better.  But selecting an early terrytao article is still
painful, and I'm not sure where things are hanging.

However, I put

76.74.254.126   wordpress.com

into my /etc/hosts, and things got a whole lot spiffier.  So that lends
credence to the theory that the DNS resolving is the main remaining
problem. 

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/





* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 14:30                 ` Lars Magne Ingebrigtsen
                                     ` (2 preceding siblings ...)
  2011-05-02 17:12                   ` Ted Zlatanov
@ 2011-05-02 21:15                   ` Steinar Bang
  3 siblings, 0 replies; 47+ messages in thread
From: Steinar Bang @ 2011-05-02 21:15 UTC (permalink / raw)
  To: ding

>>>>> Lars Magne Ingebrigtsen <larsi@gnus.org>:

> So I'm going to rewrite this by adding a new library, called, er,
> url-queue.el, which has the same interface as `url-retrieve', but which
> manages the parallelism in a more sensible way, by not having more than
> (for instance) four connections going at the same time.

If they are all GETs against the same server, the nicest way for a
client to behave is to just keep the TCP connection up HTTP/1.1 style
and just fire off the GETs pipelined.
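
To make the idea concrete, here is a rough sketch of what pipelining
means on the wire: all requests are written back to back on one
connection before any response is read. The function name
`my-pipelined-requests' is made up for illustration, and nothing here is
part of url.el.

```elisp
(defun my-pipelined-requests (host paths)
  "Return the text of one HTTP/1.1 GET per entry of PATHS, back to back.
HOST fills the Host header.  Hypothetical helper, not a url.el function."
  (mapconcat (lambda (path)
               ;; HTTP/1.1 connections are persistent by default, so no
               ;; Connection header is needed to keep the socket open.
               (format "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n" path host))
             paths
             ""))
```

Writing the returned string to a single network process with
`process-send-string' would send every request up front. The replies
then come back in request order and have to be split apart by the
client, which is the harder half of the job.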








^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 17:20                     ` Lars Magne Ingebrigtsen
@ 2011-05-02 21:18                       ` Steinar Bang
  2011-05-02 22:40                         ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 47+ messages in thread
From: Steinar Bang @ 2011-05-02 21:18 UTC (permalink / raw)
  To: ding

>>>>> Lars Magne Ingebrigtsen <larsi@gnus.org>:

> Perhaps...  but I have a feeling that it's just as easy to do this at
> the Emacs Lisp level as at the C level.

Can you do HTTP pipelining at the emacs lisp level?




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 17:40                       ` Lars Magne Ingebrigtsen
@ 2011-05-02 21:52                         ` Antoine Levitt
  2011-05-04 18:40                         ` Simon Josefsson
                                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 47+ messages in thread
From: Antoine Levitt @ 2011-05-02 21:52 UTC (permalink / raw)
  To: ding

02/05/11 19:40, Lars Magne Ingebrigtsen
> Antoine Levitt <antoine.levitt@gmail.com> writes:
>
>> Old code: about 1min10
>> New code: about 1min30 ;)
>
> Hey, it does something!

Since this was done with the version with the autoload bug, there was
actually no difference between the two :) Yet another proof that numbers
without an estimate on the error don't mean anything ...

>
>> That's on gwene.com.wordpress.terrytao, after rm -rf .emacs.d/url/cache
>>
>> Which I guess means there's a small benefit in parallelizing as much as
>> possible. But then again, that's an extreme case.
>
> It was supposed to be slower to give a better interactive performance.
> :-)  Does it feel snappier now -- that is, can you read page two of one
> of his articles without Emacs locking up when you hit SPC?  For me, this
> now seems to work adequately, while before I had to wait a longer
> time.

Nope, still freezes pretty badly (even with the autoload fix).

>
> The next thing to tackle is doing the DNS resolving asynchronously.  I'm
> not quite sure where to put that in the chain, though...  Hm...  every
> call to `url-queue-retrieve' could fire up an async dns.el call, and
> then the callback from that call could trigger (possible) queue
> runnings.  And stuff that doesn't resolve at all wouldn't hit url.el at
> all.
>
> Yes, I think that would be workable...  but dns.el only works on
> Linux-like machines, so it feels kinda yucky to rely on that.
>
> If only somebody could implement an async C-level resolver.  :-)

Sorry I'm not taking you up on that, but any knowledge I might have once
had about asynchronous coding is long gone, and the only thing I
remember about it is "danger, Will Robinson" ;)

I might end up having a go at it if it becomes too annoying,
but don't hold your breath.




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 21:18                       ` Steinar Bang
@ 2011-05-02 22:40                         ` Lars Magne Ingebrigtsen
  2011-05-03 14:09                           ` Ted Zlatanov
  0 siblings, 1 reply; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-02 22:40 UTC (permalink / raw)
  To: ding

Steinar Bang <sb@dod.no> writes:

> Can you do HTTP pipelining at the emacs lisp level?

I don't think the url.el library supports it, but it shouldn't be too
difficult to implement -- if it's worth it.  Which I'm not sure it is.
Sure, setting up new connections is pretty costly, but not that
costly... 

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 22:40                         ` Lars Magne Ingebrigtsen
@ 2011-05-03 14:09                           ` Ted Zlatanov
  2011-05-03 18:48                             ` Steinar Bang
  0 siblings, 1 reply; 47+ messages in thread
From: Ted Zlatanov @ 2011-05-03 14:09 UTC (permalink / raw)
  To: ding

On Tue, 03 May 2011 00:40:14 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: 

LMI> Steinar Bang <sb@dod.no> writes:
>> Can you do HTTP pipelining at the emacs lisp level?

LMI> I don't think the url.el library supports it, but it shouldn't be too
LMI> difficult to implement -- if it's worth it.  Which I'm not sure it is.
LMI> Sure, setting up new connections is pretty costly, but not that
LMI> costly... 

It would be polite to pipeline.  Say that 10 times fast.

Ted




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-03 14:09                           ` Ted Zlatanov
@ 2011-05-03 18:48                             ` Steinar Bang
  0 siblings, 0 replies; 47+ messages in thread
From: Steinar Bang @ 2011-05-03 18:48 UTC (permalink / raw)
  To: ding

>>>>> Ted Zlatanov <tzz@lifelogs.com>:

> On Tue, 03 May 2011 00:40:14 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: 

LMI> I don't think the url.el library supports it, but it shouldn't be
LMI> too difficult to implement -- if it's worth it.  Which I'm not sure
LMI> it is.  Sure, setting up new connections is pretty costly, but not
LMI> that costly...

> It would be polite to pipeline.  Say that 10 times fast.

Yeah, polite to pipeline!

(Ref "Yeah, charcoal!" http://www.youtube.com/watch?v=Ul72F0YaAcU )






^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 17:40                       ` Lars Magne Ingebrigtsen
  2011-05-02 21:52                         ` Antoine Levitt
@ 2011-05-04 18:40                         ` Simon Josefsson
  2011-05-30 20:55                           ` Lars Magne Ingebrigtsen
  2011-05-05  8:25                         ` Julien Danjou
  2011-05-05 10:48                         ` Ted Zlatanov
  3 siblings, 1 reply; 47+ messages in thread
From: Simon Josefsson @ 2011-05-04 18:40 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> If only somebody could implement an async C-level resolver.  :-)

What about GNU adns?

http://www.chiark.greenend.org.uk/~ian/adns/

/Simon



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 17:40                       ` Lars Magne Ingebrigtsen
  2011-05-02 21:52                         ` Antoine Levitt
  2011-05-04 18:40                         ` Simon Josefsson
@ 2011-05-05  8:25                         ` Julien Danjou
  2011-05-30 20:56                           ` Lars Magne Ingebrigtsen
  2011-05-05 10:48                         ` Ted Zlatanov
  3 siblings, 1 reply; 47+ messages in thread
From: Julien Danjou @ 2011-05-05  8:25 UTC (permalink / raw)
  To: ding

[-- Attachment #1: Type: text/plain, Size: 235 bytes --]

On Mon, May 02 2011, Lars Magne Ingebrigtsen wrote:

> If only somebody could implement an async C-level resolver.  :-)

There's udns:

    http://www.corpit.ru/mjt/udns.html

-- 
Julien Danjou
❱ http://julien.danjou.info

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-02 17:40                       ` Lars Magne Ingebrigtsen
                                           ` (2 preceding siblings ...)
  2011-05-05  8:25                         ` Julien Danjou
@ 2011-05-05 10:48                         ` Ted Zlatanov
  2011-05-05 10:58                           ` Julien Danjou
  2011-05-05 11:17                           ` Antoine Levitt
  3 siblings, 2 replies; 47+ messages in thread
From: Ted Zlatanov @ 2011-05-05 10:48 UTC (permalink / raw)
  To: ding

On Mon, 02 May 2011 19:40:43 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: 

LMI> The next thing to tackle is doing the DNS resolving asynchronously.
...
LMI> If only somebody could implement an async C-level resolver.  :-)

On Thu, 05 May 2011 10:25:15 +0200 Julien Danjou <julien@danjou.info> wrote: 

JD>     http://www.corpit.ru/mjt/udns.html

On Wed, 04 May 2011 20:40:58 +0200 Simon Josefsson <simon@josefsson.org> wrote: 

SJ> http://www.chiark.greenend.org.uk/~ian/adns/

Has anyone tested to see if DNS lookups are actually a problem?  Test an
entirely local page vs. a page with lots of remote image URLs.  I don't
see a difference on my machine.
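
One rough way to check this from inside Emacs is to time a blocking
lookup directly. This is only a sketch: it assumes an Emacs recent
enough to provide `network-lookup-address-info' (27.1 or later, so not
the Emacs of this thread), and `my-time-dns-lookup' is a made-up name.

```elisp
(require 'benchmark)

(defun my-time-dns-lookup (host)
  "Return the seconds spent resolving HOST with a blocking lookup.
Hypothetical helper; the lookup result itself is discarded."
  ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME); keep ELAPSED.
  (car (benchmark-run 1 (network-lookup-address-info host))))
```

Evaluating (my-time-dns-lookup "wordpress.com") twice in a row gives a
hint of whether anything between Emacs and the name server is caching
the answer: if the second call is not clearly faster, every image fetch
may be paying the full lookup cost.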

Typically DNS lookups are very fast and HTML pages don't have a huge mix
of domains for the images they reference, so I think this is wasted
effort *for the specific purpose of rendering HTML faster*.  I can't
think of any practical uses within Emacs for async DNS lookups, either.

Ted




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05 10:48                         ` Ted Zlatanov
@ 2011-05-05 10:58                           ` Julien Danjou
  2011-05-05 11:17                           ` Antoine Levitt
  1 sibling, 0 replies; 47+ messages in thread
From: Julien Danjou @ 2011-05-05 10:58 UTC (permalink / raw)
  To: Ted Zlatanov; +Cc: ding

[-- Attachment #1: Type: text/plain, Size: 933 bytes --]

On Thu, May 05 2011, Ted Zlatanov wrote:

> Has anyone tested to see if DNS lookups are actually a problem?  Test an
> entirely local page vs. a page with lots of remote image URLs.  I don't
> see a difference on my machine.

In a web browser? Because they are parallelized and/or multi-threaded,
maybe?

> Typically DNS lookups are very fast and HTML pages don't have a huge mix
> of domains for the images they reference, so I think this is wasted
> effort *for the specific purpose of rendering HTML faster*.  I can't
> think of any practical uses within Emacs for async DNS lookups, either.

I may be wrong, but I disagree: they are usually fast, except when there
is a problem or you're behind a very slow link. Then they hurt, precisely
because they are serialized and synchronous.

OTOH, I agree that the only useful place for this enhancement is
probably HTML rendering…

-- 
Julien Danjou
❱ http://julien.danjou.info

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05 10:48                         ` Ted Zlatanov
  2011-05-05 10:58                           ` Julien Danjou
@ 2011-05-05 11:17                           ` Antoine Levitt
  2011-05-05 13:51                             ` Ted Zlatanov
                                               ` (2 more replies)
  1 sibling, 3 replies; 47+ messages in thread
From: Antoine Levitt @ 2011-05-05 11:17 UTC (permalink / raw)
  To: ding

05/05/11 12:48, Ted Zlatanov
> On Mon, 02 May 2011 19:40:43 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: 
>
> LMI> The next thing to tackle is doing the DNS resolving asynchronously.
> ...
> LMI> If only somebody could implement an async C-level resolver.  :-)
>
> On Thu, 05 May 2011 10:25:15 +0200 Julien Danjou <julien@danjou.info> wrote: 
>
> JD>     http://www.corpit.ru/mjt/udns.html
>
> On Wed, 04 May 2011 20:40:58 +0200 Simon Josefsson <simon@josefsson.org> wrote: 
>
> SJ> http://www.chiark.greenend.org.uk/~ian/adns/
>
> Has anyone tested to see if DNS lookups are actually a problem?  Test an
> entirely local page vs. a page with lots of remote image URLs.  I don't
> see a difference on my machine.

Maybe your connection or DNS servers are better than mine, or some other
parameter is different, but I definitely experienced freezes that were
DNS-related. Lars also pointed out that explicitly pinning the address in
/etc/hosts sped things up immensely for gwene.com.wordpress.terrytao, so
you could try this if you're not convinced by my admittedly
non-repeatable arguments.

>
> Typically DNS lookups are very fast and HTML pages don't have a huge mix
> of domains for the images they reference, so I think this is wasted
> effort *for the specific purpose of rendering HTML faster*.  I can't
> think of any practical uses within Emacs for async DNS lookups,
> either.

I don't know why, but DNS results don't seem to be cached. I don't know
much about DNS queries so I'm not sure if that's supposed to be the case
or not.




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05 11:17                           ` Antoine Levitt
@ 2011-05-05 13:51                             ` Ted Zlatanov
  2011-05-05 14:01                             ` David Engster
  2011-05-05 14:25                             ` David Engster
  2 siblings, 0 replies; 47+ messages in thread
From: Ted Zlatanov @ 2011-05-05 13:51 UTC (permalink / raw)
  To: ding

On Thu, 05 May 2011 13:17:40 +0200 Antoine Levitt <antoine.levitt@gmail.com> wrote: 

AL> 05/05/11 12:48, Ted Zlatanov
>> On Mon, 02 May 2011 19:40:43 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: 
>> 
LMI> The next thing to tackle is doing the DNS resolving asynchronously.
>> ...
LMI> If only somebody could implement an async C-level resolver.  :-)
>> 
>> On Thu, 05 May 2011 10:25:15 +0200 Julien Danjou <julien@danjou.info> wrote: 
>> 
JD> http://www.corpit.ru/mjt/udns.html
>> 
>> On Wed, 04 May 2011 20:40:58 +0200 Simon Josefsson <simon@josefsson.org> wrote: 
>> 
SJ> http://www.chiark.greenend.org.uk/~ian/adns/
>> 
>> Has anyone tested to see if DNS lookups are actually a problem?  Test an
>> entirely local page vs. a page with lots of remote image URLs.  I don't
>> see a difference on my machine.

AL> Maybe your connection or DNS servers are better than mine, or some other
AL> parameter is different, but I definitely experienced freezes that were
AL> DNS-related. Lars also pointed out that explicitly pinning the address in
AL> /etc/hosts sped things up immensely for gwene.com.wordpress.terrytao, so
AL> you could try this if you're not convinced by my admittedly
AL> non-repeatable arguments.

OK.

>> Typically DNS lookups are very fast and HTML pages don't have a huge mix
>> of domains for the images they reference, so I think this is wasted
>> effort *for the specific purpose of rendering HTML faster*.  I can't
>> think of any practical uses within Emacs for async DNS lookups,
>> either.

AL> I don't know why, but DNS results don't seem to be cached. I don't know
AL> much about DNS queries so I'm not sure if that's supposed to be the case
AL> or not.

It's configurable and depends on the DNS records, but typically they
have a TTL of at least an hour.

On Thu, 05 May 2011 12:58:24 +0200 Julien Danjou <julien@danjou.info> wrote: 

JD> I may be wrong, but I disagree: they are usually fast, except when there
JD> is a problem or you're behind a very slow link. Then they hurt, precisely
JD> because they are serialized and synchronous.

OK, I'll buy that (and Antoine's argument above).

JD> OTOH, I agree that the only useful place for this enhancement is
JD> probably HTML rendering…

There are some DNS-based anti-spam services but I can't think of
anything else that might require it.

Ted




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05 11:17                           ` Antoine Levitt
  2011-05-05 13:51                             ` Ted Zlatanov
@ 2011-05-05 14:01                             ` David Engster
  2011-05-05 14:07                               ` Julien Danjou
  2011-05-05 15:05                               ` Lars Magne Ingebrigtsen
  2011-05-05 14:25                             ` David Engster
  2 siblings, 2 replies; 47+ messages in thread
From: David Engster @ 2011-05-05 14:01 UTC (permalink / raw)
  To: ding

Antoine Levitt writes:
> 05/05/11 12:48, Ted Zlatanov
>> On Mon, 02 May 2011 19:40:43 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: 
>>
>> LMI> The next thing to tackle is doing the DNS resolving asynchronously.
>> ...
>> LMI> If only somebody could implement an async C-level resolver.  :-)
>>
>> On Thu, 05 May 2011 10:25:15 +0200 Julien Danjou <julien@danjou.info> wrote: 
>>
>> JD>     http://www.corpit.ru/mjt/udns.html
>>
>> On Wed, 04 May 2011 20:40:58 +0200 Simon Josefsson <simon@josefsson.org> wrote: 
>>
>> SJ> http://www.chiark.greenend.org.uk/~ian/adns/
>>
>> Has anyone tested to see if DNS lookups are actually a problem?  Test an
>> entirely local page vs. a page with lots of remote image URLs.  I don't
>> see a difference on my machine.
>
> Maybe your connection or DNS servers are better than mine, or some other
> parameter is different, but I definitely experienced freezes that were
> DNS-related. Lars also pointed out that explicitly pinning the address in
> /etc/hosts sped things up immensely for gwene.com.wordpress.terrytao, so
> you could try this if you're not convinced by my admittedly
> non-repeatable arguments.

I highly doubt this has anything to do with DNS. Usually there's some
kind of cache involved and you shouldn't notice any delay after the
first request. Just run tcpdump and look at the traffic. I see exactly
one DNS request, but I still see a massive delay when retrieving
articles from the mentioned group.

I also see an insane number of connections being opened by the Emacs url
package. There seems to be no upper limit. Just use

(setq url-debug t)

and look in the *URL DEBUG* buffer, or use netstat to see the number of
connections. It's more or less a little DoS on the server (I've seen
over 200 connections easily), and most servers are configured to
throttle stuff like this. You then have over 200 sentinels waiting to
parse the HTTP headers. No wonder this is slow. The right thing to do is
to open a moderate number of connections (the max-connections-per-server
in Firefox is 15) and then reuse those.
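
For reference, capping the number of simultaneous fetches might look
roughly like this with url-queue.el as it ships in later Emacs releases.
The library was still being written during this thread, so the names
here may not match the version under discussion, and `my-fetch-image' is
a made-up helper. Note that this limits parallel connections; it does
not by itself reuse them.

```elisp
(require 'url-queue)

;; The queue is drained later from timers and process sentinels, so the
;; cap has to be set globally; a `let' binding would be gone by then.
(setq url-queue-parallel-processes 4)

(defun my-fetch-image (url)
  "Queue URL for retrieval and report the response size when it arrives.
Hypothetical helper built on `url-queue-retrieve'."
  (url-queue-retrieve
   url
   (lambda (status)
     ;; STATUS is the event plist described in the `url-retrieve' docs;
     ;; the current buffer holds the raw response (headers and body).
     (message "Fetched %d bytes (status: %S)" (buffer-size) status))
   nil   ; no extra callback arguments
   t))   ; SILENT: do not echo progress messages
```

Calling `my-fetch-image' on a few hundred URLs should then leave at most
four transfers running at a time, with the rest waiting in the queue.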

> I don't know why, but DNS results don't seem to be cached. I don't know
> much about DNS queries so I'm not sure if that's supposed to be the case
> or not.

Usually your local DNS server will cache them for you. Some operating
systems also run a caching daemon, which usually caches several
services at once (DNS, LDAP, etc.).

-David



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05 14:01                             ` David Engster
@ 2011-05-05 14:07                               ` Julien Danjou
  2011-05-05 15:05                               ` Lars Magne Ingebrigtsen
  1 sibling, 0 replies; 47+ messages in thread
From: Julien Danjou @ 2011-05-05 14:07 UTC (permalink / raw)
  To: ding

[-- Attachment #1: Type: text/plain, Size: 371 bytes --]

On Thu, May 05 2011, David Engster wrote:

> I highly doubt this has anything to do with DNS. 

There's a chance there are two different problems, then. I have never had
200+ connections open to an HTTP server, and the hang I keep talking about
is easily reproducible by unplugging the network cable from my computer.

-- 
Julien Danjou
❱ http://julien.danjou.info

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05 11:17                           ` Antoine Levitt
  2011-05-05 13:51                             ` Ted Zlatanov
  2011-05-05 14:01                             ` David Engster
@ 2011-05-05 14:25                             ` David Engster
  2011-05-05 14:34                               ` Antoine Levitt
  2 siblings, 1 reply; 47+ messages in thread
From: David Engster @ 2011-05-05 14:25 UTC (permalink / raw)
  To: ding

Antoine Levitt writes:
> Maybe your connection or DNS servers are better than mine, or some other
> parameter is different, but I definitely experienced freezes that were
> DNS-related. Lars also pointed out that explicitly pinning the address in
> /etc/hosts sped things up immensely for gwene.com.wordpress.terrytao, so
> you could try this if you're not convinced by my admittedly
> non-repeatable arguments.

I have a question though. You've said in your OP that you got a smooth
display with emacs-w3m, but I'm guessing it didn't display all the
images at once, right? Emacs-w3m is pretty neat in that respect: it
immediately displays the text content, and then in the background
(slowly) replaces the alt-text with the images, so you can immediately
start reading.

-David



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05 14:25                             ` David Engster
@ 2011-05-05 14:34                               ` Antoine Levitt
  2011-05-05 14:59                                 ` David Engster
  0 siblings, 1 reply; 47+ messages in thread
From: Antoine Levitt @ 2011-05-05 14:34 UTC (permalink / raw)
  To: ding

05/05/11 16:25, David Engster
> Antoine Levitt writes:
>> Maybe your connection or DNS servers are better than mine, or some other
>> parameter is different, but I definitely experienced freezes that were
>> DNS-related. Lars also pointed out that explicitly pinning the address in
>> /etc/hosts sped things up immensely for gwene.com.wordpress.terrytao, so
>> you could try this if you're not convinced by my admittedly
>> non-repeatable arguments.
>
> I have a question though. You've said in your OP that you got a smooth
> display with emacs-w3m, but I'm guessing it didn't display all the
> images at once, right? Emacs-w3m is pretty neat in that respect: it
> immediately displays the text content, and then in the background
> (slowly) replaces the alt-text with the images, so you can immediately
> start reading.

That's right. shr is supposed to do the same thing, and I think the only
thing standing in its way is DNS resolving. (but I wouldn't bet money on
it)




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05 14:34                               ` Antoine Levitt
@ 2011-05-05 14:59                                 ` David Engster
  2011-05-05 15:00                                   ` David Engster
  0 siblings, 1 reply; 47+ messages in thread
From: David Engster @ 2011-05-05 14:59 UTC (permalink / raw)
  To: ding

Antoine Levitt writes:
> 05/05/11 16:25, David Engster
>> Antoine Levitt writes:
>>> Maybe your connection or DNS servers are better than mine, or some other
>>> parameter is different, but I definitely experienced freezes that were
>>> DNS-related. Lars also pointed out that explicitly pinning the address in
>>> /etc/hosts sped things up immensely for gwene.com.wordpress.terrytao, so
>>> you could try this if you're not convinced by my admittedly
>>> non-repeatable arguments.
>>
>> I have a question though. You've said in your OP that you got a smooth
>> display with emacs-w3m, but I'm guessing it didn't display all the
>> images at once, right? Emacs-w3m is pretty neat in that respect: it
>> immediately displays the text content, and then in the background
>> (slowly) replaces the alt-text with the images, so you can immediately
>> start reading.
>
> That's right. shr is supposed to do the same thing, and I think the only
> thing standing in its way is DNS resolving. (but I wouldn't bet money on
> it)

I only skimmed through the code, but I think it only does that for cid:
URLs? Lars should answer this.

The thing is: you don't always know what you'll get. Look at the
terrytao source, and you'll see that the images are dynamically
generated through a PHP script. You can only tell from the Content-Type
of the response that this is actually a PNG.

-David



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05 14:59                                 ` David Engster
@ 2011-05-05 15:00                                   ` David Engster
  0 siblings, 0 replies; 47+ messages in thread
From: David Engster @ 2011-05-05 15:00 UTC (permalink / raw)
  To: ding

David Engster writes:
> The thing is: you don't always know what you'll get. Look at the
> terrytao source, and you'll see that the images are dynamically
> generated through a PHP script. You can only tell from the Content-Type
> of the response that this is actually a PNG.

But it's of course wrapped in an <img> tag, so please forget that I ever
wrote that...

-David




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05 14:01                             ` David Engster
  2011-05-05 14:07                               ` Julien Danjou
@ 2011-05-05 15:05                               ` Lars Magne Ingebrigtsen
  2011-05-05 19:49                                 ` David Engster
  1 sibling, 1 reply; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-05 15:05 UTC (permalink / raw)
  To: ding

David Engster <deng@randomsample.de> writes:

> and look in the *URL DEBUG* buffer, or use netstat to see the number of
> connections. It's more or less a little DoS on the server (I've seen
> over 200 connections easily)

This is fixed in bzr Emacs.

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05 15:05                               ` Lars Magne Ingebrigtsen
@ 2011-05-05 19:49                                 ` David Engster
  2011-05-05 20:07                                   ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 47+ messages in thread
From: David Engster @ 2011-05-05 19:49 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen writes:
> David Engster <deng@randomsample.de> writes:
>
>> and look in the *URL DEBUG* buffer, or use netstat to see the number of
>> connections. It's more or less a little DoS on the server (I've seen
>> over 200 connections easily)
>
> This is fixed in bzr Emacs.

And indeed it is. I'm also on a different machine now, but reading
terrytao through gmane and latest emacs-bzr now works as expected and
similar to emacs-w3m.

-David



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05 19:49                                 ` David Engster
@ 2011-05-05 20:07                                   ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-05 20:07 UTC (permalink / raw)
  To: ding

David Engster <deng@randomsample.de> writes:

> And indeed it is. I'm also on a different machine now, but reading
> terrytao through gmane and latest emacs-bzr now works as expected and
> similar to emacs-w3m.

It needs a bit more work to manage the queue length.  If you enter
the terrytao group and select the first message, the queue will grow to
800 entries, I think.  And the queue is handled sequentially.  So if you
exit the group, url-queue will continue to fetch all the images, which
may not be what you want, because you won't be able to see any other
images in other groups until it's finished.

But I was unsure what to do about it.  Hm...  perhaps just setting
`url-queue' to nil on group exit is sufficient?  But there should
probably also be an interactive `M-x url-queue-stop' command to allow
the user to do this themselves...
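
A minimal sketch of that idea, assuming the `url-queue' variable keeps
its current meaning as the list of jobs. `my-url-queue-stop' is a
stand-in name for the command suggested above, and hooking it into
`gnus-exit-group-hook' is just one guess at the right moment to run it.

```elisp
(require 'url-queue)

(defun my-url-queue-stop ()
  "Forget every job recorded in `url-queue'.
Hypothetical command; transfers already in flight are not killed."
  (interactive)
  (setq url-queue nil))

;; Assumption: leaving a group is the right moment to drop pending images.
(add-hook 'gnus-exit-group-hook #'my-url-queue-stop)
```

Since this only empties the list, connections that are already open
would still run to completion and call their callbacks; a fuller
version would probably want to delete those processes as well.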

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-04 18:40                         ` Simon Josefsson
@ 2011-05-30 20:55                           ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-30 20:55 UTC (permalink / raw)
  To: ding

Simon Josefsson <simon@josefsson.org> writes:

> What about GNU adns?
>
> http://www.chiark.greenend.org.uk/~ian/adns/

Looks good to me.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Asynchroneous image retrieval in HTML rendering
  2011-05-05  8:25                         ` Julien Danjou
@ 2011-05-30 20:56                           ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 47+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-05-30 20:56 UTC (permalink / raw)
  To: ding

Julien Danjou <julien@danjou.info> writes:

> There's udns:
>
>     http://www.corpit.ru/mjt/udns.html

Looks even nicer for Emacs use.

Now, if only someone would put udns into Emacs...  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/




^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2011-05-30 20:56 UTC | newest]

Thread overview: 47+ messages
2011-04-23 22:18 Asynchroneous image retrieval in HTML rendering Antoine Levitt
2011-04-28 20:03 ` Antoine Levitt
2011-04-29  9:04 ` Julien Danjou
2011-04-29  9:18   ` Antoine Levitt
2011-04-29 10:01     ` Julien Danjou
2011-04-29 20:34       ` Antoine Levitt
2011-05-01 15:13         ` Lars Magne Ingebrigtsen
2011-05-01 15:23           ` Antoine Levitt
2011-05-01 15:42             ` Lars Magne Ingebrigtsen
2011-05-01 16:06               ` Lars Magne Ingebrigtsen
2011-05-01 16:34                 ` Adam Sjøgren
2011-05-01 16:45                   ` Lars Magne Ingebrigtsen
2011-05-01 17:15                 ` Antoine Levitt
2011-05-02  7:52               ` Antoine Levitt
2011-05-02 14:30                 ` Lars Magne Ingebrigtsen
2011-05-02 14:53                   ` Antoine Levitt
2011-05-02 15:17                     ` Lars Magne Ingebrigtsen
2011-05-02 17:08                   ` Lars Magne Ingebrigtsen
2011-05-02 17:24                     ` Antoine Levitt
2011-05-02 17:40                       ` Lars Magne Ingebrigtsen
2011-05-02 21:52                         ` Antoine Levitt
2011-05-04 18:40                         ` Simon Josefsson
2011-05-30 20:55                           ` Lars Magne Ingebrigtsen
2011-05-05  8:25                         ` Julien Danjou
2011-05-30 20:56                           ` Lars Magne Ingebrigtsen
2011-05-05 10:48                         ` Ted Zlatanov
2011-05-05 10:58                           ` Julien Danjou
2011-05-05 11:17                           ` Antoine Levitt
2011-05-05 13:51                             ` Ted Zlatanov
2011-05-05 14:01                             ` David Engster
2011-05-05 14:07                               ` Julien Danjou
2011-05-05 15:05                               ` Lars Magne Ingebrigtsen
2011-05-05 19:49                                 ` David Engster
2011-05-05 20:07                                   ` Lars Magne Ingebrigtsen
2011-05-05 14:25                             ` David Engster
2011-05-05 14:34                               ` Antoine Levitt
2011-05-05 14:59                                 ` David Engster
2011-05-05 15:00                                   ` David Engster
2011-05-02 17:53                       ` Lars Magne Ingebrigtsen
2011-05-02 18:18                         ` Lars Magne Ingebrigtsen
2011-05-02 17:12                   ` Ted Zlatanov
2011-05-02 17:20                     ` Lars Magne Ingebrigtsen
2011-05-02 21:18                       ` Steinar Bang
2011-05-02 22:40                         ` Lars Magne Ingebrigtsen
2011-05-03 14:09                           ` Ted Zlatanov
2011-05-03 18:48                             ` Steinar Bang
2011-05-02 21:15                   ` Steinar Bang
