Gnus development mailing list
* is there a possibility for gnus to download data without blocking?
@ 2020-08-21  4:06 Wayne Harris
  2020-08-21  4:39 ` Eric Abrahamsen
  0 siblings, 1 reply; 12+ messages in thread
From: Wayne Harris @ 2020-08-21  4:06 UTC (permalink / raw)
  To: ding

Is there a possibility for gnus to download data without blocking?
Wouldn't it be nice to be able to keep using EMACS while Gnus is
downloading?

Is it a limitation of EMACS itself?  If so, are there any plans for
supporting such things in the future?  Wouldn't it be nice?




* Re: is there a possibility for gnus to download data without blocking?
  2020-08-21  4:06 is there a possibility for gnus to download data without blocking? Wayne Harris
@ 2020-08-21  4:39 ` Eric Abrahamsen
  2020-08-21 10:31   ` dick.r.chiang
  2020-08-21 14:18   ` Wayne Harris
  0 siblings, 2 replies; 12+ messages in thread
From: Eric Abrahamsen @ 2020-08-21  4:39 UTC (permalink / raw)
  To: ding

Wayne Harris <wharris1@protonmail.com> writes:

> Is there a possibility for gnus to download data without blocking?
> Wouldn't it be nice to be able to keep using EMACS while Gnus is
> downloading?
>
> Is it a limitation of EMACS itself?  If so, are there any plans for
> supporting such things in the future?  Wouldn't it be nice?

It would be lovely! There are a few issues: Emacs is single-threaded,
though it has the ability to continue execution while waiting on IO from
an external process. So theoretically we can already "download data
without blocking". In fact, Gnus already does this in a limited way:
when you hit "g", it starts an async external process for each of your
servers (each that involves an external process, anyway), then polls
each one until they're all done, and then continues with updating its
state.

That means there's really only a benefit when you have multiple servers
that can overlap their IO, and you're still going to wait as long as the
longest server takes.
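
A minimal sketch of the start-everything-then-poll pattern described
above.  This is not Gnus's actual code: `my/fetch-all-blocking' is a
hypothetical helper, and the argv lists stand in for real server
connections.  Only `make-process', `process-live-p' and
`accept-process-output' are standard Emacs functions.

```elisp
(require 'cl-lib)

(defun my/fetch-all-blocking (commands)
  "Start one async process per argv list in COMMANDS, then wait for all.
Return the processes' outputs as strings, in the order of COMMANDS.
Hypothetical helper for illustration; not part of Gnus."
  (let ((procs (mapcar
                (lambda (argv)
                  (make-process
                   :name "my-fetch"
                   :buffer (generate-new-buffer " *my-fetch*")
                   :command argv
                   :connection-type 'pipe
                   ;; Replace the default sentinel, which would insert a
                   ;; status message into the output buffer.
                   :sentinel #'ignore))
                commands)))
    ;; All processes run at once, but the caller stays blocked in this
    ;; loop until the slowest one has exited.
    (while (cl-some #'process-live-p procs)
      (accept-process-output nil 0.05))
    (mapcar (lambda (proc)
              ;; Drain any output still pending for PROC.
              (while (accept-process-output proc 0.01))
              (let ((buf (process-buffer proc)))
                (prog1 (with-current-buffer buf (buffer-string))
                  (kill-buffer buf))))
            procs)))
```

The IO of the processes overlaps, yet the user gets control back only
when the `while' loop ends, which is the limitation discussed here.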

In theory we could have a Gnus that fires off all the servers and then
returns control to the user immediately, but that would involve handling
out-of-band returns as they came in from the servers, and Gnus would
have to be structured very differently than it is now to manage that.
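
For comparison, a sketch of the "return control immediately" variant,
under the assumption that the out-of-band result is delivered through a
process sentinel.  `my/fetch-async', `my/fetch-sentinel' and the
`my/callback' property are hypothetical names, not anything in Gnus.

```elisp
(defun my/fetch-sentinel (proc _event)
  "Hand PROC's collected output to its callback once PROC has exited.
The callback is read from the hypothetical `my/callback' property."
  (unless (process-live-p proc)
    (let ((buf (process-buffer proc))
          (callback (process-get proc 'my/callback)))
      (when (buffer-live-p buf)
        (unwind-protect
            (funcall callback (with-current-buffer buf (buffer-string)))
          (kill-buffer buf))))))

(defun my/fetch-async (argv callback)
  "Run the command ARGV asynchronously and return the process at once.
CALLBACK is called later, from the sentinel, with the output string."
  (let ((proc (make-process
               :name "my-fetch"
               :buffer (generate-new-buffer " *my-fetch*")
               :command argv
               :connection-type 'pipe
               :sentinel #'my/fetch-sentinel)))
    (process-put proc 'my/callback callback)
    proc))

;; Usage (the command is a placeholder):
;; (my/fetch-async '("echo" "hello")
;;                 (lambda (output)
;;                   (message "got %d chars" (length output))))
```

The callback runs whenever Emacs next waits for input, so whatever it
touches must tolerate being updated while the user is in the middle of
something else.  That is the structural change referred to above.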

Eric




* Re: is there a possibility for gnus to download data without blocking?
  2020-08-21  4:39 ` Eric Abrahamsen
@ 2020-08-21 10:31   ` dick.r.chiang
  2020-08-21 17:42     ` Eric Abrahamsen
  2020-08-21 14:18   ` Wayne Harris
  1 sibling, 1 reply; 12+ messages in thread
From: dick.r.chiang @ 2020-08-21 10:31 UTC (permalink / raw)
  To: Eric Abrahamsen; +Cc: ding

EA> In theory we could have a Gnus that fires off all the servers and then
EA> returns control to the user immediately

https://github.com/dickmao/gnus does just that.  It's about a 3500
line change.  Perhaps 1-2 years from now I will make a concerted
effort to merge, but given,

1. the changes are not true multithreading, and
2. gnus users, being of a different generation, are generally content
   to manually retrieve and block,

I give the integration less than 50% chance of seeing the light of day.



* Re: is there a possibility for gnus to download data without blocking?
  2020-08-21  4:39 ` Eric Abrahamsen
  2020-08-21 10:31   ` dick.r.chiang
@ 2020-08-21 14:18   ` Wayne Harris
  2020-08-21 17:16     ` Eric Abrahamsen
  1 sibling, 1 reply; 12+ messages in thread
From: Wayne Harris @ 2020-08-21 14:18 UTC (permalink / raw)
  To: ding

Eric Abrahamsen <eric@ericabrahamsen.net> writes:

> Wayne Harris <wharris1@protonmail.com> writes:
>
>> Is there a possibility for gnus to download data without blocking?
>> Wouldn't it be nice to be able to keep using EMACS while Gnus is
>> downloading?
>>
>> Is it a limitation of EMACS itself?  If so, are there any plans for
>> supporting such things in the future?  Wouldn't it be nice?
>
> It would be lovely! There are a few issues: Emacs is single-threaded,
> though it has the ability to continue execution while waiting on IO from
> an external process. So theoretically we can already "download data
> without blocking". In fact, Gnus already does this in a limited way:
> when you hit "g", it starts an async external process for each of your
> servers (each that involves an external process, anyway), then polls
> each one until they're all done, and then continues with updating its
> state.

The fact that it uses external processes seems totally acceptable and
perhaps even desirable.  I like external processes.  That means I could
look at that part of the system individually using my shell.

> That means there's really only a benefit when you have multiple servers
> that can overlap their IO, and you're still going to wait as long as the
> longest server takes.

Interesting.  I haven't noticed that.  For example, when I say ``A A''
on the group buffer to fetch a list of all groups, I see each news
server being read one after the other.  But you weren't talking about
``A A''.  You were talking about ``g'', and I can't really tell what
happens when I say ``g'', so I'll believe you.

> In theory we could have a Gnus that fires off all the servers and then
> returns control to the user immediately, but that would involve handling
> out-of-band returns as they came in from the servers, and Gnus would
> have to be structured very differently than it is now to manage that.

I don't expect ever to be able to download a bunch of data and have it
all very fast on my screen.  But I do think that a system being always
faster than the user gives the user an impression of total control,
which is pleasurable.  So I would consider that a very nice improvement
for Gnus itself.

And for EMACS, I can't see the need for anything else because if it can
poll on external processes, that is powerful enough for anything I could
think of.




* Re: is there a possibility for gnus to download data without blocking?
  2020-08-21 14:18   ` Wayne Harris
@ 2020-08-21 17:16     ` Eric Abrahamsen
  2020-08-21 20:30       ` Wayne Harris
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Abrahamsen @ 2020-08-21 17:16 UTC (permalink / raw)
  To: ding

Wayne Harris <wharris1@protonmail.com> writes:

> Eric Abrahamsen <eric@ericabrahamsen.net> writes:
>
>> Wayne Harris <wharris1@protonmail.com> writes:
>>
>>> Is there a possibility for gnus to download data without blocking?
>>> Wouldn't it be nice to be able to keep using EMACS while Gnus is
>>> downloading?
>>>
>>> Is it a limitation of EMACS itself?  If so, are there any plans for
>>> supporting such things in the future?  Wouldn't it be nice?
>>
>> It would be lovely! There are a few issues: Emacs is single-threaded,
>> though it has the ability to continue execution while waiting on IO from
>> an external process. So theoretically we can already "download data
>> without blocking". In fact, Gnus already does this in a limited way:
>> when you hit "g", it starts an async external process for each of your
>> servers (each that involves an external process, anyway), then polls
>> each one until they're all done, and then continues with updating its
>> state.
>
> The fact that it uses external processes seems totally acceptable and
> perhaps even desirable.  I like external processes.  That means I could
> look at that part of the system individually using my shell.
>
>> That means there's really only a benefit when you have multiple servers
>> that can overlap their IO, and you're still going to wait as long as the
>> longest server takes.
>
> Interesting.  I haven't noticed that.  For example, when I say ``A A''
> on the group buffer to fetch a list of all groups, I see each news
> server being read one after the other.  But you weren't talking about
> ``A A''.  You were talking about ``g'', and I can't really tell what
> happens when I say ``g'', so I'll believe you.

Huh, I have never in my life used "A A". I just tried it, and indeed the
connections are made synchronously/consecutively. That's not a surprise,
because "g" (gnus-group-get-new-news) goes to quite a lot of trouble to
make the connections asynchronous, and that work isn't done elsewhere in
the code.

>> In theory we could have a Gnus that fires off all the servers and then
>> returns control to the user immediately, but that would involve handling
>> out-of-band returns as they came in from the servers, and Gnus would
>> have to be structured very differently than it is now to manage that.
>
> I don't expect ever to be able to download a bunch of data and have it
> all very fast on my screen.  But I do think that a system being always
> faster than the user gives the user an impression of total control,
> which is pleasurable.  So I would consider that a very nice improvement
> for Gnus itself.

The main issue is: if you get control of Emacs back right away, what
happens when the server response comes in? Work still needs to be done
to parse the response and incorporate it into Gnus' state, and that work
needs to be done in lisp, which means the user is probably going to get
interrupted again, to some extent. Let's see how Dick is handling it! :)




* Re: is there a possibility for gnus to download data without blocking?
  2020-08-21 10:31   ` dick.r.chiang
@ 2020-08-21 17:42     ` Eric Abrahamsen
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Abrahamsen @ 2020-08-21 17:42 UTC (permalink / raw)
  To: ding

dick.r.chiang@gmail.com writes:

> EA> In theory we could have a Gnus that fires off all the servers and then
> EA> returns control to the user immediately
>
> https://github.com/dickmao/gnus does just that.  It's about a 3500
> line change.  Perhaps 1-2 years from now I will make a concerted
> effort to merge, but given,
>
> 1. the changes are not true multithreading, and
> 2. gnus users, being of a different generation, are generally content
>    to manually retrieve and block,
>
> I give the integration less than 50% chance of seeing the light of day.

Interesting! Is this the same patchset you proposed on emacs.bugs a
while ago? I understood that that was about threading, but didn't
realize that the fetch process actually returned immediately. How do you
handle the "surprise" updating, from the user's point of view?

I will take a look at your repo when I have time for more Gnus
deep-dives, which should be in... less than 1-2 years from now? I hope.
I see that you're getting rid of global
gnus-summary-buffer/gnus-article-buffer, and I think those are great
changes regardless of whatever else comes on top of that. Would you be
interested in getting those changes in first?

Eric




* Re: is there a possibility for gnus to download data without blocking?
  2020-08-21 17:16     ` Eric Abrahamsen
@ 2020-08-21 20:30       ` Wayne Harris
  2020-08-22  2:27         ` Wayne Harris
  0 siblings, 1 reply; 12+ messages in thread
From: Wayne Harris @ 2020-08-21 20:30 UTC (permalink / raw)
  To: ding

Eric Abrahamsen <eric@ericabrahamsen.net> writes:

[...]

>>> In theory we could have a Gnus that fires off all the servers and then
>>> returns control to the user immediately, but that would involve handling
>>> out-of-band returns as they came in from the servers, and Gnus would
>>> have to be structured very differently than it is now to manage that.
>>
>> I don't expect ever to be able to download a bunch of data and have it
>> all very fast on my screen.  But I do think that a system being always
>> faster than the user gives the user an impression of total control,
>> which is pleasurable.  So I would consider that a very nice improvement
>> for Gnus itself.
>
> The main issue is: if you get control of Emacs back right away, what
> happens when the server response comes in? Work still needs to be done
> to parse the response and incorporate it into Gnus' state, and that work
> needs to be done in lisp, which means the user is probably going to get
> interrupted again, to some extent.

I would guess the result would be a lot better because EMACS is fast and
the network, slow.  You continue to use EMACS as the data comes in.
When it's all local, Gnus takes control of EMACS but finishes quickly
because ELISP is fast.  So you'd notice it a lot less, if at all.

> Let's see how Dick is handling it! :)

Yes, let's do that.  I bet it won't be easy for me, but I'll give it a
try after dinner today, that is, within 6 hours.  Expect my report.




* Re: is there a possibility for gnus to download data without blocking?
  2020-08-21 20:30       ` Wayne Harris
@ 2020-08-22  2:27         ` Wayne Harris
  2020-08-22 10:45           ` dick.r.chiang
  0 siblings, 1 reply; 12+ messages in thread
From: Wayne Harris @ 2020-08-22  2:27 UTC (permalink / raw)
  To: ding

Wayne Harris <wharris1@protonmail.com> writes:
> Eric Abrahamsen <eric@ericabrahamsen.net> writes:

[...]

>> Let's see how Dick is handling it! :)
>
> Yes, let's do that.  I bet it won't be easy for me, but I'll give it a
> try after dinner today, that is, within 6 hours.  Expect my report.

A quick attempt at compiling it will likely not work because I'm
running a Windows system without a resourceful toolbox for all kinds
of software.

--8<---------------cut here---------------start------------->8---
%./autogen.sh
Checking whether you have the necessary tools...
(Read INSTALL.REPO for more details on building Emacs)
Checking for autoconf (need at least version 2.65) ... missing
[...]
--8<---------------cut here---------------end--------------->8---

Of course.  Won't even install such tools because I'm sure there's
many more needed.  Let's read the source code without running it.  We
surely should get to interesting places if we start at the Lisp function
gnus-group-list-active or perhaps gnus-group-get-new-news.  Indeed,
we find the docstring

--8<---------------cut here---------------start------------->8---
(defun gnus-group-get-new-news (&optional arg one-level background)
  "Get newly arrived articles.
[...]
If BACKGROUND then run `gnus-get-unread-articles' in a separate thread.
--8<---------------cut here---------------end--------------->8---

GNU Emacs 24 has no such BACKGROUND option.  We can see in dickmao's
source code that there are changes to gnus-get-unread-articles too
because it's also taking the BACKGROUND argument, just like the
docstring suggests.  That seems to be the only change dickmao added.
So let's read gnus-get-unread-articles.

It's a long procedure.  As a quick side note, reading dickmao's
gnus-start.el we learn that the variable

  gnus-threaded-get-unread-articles

is required for the threaded modus operandi, just like his README
discreetly mentioned with 
 
  (custom-set-variables '(gnus-threaded-get-unread-articles t)).

The docstring for the variable confirms with

--8<---------------cut here---------------start------------->8---
  "Instantiate parallel threads for `gnus-get-unread-articles' which encapsulates
most of the network retrieval when `gnus-group-get-new-news' is run."
--8<---------------cut here---------------end--------------->8---

and denounces dickmao's line with 81 characters, not counting the
newline!  He's a longliner and we'll use that against him in court for
sure.  ``We're gonna nail his ass.''

The new function has the following signature:

--8<---------------cut here---------------start------------->8---
(cl-defun gnus-get-unread-articles (&optional requested-level dont-connect
                                              one-level background
                                    &aux (level (gnus-group-default-level
                                                 requested-level t)))
  "Go through `gnus-newsrc-alist' and compare with `gnus-active-hashtb'
  and compute how many unread articles there are in each group."
[...]
--8<---------------cut here---------------end--------------->8---

Dickmao forked GNU Emacs 27.0.50.  It will be useful to compare his code
against a version that's close such as emacs-27.0.90.  In any case, I am
not able to understand much.  

I believe he builds a list of THINGS to do in a list called COMMANDS.
These THINGS are procedures he builds with applications of
apply-partially.  If BACKGROUND was requested, then all THINGS to do,
which are stored in COMMANDS, are handed to the following call:

   (apply #'gnus-run-thread
          gnus-mutex-get-unread-articles
          thread-group
          commands)

At this point I surmise that each server will be handled by a separate
thread and each thread will run all the THINGS contained in COMMANDS in
the order they appear.  That's why he uses gnus-push-end.  He's ordering
each task of each thread.  What happens in parallel, therefore, is just
the conversation with each separate server.
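
A minimal sketch of that reading, assuming an Emacs built with thread
support (Emacs 26 or later).  It does not use the fork's
gnus-run-thread or gnus-mutex-get-unread-articles, whose definitions
are not shown here.  `my/run-commands', `my/spawn-server-threads',
`my/results' and the three `format' calls standing in for the real
per-server steps are all hypothetical.

```elisp
(defvar my/results-mutex (make-mutex "my/results")
  "Guards `my/results'.  Hypothetical; for illustration only.")

(defvar my/results nil
  "Alist of (SERVER . STEP-RESULTS), filled in by the worker threads.")

(defun my/run-commands (server commands)
  "Call each function in COMMANDS in order, then record SERVER's results."
  (let ((log (mapcar #'funcall commands)))
    (with-mutex my/results-mutex
      (push (cons server log) my/results))))

(defun my/spawn-server-threads (servers)
  "Start one thread per name in SERVERS and return the list of threads.
Each thread runs its own ordered list of partially applied steps."
  (mapcar
   (lambda (server)
     (let ((commands
            ;; Placeholders for steps such as opening the server and
            ;; reading its active file.
            (list (apply-partially #'format "open %s" server)
                  (apply-partially #'format "retrieve %s" server)
                  (apply-partially #'format "read-active %s" server))))
       (make-thread (apply-partially #'my/run-commands server commands)
                    (format "my-thread %s" server))))
   servers))

;; Usage: wait for every thread, then inspect `my/results'.
;; (mapc #'thread-join (my/spawn-server-threads '("news.a" "news.b")))
```

Within one thread the steps keep their order.  Nothing is promised
about the order in which different servers finish.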

Once all threads are done, a procedure of name CODA will be run.  If
BACKGROUND is not true, then there is no thread for anything and CODA
will do it all.  CODA is not a procedure defined in gnus-start.el and
it's not clear what it is or what it does.

What each thread does sequentially is

  gnus-open-server
  gnus-retrieve-group-data-early
  gnus-read-active-for-groups

et cetera.  A buffer called

  *gnus-thread <THREAD-NAME>*

is created for each thread, so it's likely these buffers collect the
data produced by each server.  Now we would have to see how the data is
used and Gnus buffers are updated.

At this point, we should compile and run the software to confirm the
theory exposed here.

The question was how dickmao does it.  In summary, he added
procedures like make-thread and now he's just using it. :-) But I think
the real thing you wanted to know was how he would handle, for example,
the intermixing of different servers returning data and
perhaps not finishing and all the complications of network talk and also
updating the buffers.  I would expect that when one server delivers all
the information, stored in a buffer likely named

  *gnus-thread <THREAD-NAME>*

then dickmao updates Gnus buffers with that information.  If a server
does not produce sensible information, that buffer is discarded and it's
as if we never spoke to that server at all.  The updating of Gnus
buffers would not happen in parallel, of course.  It's only the talk to
the network that does.

One question comes up now.  Dickmao seems to have parallelized the
network conversations, which is perhaps possible to do with external
processes too.  For instance, spawn an external process for each server,
collect up all the data in a separate buffer.  Same strategy as
dickmao's, but using processes, not threads.  

Perhaps Gnus currently does things sequentially and blocks on the
network talk?  If so, it would be way better if it just wouldn't block.
Maybe we don't need dickmao's threads after all.  But, yes, we're back
to speculation.  I guess we like it. :-)




* Re: is there a possibility for gnus to download data without blocking?
  2020-08-22  2:27         ` Wayne Harris
@ 2020-08-22 10:45           ` dick.r.chiang
  2020-08-22 15:52             ` Wayne Harris
  0 siblings, 1 reply; 12+ messages in thread
From: dick.r.chiang @ 2020-08-22 10:45 UTC (permalink / raw)
  To: Wayne Harris; +Cc: ding

I am gratified someone studied my changes.

WH> CODA ... and it's not clear what it is or what it does.

CODA merely updates the message counts in *Group* after retrieval threads
complete.

WH> But I think the real thing you wanted to know was how he would handle
WH> the intermixing of different servers returning data ...

As the 90s pop song goes, one has to "keep'em separated."  This means making
"total war" on the monolithicity of the summary, article, and nntp buffers
which Gnus has since inception assumed to be global singletons.

WH> Same strategy as dickmao's, but using processes, not threads.
WH> If so, it would be way better if [Gnus] just wouldn't block.  Maybe we
WH> don't need dickmao's threads after all.

As EA previously stated, shunting all retrieval logic to a separate process B
still poses the serious problem of integrating B's results in
the main process A.  Moreover, using multiple processes raises the ornery
question of interprocess communication, consumes more system resources,
and admits the likelihood of orphaned processes.






* Re: is there a possibility for gnus to download data without blocking?
  2020-08-22 10:45           ` dick.r.chiang
@ 2020-08-22 15:52             ` Wayne Harris
  2020-08-22 16:11               ` dick.r.chiang
  0 siblings, 1 reply; 12+ messages in thread
From: Wayne Harris @ 2020-08-22 15:52 UTC (permalink / raw)
  To: ding

dick.r.chiang@gmail.com writes:

> I am gratified someone studied my changes.

:-)

> WH> CODA ... and it's not clear what it is or what it does.
>
> CODA merely updates the message counts in *Group* after retrieval threads
> complete.

Why is it called CODA and where is the definition of the procedure?

> WH> But I think the real thing you wanted to know was how he would handle
> WH> the intermixing of different servers returning data ...
>
> As the 90s pop song goes, one has to "keep'em separated."  

I think that's ``Come Out and Play'', by The Offspring, or something like
that. :-)

> This means making "total war" on the monolithicity of the summary,
> article, and nntp buffers which Gnus has since inception assumed to be
> global singletons.

My idea is that nearly everything stays as it is.  I think the only
thing that should be ``parallelized'' is the download of data from nntp
servers.

> WH> Same strategy as dickmao's, but using processes, not threads.
> WH> If so, it would be way better if [Gnus] just wouldn't block.  Maybe we
> WH> don't need dickmao's threads after all.
>
> As EA previously stated, shunting all retrieval logic to a separate process B
> still poses the serious problem of integrating B's results in
> the main process A.  

Yes, but I guess the difficulty is nearly the same as the one you're
facing with your threads.  You have to manage the threads; you'd have to
manage the processes.  Are you saying threads are easier to manage than
processes?




* Re: is there a possibility for gnus to download data without blocking?
  2020-08-22 15:52             ` Wayne Harris
@ 2020-08-22 16:11               ` dick.r.chiang
  2020-08-22 17:07                 ` Eric Abrahamsen
  0 siblings, 1 reply; 12+ messages in thread
From: dick.r.chiang @ 2020-08-22 16:11 UTC (permalink / raw)
  To: Wayne Harris; +Cc: ding

WH> Why is it called CODA and where is the definition of the procedure?

*coda* is a musical term for an afterword.

https://github.com/dickmao/gnus/blob/ddd82c2b1261b2c5d5425c0738112f47e2bb18bf/lisp/gnus/gnus-start.el#L1789-L1799

WH> Yes, but I guess the difficulty is nearly the same as the one you're
WH> facing with your threads.

Having been around the block innumerable times, I know better than to try
convincing the gallery of my club choice and approach shot.  I've offered my
surface judgment of processes versus threads.  Further speculation without
actual proof-of-concept coding is almost never productive (again something I've
learned over many years).




* Re: is there a possibility for gnus to download data without blocking?
  2020-08-22 16:11               ` dick.r.chiang
@ 2020-08-22 17:07                 ` Eric Abrahamsen
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Abrahamsen @ 2020-08-22 17:07 UTC (permalink / raw)
  To: ding

dick.r.chiang@gmail.com writes:

> WH> Why is it called CODA and where is the definition of the procedure?
>
> *coda* is a musical term for an afterword.
>
> https://github.com/dickmao/gnus/blob/ddd82c2b1261b2c5d5425c0738112f47e2bb18bf/lisp/gnus/gnus-start.el#L1789-L1799
>
> WH> Yes, but I guess the difficulty is nearly the same as the one you're
> WH> facing with your threads.
>
> Having been around the block innumerable times, I know better than to try
> convincing the gallery of my club choice and approach shot.  I've offered my
> surface judgment of processes versus threads.  Further speculation without
> actual proof-of-concept coding is almost never productive (again something I've
> learned over many years).

I'm not sure what we're talking about here -- there isn't a choice
between threads and processes. Processes are necessary to talk to the
external servers, and they can be synchronous or asynchronous. Elisp
threads don't actually *do* very much, they just isolate an execution
environment, and make it easier (potentially *much* easier) to reason
about execution. You can't achieve any concurrency with threads that you
couldn't without them. Or am I missing the point of this discussion?
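
A small sketch of what "threads don't actually do very much" looks like
in practice, assuming Emacs 26 or later built with thread support.
`my/worker', `my/trace' and `my/run-two-workers' are hypothetical names.
As far as I understand the implementation, only one Lisp thread runs at
a time, and a switch can happen only when the running thread yields or
waits, so the workers make progress only once the main thread blocks in
`thread-join'.

```elisp
(defvar my/trace nil
  "Strings pushed by the worker threads, newest first.")

(defun my/worker (tag)
  "Push two entries marked with TAG onto `my/trace', yielding after each."
  (dotimes (i 2)
    (push (format "%s%d" tag i) my/trace)
    ;; Explicitly offer the interpreter to another thread.
    (thread-yield)))

(defun my/run-two-workers ()
  "Run two workers to completion and return the trace, oldest first."
  (setq my/trace nil)
  (let ((a (make-thread (apply-partially #'my/worker "a") "worker-a"))
        (b (make-thread (apply-partially #'my/worker "b") "worker-b")))
    ;; The main thread gives up the interpreter only while waiting here.
    (thread-join a)
    (thread-join b)
    (reverse my/trace)))
```

The interleaving of the "a" and "b" entries depends on the scheduler and
should not be relied on.  No two pushes ever run at the same instant,
which is why the plain `push' needs no lock here.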

Eric


