Blue (or was it yellow?) GNUS suggestions

Gnus development mailing list

* Blue (or was it yellow?) GNUS suggestions
@ 1996-05-25 21:31 Sudish Joseph
  1996-05-27  1:49 ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 7+ messages in thread
From: Sudish Joseph @ 1996-05-25 21:31 UTC (permalink / raw)


I'm now set up at home with a 28.8 link and Linux.  So, all sorts of
bandwidth-related ideas now occur to me (read: netcom's lines suck :-)


* Caching the list of newsgroups locally.  
Setting gnus-read-active-file to 'some solves the speed problem in the
most common case (startup), but it makes it very painful to subscribe
to new/boring-until-now groups.  Setting gnus-save-killed-list to 't
solves this, but makes exit very slow.  I guess what I'm asking is if
it's possible to split the killed list from the .eld file, so that
it's not necessary to read and write such a large chunk of data every
session.  Viewing it as a reduced (w/o article ranges) local copy of
the active file might be better than just calling it a killed list.
Updating this cache when we see new groups (gnus-check-new-newsgroups
is 'ask-server, of course) would keep things very neat.


* Deferring splitting of mail in nnml groups to group entry.  
This isn't related to bandwidth in any way, but I might as well bring
it up here.  The actual speed hit in nnml is in the writing of the
articles to individual files, not in the nov file generation.  So,
just spooling all articles to one file per nnml group that would be
split into separate files upon group selection would be neat.  This
would still allow for nov generation at split time.  At OSU, I was on
the perl-porters mailing list, a very high-volume list.  I mention OSU
only because our NFS setup sucked so hard speedwise.  Which meant that
I had to sit through a long split session for that one list every time
I fired up gnus, even though I read that list once a week or so.  This
was one of the main reasons I stuck with asynchronous mail delivery.
I think this, or having nnfolder with nov, would be a very cool option
for people with NFS mounted home directories.  Note that there isn't
much difference between what I describe above and having nnfolder use
nov--the only difference is that we do an extra split at group entry
time.  Now that I have a local ext2 filesys to spool to, asynch
splitting isn't worth the trouble to me and I also get to use sexps to
describe my splitting, instead of procmail's tortuous syntax, but I
think it's important that GNUS makes things better for people with
slow writes to their home directory--which covers a lot of people in
larger universities, I'd say.
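
A rough sketch of the deferred-split idea, in Python rather than elisp
so it can be run standalone.  Everything in it is hypothetical (the
DeferredSpool class, the "spool" file name, the length-prefixed record
format); it is not how nnml works.  It only illustrates paying one
append per incoming message and postponing the per-article writes
until the group is entered.

```python
import os
import tempfile


class DeferredSpool:
    """Hypothetical illustration: one spool file per group, split on entry."""

    def __init__(self, root):
        self.root = root

    def _group_dir(self, group):
        path = os.path.join(self.root, group)
        os.makedirs(path, exist_ok=True)
        return path

    def deliver(self, group, message):
        """Append one message to the group's spool: a single cheap write."""
        data = message.encode("utf-8")
        with open(os.path.join(self._group_dir(group), "spool"), "ab") as f:
            # Length-prefixed records avoid in-band separator problems.
            f.write(b"%d\n" % len(data))
            f.write(data)

    def enter_group(self, group):
        """Split the spool into numbered article files; return the new numbers."""
        gdir = self._group_dir(group)
        spool = os.path.join(gdir, "spool")
        if not os.path.exists(spool):
            return []
        existing = [int(n) for n in os.listdir(gdir) if n.isdigit()]
        next_num = max(existing, default=0) + 1
        created = []
        with open(spool, "rb") as f:
            while True:
                header = f.readline()
                if not header:
                    break
                body = f.read(int(header.decode("ascii")))
                with open(os.path.join(gdir, str(next_num)), "wb") as out:
                    out.write(body)
                created.append(next_num)
                next_num += 1
        os.remove(spool)
        return created


if __name__ == "__main__":
    spool = DeferredSpool(tempfile.mkdtemp())
    spool.deliver("perl-porters", "Subject: one\n\nbody")
    spool.deliver("perl-porters", "Subject: two\n\nbody")
    print(spool.enter_group("perl-porters"))
```

The slow per-file writes still happen, but only for groups that are
actually read, and only at the time they are read.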


* Prefetching of articles in the next group.
The nntp-async stuff is a real godsend for dialup lines--once you've
waited for that first article, subsequent articles are quick to
select.  It'd be nice to extend this to work across group boundaries.
Also, I think the current async stuff doesn't handle the case where
you go back and read an earlier article in the group too well.  Even
though that article is in the async buffer, it seems to refetch it.
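
To make the "don't refetch what we already have" point concrete, here
is a minimal Python sketch.  It is an assumption-laden illustration,
not the nntp-async code: ArticleCache and the fetch callable are made
up, and the prefetch here is synchronous where a real one would run in
the background.

```python
class ArticleCache:
    """Hypothetical cache: keeps every article fetched or prefetched."""

    def __init__(self, fetch, lookahead=2):
        self.fetch = fetch          # assumed callable: article number -> text
        self.lookahead = lookahead  # how many upcoming articles to prefetch
        self.store = {}

    def get(self, number, upcoming=()):
        """Return an article, fetching it only if it is not cached yet."""
        if number not in self.store:
            self.store[number] = self.fetch(number)
        article = self.store[number]
        # Prefetch the next few articles the reader is likely to select.
        for n in list(upcoming)[: self.lookahead]:
            if n not in self.store:
                self.store[n] = self.fetch(n)
        return article
```

Going back to an earlier article is then a dictionary lookup, and the
same upcoming list could just as well name the first articles of the
next group.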


* Some way to force GNUS to drop all active tcp connections to the
  NNTP server and open them anew.
I'm using diald to manage my connections.  While gnus uses a fresh
connection for articles in a group, it maintains the "control"
connection (the connection used for switching groups, etc.).  This
means that if diald drops my line due to inactivity and then fires it
up again, the control connection is invalid since I have a new IP
address.  A command in the group buffer to force GNUS to re-open the
control connection, while maintaining current state info, would be
very useful.  I'm not suggesting we go to the extreme of http; reusing
existing connections is much better, IMO, except when the connection
itself is invalid (there's no way for gnus to determine when the
connection is invalid, so it has to be a user command).
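
A small Python sketch of what such a command might do, assuming the
only state worth restoring is the currently selected group.  The
NewsSession class and the connection interface (send and close) are
hypothetical; GROUP is the ordinary NNTP command for selecting a
group.

```python
class NewsSession:
    """Hypothetical session: remembers enough state to survive a reconnect."""

    def __init__(self, connect):
        self.connect = connect      # assumed factory returning a connection
        self.conn = connect()
        self.current_group = None

    def select_group(self, group):
        self.conn.send("GROUP " + group)
        self.current_group = group

    def reopen(self):
        """User command: drop the control connection and open a fresh one."""
        try:
            self.conn.close()
        except OSError:
            pass  # The old connection is probably dead already.
        self.conn = self.connect()
        # Replay the state the server needs to know about.
        if self.current_group is not None:
            self.conn.send("GROUP " + self.current_group)
```

The reader decides when to call reopen, since nothing on the client
side can tell that the old connection has gone stale.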


* Incremental/asynchronous group entry.
This one is much harder, and probably should be left until Lars has
cycled through all the colors in the rainbow :-) It has also been
discussed before.  Presently, upon entering a large group, I wait for
a longish bit as the headers get sucked in (all hail the great god
XOVER, without whose intervention life would be a void) and then for a
shorter, but still significant, period while gnus prepares the
summary.  Incremental threading and summary generation (ugly thought,
but GNUS does most of the required magic already when you fetch a
parent article) would make a huge difference.  As I see it, the only
real obstacle here is in the thread and sorting code--sorting thread
roots while _incrementally updating_ threads seems difficult to me.  I
would happily forgo sorting if I could hit SPC on a newsgroup and
enter it immediately.  This won't be such an attractive option for
people with bad newsfeeds, since article numbers will be badly skewed
w.r.t. posting date (IBM's newsfeed often gets articles 14-15 days
after they're posted, this makes sorting by date a necessity :( ).
I'm willing to bet that quite a few would prefer faster group entry to
sorting.  Netscape does asynchronous threading, if you haven't checked
it out.


-- 
Sudish 



* Re: Blue (or was it yellow?) GNUS suggestions
  1996-05-25 21:31 Blue (or was it yellow?) GNUS suggestions Sudish Joseph
@ 1996-05-27  1:49 ` Lars Magne Ingebrigtsen
  1996-05-28  0:45   ` Sudish Joseph
  0 siblings, 1 reply; 7+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-05-27  1:49 UTC (permalink / raw)

Sudish Joseph <sudish@vnet.ibm.com> writes:

> * Caching the list of newsgroups locally.  
> Setting gnus-read-active-file to 'some solves the speed problem in the
> most common case (startup), but it makes it very painful to subscribe
> to new/boring-until-now groups.  Setting gnus-save-killed-list to 't
> solves this, but makes exit very slow.  I guess what I'm asking is if
> it's possible to split the killed list from the .eld file, so that
> it's not necessary to read and write such a large chunk of data every
> session.  Viewing it as a reduced (w/o article ranges) local copy of
> the active file might be better than just calling it a killed list.
> Updating this cache when we see new groups (gnus-check-new-newsgroups
> is 'ask-server, of course) would keep things very neat.

Well -- whenever new groups arrive, the cache would have to be
updated.  Which would be just as slow as things are now, more or
less.  (Most days at least a couple of groups arrive.)

> * Deferring splitting of mail in nnml groups to group entry.  
> This isn't related to bandwidth in any way, but I might as well bring
> it up here.  The actual speed hit in nnml is in the writing of the
> articles to individual files, not in the nov file generation.  So,
> just spooling all articles to one file per nnml group that would be
> split into separate files upon group selection would be neat. 

Sounds like quite a lot of work, and not that much gain, so I don't
think I'll write it, at least.

> I think this, or having nnfolder with nov, would be a very cool option
> for people with NFS mounted home directories. 

nnfolder+nov would be a possibility, but I'm not sure that would be
much of a speedup, really.  

> * Prefetching of articles in the next group.

This is on the Red Gnus todo list.  I've written a new implementation
of nntp.el which is fully & totally asynchronous, so I think there
just might be lots of this sort of thing in Red Gnus.

> * Some way to force GNUS to drop all active tcp connections to the
>   NNTP server and open them anew.

This is on the Red Gnus todo list.

> * Incremental/asynchronous group entry.

By far the most time spent is in actually generating the summary
buffer.  (Unless you sort by date -- then sorting takes most of the
time.)  So this would be possible.  The thread generation could be put
in a daemonic process that would output one thread at a time and let
the user read while it's generating.  (Hey -- it could even generate
the next group while you're reading the current one.  :-)

It wouldn't actually be that much work either, I think.  We'll see.

-- 
  "Yes.  The journey through the human heart 
     would have to wait until some other time."


* Re: Blue (or was it yellow?) GNUS suggestions
  1996-05-27  1:49 ` Lars Magne Ingebrigtsen
@ 1996-05-28  0:45   ` Sudish Joseph
  1996-05-28 19:52     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 7+ messages in thread
From: Sudish Joseph @ 1996-05-28  0:45 UTC (permalink / raw)
  Cc: ding

Lars writes:
> Sudish Joseph <sudish@vnet.ibm.com> writes:
>> * Caching the list of newsgroups locally.  
>> Updating this cache when we see new groups (gnus-check-new-newsgroups
>> is 'ask-server, of course) would keep things very neat.
>
> Well -- whenever new groups arrive, the cache would have to be
> updated.  Which would be just as slow as things are now, more or
> less.  (Most days at least a couple of groups arrive.)

Ugh, good point.  How about letting the user decide when to update the
cached file?  We could store the date of the last update in the cache
itself, maintaining this date separately from the date used for the
current NEWGROUPS stuff?  I.e., view the cache as a generic local copy
of the active file, to be used whenever we need to know the names of
all newsgroups--no need to tie it in with killed groups/new groups/etc.
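
Roughly what I have in mind, as a Python sketch rather than elisp.
The GroupNameCache class, the JSON file layout and the list_groups
callable are all assumptions made for illustration; none of this
exists in Gnus.  The point is only that the cache carries its own
update date and is rebuilt when the user asks, not on every exit.

```python
import json
import os
import tempfile
import time


class GroupNameCache:
    """Hypothetical local copy of the group names, with its own date."""

    def __init__(self, path):
        self.path = path
        self.updated = None   # time of the last refresh, stored in the file
        self.groups = []

    def load(self):
        """Read the cache from disk; return False if there is none yet."""
        try:
            with open(self.path) as f:
                data = json.load(f)
        except FileNotFoundError:
            return False
        self.updated = data["updated"]
        self.groups = data["groups"]
        return True

    def refresh(self, list_groups, now=None):
        """Rebuild the cache on request; list_groups is an assumed callable."""
        self.groups = sorted(list_groups())
        self.updated = time.time() if now is None else now
        with open(self.path, "w") as f:
            json.dump({"updated": self.updated, "groups": self.groups}, f)

    def matching(self, prefix):
        """Names starting with prefix, e.g. for subscribing to a new group."""
        return [g for g in self.groups if g.startswith(prefix)]


if __name__ == "__main__":
    cache = GroupNameCache(os.path.join(tempfile.mkdtemp(), "groups.json"))
    cache.refresh(lambda: ["comp.lang.perl", "alt.test"])
    print(cache.matching("comp."))
```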

> The thread generation could be put in a daemonic process that would
> output one thread at a time and let the user read while it's

While this would be very cool in itself, I was thinking more in terms
of rewriting the core thread/summary code as a loop that checks the
buffer associated with the NNTP connection, sees if any new headers
have arrived, slurps all new headers in a list, generates/updates
current threads/summary info for that list, then loops back to check
the nntp buffer, etc. 
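
In Python terms (purely a sketch, with made-up names), the loop would
feed each clump of newly arrived headers to something like the class
below.  Headers are simplified to (message id, parent id) pairs; real
code would derive the parent from the References header.  For brevity
the summary is re-rendered from scratch after each clump and no
sorting is attempted; only the thread structure is updated
incrementally.

```python
from collections import defaultdict


class IncrementalThreader:
    """Hypothetical threader that accepts headers one clump at a time."""

    def __init__(self):
        self.parent = {}                    # message id -> parent id or None
        self.children = defaultdict(list)   # parent id -> child ids

    def add_clump(self, headers):
        """Record a batch of (message id, parent id) pairs."""
        for msg_id, parent_id in headers:
            self.parent[msg_id] = parent_id
            if parent_id is not None:
                self.children[parent_id].append(msg_id)

    def roots(self):
        """Ids with no parent, or whose parent has not arrived yet."""
        return [m for m, p in self.parent.items()
                if p is None or p not in self.parent]

    def render(self):
        """Return the summary as indented lines, one per article."""
        lines = []

        def walk(msg_id, depth):
            lines.append("  " * depth + msg_id)
            for child in self.children.get(msg_id, []):
                walk(child, depth + 1)

        for root in self.roots():
            walk(root, 0)
        return lines


def summarize_as_headers_arrive(clumps):
    """Yield a fresh summary after each clump of headers."""
    threader = IncrementalThreader()
    for clump in clumps:
        threader.add_clump(clump)
        yield threader.render()
```

A reply that shows up before its parent is displayed as a temporary
root and moves under the parent once that header arrives.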

(It's been months since I did any process interaction elisp, but isn't
the actual update of a process/connection buffer already fully
asynchronous in emacs?)

Anyways, having a separate process actually generate thread info would
be very cool, too.  I looked into having mthreads (comes with trn, or
used to) generate lisp output (even text would be nice, instead of
its current binary madness) about 14 - 16 months back--it's not hard
to do, the code is well commented.  I had it outputting summaries on
threads very quickly.  It would need to be rewritten to be
demand-driven rather than run as a daemon (I think that was how it was
coded, but it's been a long while.)

Also, if we're going to go down this path, it'd be nice if we were in
synch with the TRN group on the format of the actual thread data
passed to emacs.  I recall Wayne Davison calling for people interested
in threshing out a thread info format appropriate for NNTP servers
some time back (a year, maybe).  I don't know if anything ever came
out of it, but you might want to check...

A standardised thread data format, along with a UA-independent daemon
that generated this data would be way cool.  I'd assume that the TRN
people (and any other UA authors, but it's mainly *IX based readers
that will benefit from a user-space daemon; other systems might have
to wait for servers to support) would be happy to work out a neutral
format.

E.g., having fields embedded in strings, with "[" and "]" around
them would make it easy for gnus to use the lisp reader to parse this
data, and should please the parentheses-haters, too.  At the very
least, it'll have to be text-based (mthreads currently outputs raw
structs, so that TRN can read it in even faster--or it did, last I
looked).

Thinking about it some more, the daemon approach has other good things
going for it.  For instance, the mere existence of a thread database
format wouldn't help lots of people as they would still have to wait
for INN to be upgraded and wait even longer for lazy sysadmins to
update their local servers.  Using daemons would solve this, and help
such a format gain popularity quickly.

> generating.  (Hey -- it could even generate the next group while
> you're reading the current one.  :-)

Coolness.

-Sudish


* Re: Blue (or was it yellow?) GNUS suggestions
  1996-05-28  0:45   ` Sudish Joseph
@ 1996-05-28 19:52     ` Lars Magne Ingebrigtsen
  1996-05-28 22:59       ` Sudish Joseph
  0 siblings, 1 reply; 7+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-05-28 19:52 UTC (permalink / raw)

Sudish Joseph <sudish@ix.netcom.com> writes:

> How about letting the user decide when to update the cached file?
> We could store the date of the last update in the cache itself,
> maintaining this date separately from the date used for the current
> NEWGROUPS stuff?  I.e., view the cache as a generic local copy of
> the active file, to be used whenever we need to know the names of
> all newsgroups--no need to tie it in with killed groups/new
> groups/etc.

That's a possibility.  Or the cache could be updated when the list of
killed groups reaches a certain length or something.  I've added this to
the Red Gnus todo list.

> While this would be very cool in itself, I was thinking more in terms
> of rewriting the core thread/summary code as a loop that checks the
> buffer associated with the NNTP connection, sees if any new headers
> have arrived, slurps all new headers in a list, generates/updates
> current threads/summary info for that list, then loops back to check
> the nntp buffer, etc. 

That sounds very complicated.  And is it useful?  How many new
articles typically arrive for the group you're reading while you're
reading the group?

> Anyways, having a separate process actually generate thread info would
> be very cool, too.  I looked into having mthreads (comes with trn, or
> used to) generate lisp output (even text would be nice, instead of
> its current binary madness) about 14 - 16 months back--it's not hard
> to do, the code is well commented.  I had it outputting summaries on
> threads very quickly.  It would need to be rewritten to be
> demand-driven rather than run as a daemon (I think that was how it was
> coded, but it's been a long while.)

I don't quite see how Gnus would be helped by this.  The way Gnus
threads things is highly user-customizable (gathering, fuzzy subject
comparison, sparse nodes, ancient articles, ad nauseam).  If we had a
separate process spewing out the threads at us, then we'd lose that
Lispish extensibility.

Anyways, the threading itself is no problem.  The most time spent is
used to generate the summary buffer, and there no external process can
help us.

-- 
  "Yes.  The journey through the human heart 
     would have to wait until some other time."


* Re: Blue (or was it yellow?) GNUS suggestions
  1996-05-28 19:52     ` Lars Magne Ingebrigtsen
@ 1996-05-28 22:59       ` Sudish Joseph
  1996-05-29  9:30         ` Per Abrahamsen
  1996-05-31  7:28         ` Lars Magne Ingebrigtsen
  0 siblings, 2 replies; 7+ messages in thread
From: Sudish Joseph @ 1996-05-28 22:59 UTC (permalink / raw)
  Cc: ding

Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:
> Sudish Joseph <sudish@ix.netcom.com> writes:
> > a loop that checks the buffer associated with the NNTP connection,
> > sees if any new headers have arrived, slurps all new headers in a
> > list, generates/updates current threads/summary info for that
> > list, then loops back to check the nntp buffer, etc.
>
> That sounds very complicated.  And is it useful?  How many new
> articles typically arrive for the group you're reading while you're
> reading the group?

No, I was rambling about doing this for _every single_ header.  Well,
clumps of headers, where clump is defined as all headers in the *nntp*
buffer at the time you get to the end of the buffer.

On a fast link, you'd get the same behaviour as today, since more
headers would arrive in the time GNUS finished inspecting the ones
that were there.

On a slow link, GNUS could get a hell of a lot done in the time it sat
waiting for _all_ the headers to arrive.

But yeah, it'd be very complex.  Just figuring out how to let the
user select an article in the midst of the above loop would be pain
enough for anyone.

> I don't quite see how Gnus would be helped by this.  The way Gnus
> threads things is highly user-customizable (gathering, fuzzy subject
> comparison, sparse nodes, ancient articles, ad nauseam).  If we had a
> separate process spewing out the threads at us, then we'd lose that
> Lispish extensibility.

Well, that's why the format of the thread info becomes important.  It
should be general enough to support all that and more :-) Um, fuzzy
subjects are a sorting issue, imho.  If INN was maintaining this info,
one can imagine stuff like "gimme the roots of all threads rooted in
articles that arrived after article n", etc.

> Anyways, the threading itself is no problem.  The most time spent is
> used to generate the summary buffer, and there no external process can
> help us.

Ugh, clarity was never my strong suit.  Here's another attempt.

Currently all events that occur in the summary generation process have to
wait upon the arrival of _all requested headers_.  IMO, it'd be a big
win if GNUS went ahead and prepared the summary for the headers it
already has, thus distributing the cost of summary generation over the
time wasted waiting for all headers.  Obviously, this is not a big win
if you're sitting on a fast, unsaturated network--there's a saturated
token ring here at work where it'd help, though :)--but it won't be a
loss either.

It would also improve the responsiveness of GNUS for all users, since
you'd be dropped into a summary buffer almost as soon as you selected
the group.  To get this for people on a fast network, we'd have to add
some crock like always grabbing the first n headers (n being small)
and generating a summary using those before going back to what I
outline above.

This would be a _huge_ win for people on dialup links, and mostly
irrelevant for people with a network card (well, the improvement in
responsiveness would help them, too).

But, like you said, it'd be a pain to code/maintain.

-Sudish


* Re: Blue (or was it yellow?) GNUS suggestions
  1996-05-28 22:59       ` Sudish Joseph
@ 1996-05-29  9:30         ` Per Abrahamsen
  1996-05-31  7:28         ` Lars Magne Ingebrigtsen
  1 sibling, 0 replies; 7+ messages in thread
From: Per Abrahamsen @ 1996-05-29  9:30 UTC (permalink / raw)



I don't know about how practical it would be, but the visual effect of
a summary that was being built/threaded/sorted concurrently with the
incoming headers would be stunning.

You'd need an `add-this-header-to-the-summary-buffer' command, and a
process watch that calls that function each time a new header becomes
available. 



* Re: Blue (or was it yellow?) GNUS suggestions
  1996-05-28 22:59       ` Sudish Joseph
  1996-05-29  9:30         ` Per Abrahamsen
@ 1996-05-31  7:28         ` Lars Magne Ingebrigtsen
  1 sibling, 0 replies; 7+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-05-31  7:28 UTC (permalink / raw)


"Sudish Joseph" <sudish@VNET.IBM.COM> writes:

> IMO, it'd be a big win if GNUS went ahead and prepared the summary
> for the headers it already has, thus distributing the cost of
> summary generation over the time wasted waiting for all headers.

Yeah, that would be ultra-kool.  Hm.  One could do this sort of thing
without having an external threader type thing as well.  But one would
have to write the thing in C.  Updating/removing/inserting things in
this manner in Elisp would just be far too slow.

But it would be seriously cool.  

-- 
  "Yes.  The journey through the human heart 
     would have to wait until some other time."

