Gnus development mailing list
From: Albrecht Kadlec <albrecht@anaphi.vlsivie.tuwien.ac.at>
Cc: ding@ifi.uio.no, larsi@ifi.uio.no
Subject: Archive: Zip-/Directories? (was: Re: Old messages from this list in Digest Format via FTP ?)
Date: Tue, 6 Feb 96 14:06 MET	[thread overview]
Message-ID: <m0tjn66-0000z9C@anaphi.vlsivie.tuwien.ac.at> (raw)
In-Reply-To: <199602050720.BAA29446@sina.hpc.uh.edu>

>>>>> Jason L Tibbitts writes:

>>>>> "AS" == Aharon Schkolnik <aharon@healdb.matat.health.gov.il> writes:
AS> Can I FTP the old messages from this list in digest format from
AS> somewhere ?

J> Well, I keep the archives and they're not digested.  Perhaps you already
J> know that you can type G a in the group buffer to get a newsgroup
J> containing the last 500 articles from this list, or C-u G a to get the
J> _entire_ archive.  WARNING: that's currently over 7000 articles and the
J> overview is 1.4MB.  It's probably pretty useless at this point, but then
J> info that old is probably useless, too.  Whatever happened to the proposal
J> to allow nnml folders to contain several subdirectories?

I saw someone mention it again about a month ago, but I couldn't find the
article.

It was something like packing articles together into chunks of a specific
size - say 10K.  The reason was to keep FTP overhead to a minimum (1K
requests were said to be very inefficient).

I wanted to add a "please support zipped .overviews and chunks": "closed"
.overviews could then easily be zipped, as could the 10K chunks; only the
active chunk and the current .overview wouldn't necessarily have to be zipped.


Sysadmins would jump at this too, since it saves megabytes of disk space and
CPU time (no more on-the-fly compression every time a .gz is requested).

(BTW: Maybe something along these lines could be established as THE archive
standard??  Don't tweak me now - I might wake up ;^) )

But which format should the articles be chunked into?
.tar.gz is the Unix standard, but you'll never get the DOS/OS/2/NT world to
use it - OK, maybe the OS/2 world, but not the DOS / Windows / NT people.

I also think it would be more elegant and easier to administer if we used a
packed format with an internal directory and "add-to / extract-from"
functionality, like .zip files.

The responsiveness of .zip archives should be much better than
'gzip -d | tar -x', since individual members can be extracted without
decompressing the whole archive.

Also, with .zip archives the sysadmin could set up a script to ADD each
newly arrived article to the active chunk, switching chunks every 10 to 20
articles, and add its headers to the .overview (switched every 200
articles; the average article needs about 200 bytes in the .overview, so
that's about 40K per .overview, or 20K compressed).
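
Just to sketch the per-article step (the function name and the overview
line format below are made up, and the external zip(1) program with its -j
flag is assumed to be installed), it might look roughly like this in Emacs
Lisp:

(defun my-archive-add-article (article-file chunk-zip overview-file number)
  "Add ARTICLE-FILE to CHUNK-ZIP and append a line for article NUMBER
to OVERVIEW-FILE.  Switching to a fresh chunk every 10-20 articles and
to a fresh .overview every 200 is left to the caller."
  ;; zip -j stores the article without its leading directories.
  (call-process "zip" nil nil nil "-j" chunk-zip article-file)
  ;; A real script would pull Subject:, From:, Date: etc. out of the
  ;; article's headers; this only records the number and file name.
  (with-temp-buffer
    (insert (format "%d\t%s\n" number (file-name-nondirectory article-file)))
    (append-to-file (point-min) (point-max) overview-file)))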

The Gnus archive (a rather crowded example of a mailing list archive :-) )
would then consist of some 700 article chunks of 10 articles each (about
18K apiece) in some 35 directories (20 chunks each), plus the corresponding
.overview files (each covering 200 articles, about 20K compressed).
200 articles is roughly two weeks of traffic on the Gnus list.


First Approach: (dislike)
	Gnus could request the entire .zip file and let jka-compress or
	some other package do the rest of the dirty stuff, simply
	requesting an article from the .zip file.


#pragma dream on

Second Approach: (mucho like)
	One of the "gurus" mentioned a new dired-like package about half a
	year ago.
	Is there anything like a find-file extension where I can give a
	pathname like "~/oldstuff.zip/.emacs.el"?
	Has anyone seen a package that can do this?

	It would be way cool to simply specify:

	/ftp@ftp.hpc.uh.edu:/pub/emacs/ding-list-recent/articles7000-7009.zip/7001

	and have ange-ftp fetch the zip file into a buffer, jka-compress
	extract the article, and read it; then, when article 7002 is read,
	find-file already has the zip file locally and only calls
	jka-compress to extract the next article.


	If such a beast doesn't exist, how about creating it - most of the
	code is already there somewhere.  As a first estimate, one would
	have to make jka-compress, ange-ftp and dired-mode work together
	(integrate?) from find-file-not-found-hook (or extend load-file to
	support jka-compress, as it supports ange-ftp).
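
To make that concrete, here is a rough, local-only sketch (the function
names are made up, the external unzip(1) program is assumed, and the
ange-ftp part - fetching the remote .zip first - is left out entirely):

(defun my-zip-member-filename (name)
  "Split NAME like \"/dir/foo.zip/bar\" into (\"/dir/foo.zip\" . \"bar\").
Return nil if NAME has no .zip component."
  (if (string-match "\\(.*\\.zip\\)/\\(.+\\)\\'" name)
      (cons (match-string 1 name) (match-string 2 name))))

(defun my-zip-find-file-not-found ()
  "If the visited file name points inside a .zip archive, insert that member.
Meant to be run from find-file-not-found-hooks."
  (let* ((split (my-zip-member-filename buffer-file-name))
         (zip (car-safe split))
         (member (cdr-safe split)))
    (if (and zip (file-exists-p zip))
        (progn
          ;; unzip -p writes the member to stdout; insert it here.
          (call-process "unzip" nil t nil "-p"
                        (expand-file-name zip) member)
          (set-buffer-modified-p nil)
          t))))        ; non-nil tells find-file the file was handled

(add-hook 'find-file-not-found-hooks 'my-zip-find-file-not-found)

With something like that on the hook, find-file "~/oldstuff.zip/.emacs.el"
would at least get the member into a buffer; saving back into the zip would
need more work.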

#pragma dream off

I like the second one better, since all of emacs-country (all of Gnus!!!)
would benefit from such an extension, and it's less complicated for Gnus to
use: simply call find-file for article fetching and saving.  The argument
needs some fiddling with numbers, though.
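
For what it's worth, the fiddling is only integer division; with chunk
names like the (invented) articles7000-7009.zip above, it could be as
simple as:

(defun my-archive-article-path (dir n)
  "Return the find-file path of article N inside DIR's 10-article chunks.
The naming scheme is invented for illustration."
  (let ((lo (* 10 (/ n 10))))
    (format "%s/articles%d-%d.zip/%d" dir lo (+ lo 9) n)))

;; (my-archive-article-path
;;  "/ftp@ftp.hpc.uh.edu:/pub/emacs/ding-list-recent" 7001)
;; => ".../ding-list-recent/articles7000-7009.zip/7001"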

(one could dream of a compressed cache with one .zip file per newsgroup:
News/cache/alt.solar.thermal.zip/article*
News/cache/alt.solar.thermal.overview.gz)

How many man-years am I talking about? - is it impossible?


J> You can also get the stuff as a single tar file; just get the
J> (non-existant) file ding-list-recent.tar.gz or ding-list.tar.gz from
J> ftp.hpc.uh.edu:/pub/emacs.  The second will still be huge (~13MB).

7000 articles in ~13MB means the average article is about 1.9K (zipped!!).
That seems like rather a lot to me, but I based my calculations above on
this figure.

J> If it would be of any use to anyone, I may be persuaded to batch up monthly
J> chunks but I probably won't make them into digests unless someone provides
J> a quick script to do so.

One month seems to be roughly 500 articles (on the ding-list).

It should be easy to create an .overview from every 200 articles and to put
every 10 articles into a zip file of their own in a subdirectory, shouldn't it?
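
If it helps, the .overview side could start from something like this (a
stripped-down, invented version - real .overview lines carry quite a few
more fields than subject and sender):

(defun my-archive-overview-line (article-file number)
  "Return a crude overview line for ARTICLE-FILE, which is article NUMBER.
Only Subject: and From: are pulled out here."
  (with-temp-buffer
    ;; The interesting headers sit near the top; 4K is plenty.
    (insert-file-contents article-file nil 0 4096)
    (let ((subject (if (re-search-forward "^Subject: \\(.*\\)$" nil t)
                       (match-string 1)))
          (from (progn (goto-char (point-min))
                       (if (re-search-forward "^From: \\(.*\\)$" nil t)
                           (match-string 1)))))
      (format "%d\t%s\t%s\n" number (or subject "") (or from "")))))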

albrecht

