I keep a copy of the utzoo files. 

And then I hacked the altavista desktop search to search the files, using Apache to filter content inline.

https://altavista.superglobalmegacorp.com/altavista

I'd love to feed it more data. The utzoo stuff is massive for 1991, but it's really trivial for 2019: around 10GB decompressed.



From: TUHS <tuhs-bounces@minnie.tuhs.org> on behalf of Larry McVoy <lm@mcvoy.com>
Sent: Thursday, November 21, 2019, 11:53 AM
To: Bakul Shah
Cc: tuhs@tuhs.org
Subject: Re: [TUHS] Steve Bellovin recounts the history of USENET

On Wed, Nov 20, 2019 at 07:50:53PM -0800, Bakul Shah wrote:
> On Wed, 20 Nov 2019 19:14:23 -0800 Larry McVoy wrote:
> > Yeah, I'd be super happy if he joined the list. I enjoyed reading
> > those, wished he had gone into more detail.
> >
> > On the Usenet topic, does anyone remember dejanews? Searchable
> > archive of all the posts to Usenet. Google bought them and then,
> > so far as I know, the searchable part went away.
> >
> > If someone knows how to search back to the beginnings of Usenet,
> > my early tech life is all there, I'd love to be able to show my kids
> > that. Big arguing with Mash on comp.arch, following Guy Harris on
> > comp.unix-wizards, etc.
>
> I have occasionally downloaded some mbox.zip files from
> https://archive.org/details/usenet
> But there are too many files there. Would be nice if there
> was a collaborative effort to organize them in a more usable,
> searchable state. Pretty much all of it (minus binaries
> groups) can be stored locally (or using some global
> namespace.

So is that all of Usenet?
--
---
Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm