The Early History of Usenet, Part I: The Technological Setting:
https://www.cs.columbia.edu/~smb/blog/2019-11/2019-11-14a.html

The Early History of Usenet, Part II: Hardware and Economics:
https://www.cs.columbia.edu/~smb/blog/2019-11/2019-11-15.html

The Early History of Usenet, Part III: File Format:
https://www.cs.columbia.edu/~smb/blog/2019-11/2019-11-17.html

Fun reading. Bellovin is another person we should try to get
to join this list.

Enjoy,

Arnold
Yeah, I'd be super happy if he joined the list. I enjoyed reading
those, wished he had gone into more detail.

On the Usenet topic, does anyone remember dejanews? Searchable
archive of all the posts to Usenet. Google bought them and then,
so far as I know, the searchable part went away.

If someone knows how to search back to the beginnings of Usenet,
my early tech life is all there, I'd love to be able to show my kids
that. Big arguing with Mash on comp.arch, following Guy Harris on
comp.unix-wizards, etc.

On Tue, Nov 19, 2019 at 09:01:39PM +0200, Arnold Robbins wrote:
> The Early History of Usenet, Part I: The Technological Setting:
> https://www.cs.columbia.edu/~smb/blog/2019-11/2019-11-14a.html
>
> The Early History of Usenet, Part II: Hardware and Economics:
> https://www.cs.columbia.edu/~smb/blog/2019-11/2019-11-15.html
>
> The Early History of Usenet, Part III: File Format:
> https://www.cs.columbia.edu/~smb/blog/2019-11/2019-11-17.html
>
> Fun reading. Bellovin is another person we should try to get
> to join this list.
>
> Enjoy,
>
> Arnold

--
---
Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm
There is a big US bias in the archives of USENET. All I could find
preserved (before Google deleted it) was my updates to the maps for
York.ac.uk. In collecting history, the US erased most of Europe and
Asia basically. Our timelines are artificially compressed into the
modern era.
UCL gatewayed a lot of stuff into other news/forum spaces. So, our
view of the world was a disjoint set of UK news, USENET news, European
news, VMS news, BITNET lists. The world was an amazing place. Kuwait
camel breeders association operating online in teaching hospital email
lists in 1985.
On Thu, Nov 21, 2019 at 11:14 AM Larry McVoy <lm@mcvoy.com> wrote:
>
> Yeah, I'd be super happy if he joined the list. I enjoyed reading
> those, wished he had gone into more detail.
>
> On the Usenet topic, does anyone remember dejanews? Searchable
> archive of all the posts to Usenet. Google bought them and then,
> so far as I know, the searchable part went away.
>
> If someone knows how to search back to the beginnings of Usenet,
> my early tech life is all there, I'd love to be able to show my kids
> that. Big arguing with Mash on comp.arch, following Guy Harris on
> comp.unix-wizards, etc.
>
> On Tue, Nov 19, 2019 at 09:01:39PM +0200, Arnold Robbins wrote:
> > The Early History of Usenet, Part I: The Technological Setting:
> > https://www.cs.columbia.edu/~smb/blog/2019-11/2019-11-14a.html
> >
> > The Early History of Usenet, Part II: Hardware and Economics:
> > https://www.cs.columbia.edu/~smb/blog/2019-11/2019-11-15.html
> >
> > The Early History of Usenet, Part III: File Format:
> > https://www.cs.columbia.edu/~smb/blog/2019-11/2019-11-17.html
> >
> > Fun reading. Bellovin is another person we should try to get
> > to join this list.
> >
> > Enjoy,
> >
> > Arnold
>
> --
> ---
> Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm
On Thu, Nov 21, 2019 at 11:18:10AM +0800, George Michaelson wrote:
> there is a big US bias in the archives of USENET. All I could find
> preserved (before Google deleted it) was my updates to the maps for
> York.ac.uk. In collecting history, the US erased most of Europe and
> Asia basically. Our timelines are artificially compressed into the
> modern era.
Can you explain this a bit? When I was on Usenet, 1980-1990 or so, it
was very small, my guess is maybe 10,000 people that posted, maybe less.
My memory is I could post a question to comp.arch or wherever, and I'd
wake up in the morning and someone from Australia or some other place
over the pond had an answer. It was usually a grad student or a prof
or someone really smart.
So is this an archive thing? Because in my memory, it was not a Usenet
thing, smart people from all over the world posted.
As an aside, I remember being on a canoe with my dad, a physics researcher
and prof, and trying to explain Usenet to him. I said something like
"it is so cool Dad, so many cool people, everyone should be on it". And
then AOL happened and it went to shit. If my thoughts helped that along
I am _so_ sorry. It was awesome when it was small.
This list is sort of like early Usenet, smart people, people who know the
history. Let's keep it small but Steve should be here.
--lm
On Wed, 20 Nov 2019, Larry McVoy wrote:

> On the Usenet topic, does anyone remember dejanews? Searchable
> archive of all the posts to Usenet. Google bought them and then,
> so far as I know, the searchable part went away.

See
https://groups.google.com/forum/#!original/net.sources/84CTdvAdlb0/gTkGJnbSXxEJ

Click the drop-down arrow in the search field.

You can also use keywords in the search form.
Here is another example

https://groups.google.com/forum/#!searchin/net.unix/dmr$20AND$20before$3A1985$2F01$2F01%7Csort:date/net.unix/9VegaP_SIyI/3GHz6bPEDgsJ
I'm finding source that I posted in 1986. And I'm finding that I was
a cocky jerk back in the day :)

Thanks for this Reed!

On Wed, Nov 20, 2019 at 09:34:54PM -0600, reed@reedmedia.net wrote:
> On Wed, 20 Nov 2019, Larry McVoy wrote:
>
> > On the Usenet topic, does anyone remember dejanews? Searchable
> > archive of all the posts to Usenet. Google bought them and then,
> > so far as I know, the searchable part went away.
>
> See
> https://groups.google.com/forum/#!original/net.sources/84CTdvAdlb0/gTkGJnbSXxEJ
>
> Click the drop-down arrow in the search field.
>
> You can also use keywords in the search form.
> Here is another example
>
> https://groups.google.com/forum/#!searchin/net.unix/dmr$20AND$20before$3A1985$2F01$2F01%7Csort:date/net.unix/9VegaP_SIyI/3GHz6bPEDgsJ

--
---
Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm
On 11/20/19 10:14 PM, Larry McVoy wrote:
> Yeah, I'd be super happy if he joined the list. I enjoyed reading
> those, wished he had gone into more detail.
>
> On the Usenet topic, does anyone remember dejanews? Searchable
> archive of all the posts to Usenet. Google bought them and then,
> so far as I know, the searchable part went away.
>
> If someone knows how to search back to the beginnings of Usenet,
> my early tech life is all there, I'd love to be able to show my kids
> that. Big arguing with Mash on comp.arch, following Guy Harris on
> comp.unix-wizards, etc.

https://groups.google.com/forum/#!forum/comp.arch

Then a combination of date-based filters and searching for messages
should get you closer. I've used it before, it's a huge pain in the
ass, but it's better than zero.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/
On Wed, Nov 20, 2019 at 10:40:48PM -0500, Chet Ramey wrote:
> On 11/20/19 10:14 PM, Larry McVoy wrote:
> > Yeah, I'd be super happy if he joined the list. I enjoyed reading
> > those, wished he had gone into more detail.
> >
> > On the Usenet topic, does anyone remember dejanews? Searchable
> > archive of all the posts to Usenet. Google bought them and then,
> > so far as I know, the searchable part went away.
> >
> > If someone knows how to search back to the beginnings of Usenet,
> > my early tech life is all there, I'd love to be able to show my kids
> > that. Big arguing with Mash on comp.arch, following Guy Harris on
> > comp.unix-wizards, etc.
>
> https://groups.google.com/forum/#!forum/comp.arch
>
> Then a combination of date-based filters and searching for messages
> should get you closer. I've used it before, it's a huge pain in the
> ass, but it's better than zero.
Yeah on the pain in the ass, is there a way to search all of Usenet or
is it per group only?
On 11/20/19 10:42 PM, Larry McVoy wrote:
>>> If someone knows how to search back to the beginnings of Usenet,
>>> my early tech life is all there, I'd love to be able to show my kids
>>> that. Big arguing with Mash on comp.arch, following Guy Harris on
>>> comp.unix-wizards, etc.
>>
>> https://groups.google.com/forum/#!forum/comp.arch
>>
>> Then a combination of date-based filters and searching for messages
>> should get you closer. I've used it before, it's a huge pain in the
>> ass, but it's better than zero.
>
> Yeah on the pain in the ass, is there a way to search all of Usenet or
> is it per group only?

From the groups interface, it seems to be group-at-a-time.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/
On Wed, 20 Nov 2019 19:14:23 -0800 Larry McVoy <lm@mcvoy.com> wrote:
> Yeah, I'd be super happy if he joined the list. I enjoyed reading
> those, wished he had gone into more detail.
>
> On the Usenet topic, does anyone remember dejanews? Searchable
> archive of all the posts to Usenet. Google bought them and then,
> so far as I know, the searchable part went away.
>
> If someone knows how to search back to the beginnings of Usenet,
> my early tech life is all there, I'd love to be able to show my kids
> that. Big arguing with Mash on comp.arch, following Guy Harris on
> comp.unix-wizards, etc.

I have occasionally downloaded some mbox.zip files from
https://archive.org/details/usenet
But there are too many files there. Would be nice if there
was a collaborative effort to organize them in a more usable,
searchable state. Pretty much all of it (minus binaries
groups) can be stored locally (or using some global
namespace).
In reading my old posts, I found this as my .signature in 1986, I know
people will argue with it but I still agree, I get that there is Rust
and Go and whatever. The programmers that I hang with still like C.

"If you are undertaking anything substantial, C is the only reasonable
choice of programming language" -- Brian W. Kernighan
On Wed, Nov 20, 2019 at 07:50:53PM -0800, Bakul Shah wrote:
> On Wed, 20 Nov 2019 19:14:23 -0800 Larry McVoy <lm@mcvoy.com> wrote:
> > Yeah, I'd be super happy if he joined the list. I enjoyed reading
> > those, wished he had gone into more detail.
> >
> > On the Usenet topic, does anyone remember dejanews? Searchable
> > archive of all the posts to Usenet. Google bought them and then,
> > so far as I know, the searchable part went away.
> >
> > If someone knows how to search back to the beginnings of Usenet,
> > my early tech life is all there, I'd love to be able to show my kids
> > that. Big arguing with Mash on comp.arch, following Guy Harris on
> > comp.unix-wizards, etc.
>
> I have occasionally downloaded some mbox.zip files from
> https://archive.org/details/usenet
> But there are too many files there. Would be nice if there
> was a collaborative effort to organize them in a more usable,
> searchable state. Pretty much all of it (minus binaries
> groups) can be stored locally (or using some global
> namespace.

So is that all of Usenet?

--
---
Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm
On Wed, 20 Nov 2019, Larry McVoy wrote:
> In reading my old posts, I found this as my .signature in 1986, I know
> people will argue with it but I still agree, I get that there is Rust
> and Go and whatever. The programmers that I hang with still like C.
>
> "If you are undertaking anything substantial, C is the only reasonable
> choice of programming language" -- Brian W. Kernighan
It still is! XD
-uso.
On Wed, 20 Nov 2019 19:52:46 -0800 Larry McVoy <lm@mcvoy.com> wrote:
Larry McVoy writes:
> On Wed, Nov 20, 2019 at 07:50:53PM -0800, Bakul Shah wrote:
> > On Wed, 20 Nov 2019 19:14:23 -0800 Larry McVoy <lm@mcvoy.com> wrote:
> > > Yeah, I'd be super happy if he joined the list. I enjoyed reading
> > > those, wished he had gone into more detail.
> > >
> > > On the Usenet topic, does anyone remember dejanews? Searchable
> > > archive of all the posts to Usenet. Google bought them and then,
> > > so far as I know, the searchable part went away.
> > >
> > > If someone knows how to search back to the beginnings of Usenet,
> > > my early tech life is all there, I'd love to be able to show my kids
> > > that. Big arguing with Mash on comp.arch, following Guy Harris on
> > > comp.unix-wizards, etc.
> >
> > I have occasionally downloaded some mbox.zip files from
> > https://archive.org/details/usenet
> > But there are too many files there. Would be nice if there
> > was a collaborative effort to organize them in a more usable,
> > searchable state. Pretty much all of it (minus binaries
> > groups) can be stored locally (or using some global
> > namespace.
>
> So is that all of Usenet?
Probably not. Too many files to check but I think most or all
of dejanews stuff is there.
uk.* and eu.* seem to be unfindable. Stuff from before the great
USENET re-org is mostly unfindable. Cross-posting to lists, only
partially visible. I've tried to find my own rants, it's like I was
born into the world in 1996. What happened to 1982-onward? It just ..
evaporated.
If things have improved, I'd be happy to wish it true, but what I
recall is the archives were bootstrapped from tapes held by people who
felt it was the best they could do, in a time where people didn't
really keep ephemera, and alas, the stuff wasn't all, it was the view
of all which some people saw.
I think uk.* never left the island. Maybe this is one of those
definitions things: we used the A and B news protocol, we used UUCP,
we were on USENET, but if we weren't in the backbone cabal, it's like
we didn't exist.
People love to talk about bang-path addressing (me too, and VMS a::b::c)
but HoneyDanBer was the shizzle. They made the world so much
simpler, by doing the obvious flattening of the pathspace into
namespace, with path dealt with elsewhere. Moving to a@somewhere was
god-given help to morons. Bang paths sucked.
(It would not surprise me for a hk.* and jp.* and su.* and the like to
say: "brother, you have no idea")
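
As a rough illustration of the flattening described above, the sketch
below reduces a UUCP-style bang path (the a!b!c!user form) to the
a@somewhere form by keeping only the final host and the user and
discarding the route. The function name and the sample path are made up
for illustration, and mailers of the era did a good deal more than this
(route lookup happened elsewhere, as the message says); it only shows
the shape of the idea.

```python
def flatten_bang_path(path: str) -> str:
    """Reduce a UUCP-style bang path to user@host, dropping the route."""
    hops = path.split("!")
    # A usable path needs at least one host plus a user, with no empty parts.
    if len(hops) < 2 or not all(hops):
        raise ValueError(f"not a bang path: {path!r}")
    # The last component is the user; the one before it is the final host.
    host, user = hops[-2], hops[-1]
    return f"{user}@{host}"


if __name__ == "__main__":
    # Hypothetical route through three relays to user "ggm" at host "york".
    print(flatten_bang_path("relay1!relay2!relay3!york!ggm"))  # ggm@york
```

The route (relay1!relay2!relay3) is simply thrown away here; in a real
setup something else would have to know how to reach the final host.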
On Thu, Nov 21, 2019 at 11:28 AM Larry McVoy <lm@mcvoy.com> wrote:
>
> On Thu, Nov 21, 2019 at 11:18:10AM +0800, George Michaelson wrote:
> > there is a big US bias in the archives of USENET. All I could find
> > preserved (before Google deleted it) was my updates to the maps for
> > York.ac.uk. In collecting history, the US erased most of Europe and
> > Asia basically. Our timelines are artificially compressed into the
> > modern era.
>
> Can you explain this a bit? When I was on Usenet, 1980-1990 or so, it
> was very small, my guess is maybe 10,000 people that posted, maybe less.
> My memory is I could post a question to comp.arch or wherever, and I'd
> wake up in the morning and someone from Australia or some other place
> over the pond had an answer. It was usually a grad student or a prof
> or someone really smart.
>
> So is this an archive thing? Because in my memory, it was not a Usenet
> thing, smart people from all over the world posted.
>
> As an aside, I remember being on a canoe with my dad, a physics researcher
> and prof, and trying to explain Usenet to him. I said something like
> "it is so cool Dad, so many cool people, everyone should be on it". And
> then AOL happened and it went to shit. If my thoughts helped that along
> I am _so_ sorry. It was awesome when it was small.
>
> This list is sort of like early Usenet, smart people, people who know the
> history. Let's keep it small but Steve should be here.
>
> --lm
On Thu, 21 Nov 2019 16:56:13 +0800 George Michaelson <ggm@algebras.org> wrote:
> uk.* and eu.* seem to be unfindable. Stuff from before the great
> USENET re-org is mostly unfindable. cross posting to lists, only
> partially visible. I've tried to find my own rants, it's like I was
> born into the world in 1996. What happened to 1982-onward? It just ..
> evaporated.
Check out the archive.org link I provided earlier. I found a
couple of posts from you in net.lang.c some time in 1984.
I shudder to think how naive, arrogant or both they are. But, thank
you for your detective work. "I exist, I am not a number" is a glorious
feeling.
What this says, is that where our posts into the world crossed outside
closed (national) namespaces in UUCP backed services, they did get
archived as much as any other did. But I think my other strand remains
true. The body of posts I and others made into uk.* is probably now
lost forever. Ephemeral data preservation is chancy. A future
digital archeologist will be looking at these bits, inferring stuff
which in some sense is true (the US was the centre of much discussion
in this space) and not true (the absence of data states about other
places is not strongly indicative of their richness and intensity,
because they were not preserved)
-G
On Thu, Nov 21, 2019 at 5:41 PM Bakul Shah <bakul@bitblocks.com> wrote:
>
> On Thu, 21 Nov 2019 16:56:13 +0800 George Michaelson <ggm@algebras.org> wrote:
> > uk.* and eu.* seem to be unfindable. Stuff from before the great
> > USENET re-org is mostly unfindable. cross posting to lists, only
> > partially visible. I've tried to find my own rants, it's like I was
> > born into the world in 1996. What happened to 1982-onward? It just ..
> > evaporated.
>
> Check out the archive.org link I provided earlier. I found a
> couple of posts from you in net.lang.c some time in 1984.
On 21 Nov 2019, at 09:56, George Michaelson <ggm@algebras.org> wrote:
> (It would not surprise me for a hk.* and jp.* and su.* and the like to
> say: "brother, you have no idea")
And it.* and many other “local” groups from unis but also companies, ;)
What about clari.net? Anyone remember them? They had groups for “real news”, financial info, etc. on a subscription basis.
Arrigo
I keep a copy of the utzoo files.
And then I hacked the altavista desktop search to search the files, using
Apache to filter content inline.

https://altavista.superglobalmegacorp.com/altavista

I know I'd love to feed it more data, the utzoo stuff is massive for
1991, but it's really trivial for 2019. It's around 10GB decompressed.

From: TUHS <tuhs-bounces@minnie.tuhs.org> on behalf of Larry McVoy <lm@mcvoy.com>
Sent: Thursday, November 21, 2019, 11:53 AM
To: Bakul Shah
Cc: tuhs@tuhs.org
Subject: Re: [TUHS] Steve Bellovin recounts the history of USENET

On Wed, Nov 20, 2019 at 07:50:53PM -0800, Bakul Shah wrote:
> On Wed, 20 Nov 2019 19:14:23 -0800 Larry McVoy wrote:
> > Yeah, I'd be super happy if he joined the list. I enjoyed reading
> > those, wished he had gone into more detail.
> >
> > On the Usenet topic, does anyone remember dejanews? Searchable
> > archive of all the posts to Usenet. Google bought them and then,
> > so far as I know, the searchable part went away.
> >
> > If someone knows how to search back to the beginnings of Usenet,
> > my early tech life is all there, I'd love to be able to show my kids
> > that. Big arguing with Mash on comp.arch, following Guy Harris on
> > comp.unix-wizards, etc.
>
> I have occasionally downloaded some mbox.zip files from
> https://archive.org/details/usenet
> But there are too many files there. Would be nice if there
> was a collaborative effort to organize them in a more usable,
> searchable state. Pretty much all of it (minus binaries
> groups) can be stored locally (or using some global
> namespace.

So is that all of Usenet?

--
---
Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm
Jason Stevens <jsteve@superglobalmegacorp.com> wrote:
> I keep a copy of the utzoo files.
Any chance of getting them to Warren for storage? Or are they
generally available somewhere?
Thanks,
Arnold
Oh sure I forgot to add them to my reply

https://utzoo.superglobalmegacorp.com

I keep a backup of them there. The original site is gone. I think
archive.org should have it too as utzoo.

They are a must have for any amateur historian. It was so awesome
pulling out the 4.2bsd nic driver for simh. And an incredible resource
for seeing how the ancients dealt with things.

On Thu, Nov 21, 2019 at 9:23 PM +0800, <arnold@skeeve.com> wrote:

Jason Stevens wrote:
> I keep a copy of the utzoo files.

Any chance of getting them to Warren for storage? Or are they
generally available somewhere?

Thanks,

Arnold
arnold@skeeve.com writes:

> Jason Stevens <jsteve@superglobalmegacorp.com> wrote:
>
>> I keep a copy of the utzoo files.
>
> Any chance of getting them to Warren for storage? Or are they
> generally available somewhere?

They are also on archive.org:
https://archive.org/details/utzoo-wiseman-usenet-archive

--
Leah Neukirchen <leah@vuxu.org> https://leahneukirchen.org/
On Thu, Nov 21, 2019 at 12:46:49PM +0000, Jason Stevens wrote:
> I keep a copy of the utzoo files.
> And then I hacked the altavista desktop search the files using
> Apache to filter content inline.
> https://altavista.superglobalmegacorp.com/altavista
Nice stuff. Works with the dillo browser. Not so much with emacs-w3,
because it cannot guess the type of data delivered by your
server - I am asked a question, give it a hint "text/html" in response
and then your page loads without any further problem. After the initial
page is loaded, search went ok, no problem. HTH.
I actually used Altavista in the '90s. Compared to today, it feels so
modern.
Like Button!
--
Regards,
Tomasz Rola
--
** A C programmer asked whether computer had Buddha's nature. **
** As the answer, master did "rm -rif" on the programmer's home **
** directory. And then the C programmer became enlightened... **
** **
** Tomasz Rola mailto:tomasz_rola@bigfoot.com **
On Wed, 20 Nov 2019, Larry McVoy wrote:
> If someone knows how to search back to the beginnings of Usenet, my
> early tech life is all there, I'd love to be able to show my kids that.
> Big arguing with Mash on comp.arch, following Guy Harris on
> comp.unix-wizards, etc.
I think I'd be embarrassed over some of my early posts... And yeah,
Guy Harris was brilliant, and always helpful.
-- Dave
On Thu, Nov 21, 2019 at 04:58:01PM +0100, Leah Neukirchen wrote:
>
> arnold@skeeve.com writes:
>
> > Jason Stevens <jsteve@superglobalmegacorp.com> wrote:
> >
> >> I keep a copy of the utzoo files.
> >
> > Any chance of getting them to Warren for storage? Or are they
> > generally available somewhere?
>
> They are also on archive.org:
> https://archive.org/details/utzoo-wiseman-usenet-archive
>
> --
> Leah Neukirchen <leah@vuxu.org> https://leahneukirchen.org/
I'm half tempted to take the archive.org Usenet files and throw them
into Elasticsearch and create a web front end for searching. Storage
would be expensive, but search would rock!
Justin
On Fri, 22 Nov 2019 at 15:19, Justin R. Andrusk <jra@andrusk.com> wrote:

> On Thu, Nov 21, 2019 at 04:58:01PM +0100, Leah Neukirchen wrote:
> >
> > arnold@skeeve.com writes:
> >
> > > Jason Stevens <jsteve@superglobalmegacorp.com> wrote:
> > >
> > >> I keep a copy of the utzoo files.
> > >
> > > Any chance of getting them to Warren for storage? Or are they
> > > generally available somewhere?
> >
> > They are also on archive.org:
> > https://archive.org/details/utzoo-wiseman-usenet-archive
> >
> > --
> > Leah Neukirchen <leah@vuxu.org> https://leahneukirchen.org/
>
> I'm half tempted to take the archive.org Usenet files and throw them
> into Elasticsearch and create a web front end for searching. Storage
> would be expensive, but search would rock!
>

Has anyone definitely proven that any of the contents of these files are
not in the searchable Google Groups interface? I don't really see any need
to duplicate their efforts.

I am 100% certain that Google got Deja News's entire archive and 99%
certain that it was fairly quickly supplemented with the University of
Toronto material provided by Henry Spencer. Certainly the headers in a
thread like this would seem to indicate that the material all came from
utzoo:
https://groups.google.com/forum/#!msg/net.unix-wizards/krbEHGQ95_o/QaV2LNSeMlgJ
(see "show original" for any message in the dropdown box in the upper
right hand corner by the date).

While Google has not shown a tremendous deal of interest in Groups over
the years - notably, the search was very lacking/incomplete at various
points - I would think that there is now enough acknowledgement of the
historical importance of these messages that Google would at the very
least do their best to preserve what they have. I would also imagine that
if someone else had approached them with a substantial enough private
archive that they would have accepted it, and not necessarily done a huge
press release depending on the time frame, but that's pure supposition on
my part.

It would be fascinating to look through messages from before 1995 (when
Deja News started archiving) to see if any clues can be unearthed about
message sources other than utzoo.

As somewhat of an aside, my father was the head sysadmin at Deja News at
the time of their purchase by Google and I may have recounted this story
before but it's worth sharing again. Google's entire purchase of Deja News
involved a couple of Google engineers flying to Austin with a large disk
array, letting it mirror over a weekend, and then flying back to
California. Google did not, as far as I recall, take possession of any
physical assets whatsoever.

-Henry
On Fri, Nov 22, 2019 at 03:49:58PM -0500, Henry Bent wrote:
>
> Has anyone definitely proven that any of the contents of these files are
> not in the searchable Google Groups interface? I don't really see any need
> to duplicate their efforts.
That data is essentially unavailable to people without Google accounts.
Even back when it was, the search had degraded to the point where you
could paste explicit quotes from messages and those messages would not
be in the results page.
I wholeheartedly see a need to duplicate (and surpass) their efforts. The
Deja News service was wonderful; Google's implementation is not. Even if
someone were to convince them to improve it, they've demonstrated they're
not a good company to rely on for long-term availability of services
that are not active surveillance vehicles.
khm
On Fri, Nov 22, 2019 at 01:06:45PM -0800, Kurt H Maier wrote:
> On Fri, Nov 22, 2019 at 03:49:58PM -0500, Henry Bent wrote:
> >
> > Has anyone definitely proven that any of the contents of these files are
> > not in the searchable Google Groups interface? I don't really see any need
> > to duplicate their efforts.
>
> That data is essentially unavailable to people without Google accounts.
> Even back when it was, the search had degraded to the point where you
> could paste explicit quotes from messages and those messages would not
> be in the results page.
>
> I wholeheartedly see a need to duplicate (and surpass) their efforts. The
> Deja News service was wonderful; Google's implementation is not. Even if
> someone were to convince them to improve it, they've demonstrated they're
> not a good company to rely on for long-term availability of services
> that are not active surveillance vehicles.
Amen.
On Fri, 22 Nov 2019 at 16:06, Kurt H Maier <khm@sciops.net> wrote:

> On Fri, Nov 22, 2019 at 03:49:58PM -0500, Henry Bent wrote:
> >
> > Has anyone definitely proven that any of the contents of these files are
> > not in the searchable Google Groups interface? I don't really see any need
> > to duplicate their efforts.
>
> That data is essentially unavailable to people without Google accounts.
>

Why do you say that? I used a browser that was fully logged out of Google
and in paranoid/private settings mode and I could browse newsgroups, do
basic searching, etc. eg:
https://groups.google.com/forum/#!searchin/net.unix-wizards/iris%7Csort:date

> Even back when it was, the search had degraded to the point where you
> could paste explicit quotes from messages and those messages would not
> be in the results page.
>

I alluded to this sort of problem in my previous email, but in my recent
experience the search results have been satisfactory and I consider the
degraded search problem resolved. I have been able to search for very
precise text strings with entirely satisfactory results over the last few
months.

-Henry
On 11/22/2019 3:18 PM, Justin R. Andrusk wrote:
> I'm half tempted to take the archive.org Usenet files and throw them
> into Elasticsearch and create a web front end for searching. Storage
> would be expensive, but search would rock!
Can we run multiple nodes of Elastic, and replicate between each other?
I just recently started playing with it, it's quite impressive. Except
for that one logstash file "read" mode that by default deletes the file
once it's done with it (a 4-year-long access.log that I wanted to read in).
anyway.
art k.
On Fri, Nov 22, 2019 at 05:21:20PM -0500, Henry Bent wrote:
>
> Why do you say that? I used a browser that was fully logged out of Google
> and in paranoid/private settings mode and I could browse newsgroups, do
> basic searching, etc. eg:
> https://groups.google.com/forum/#!searchin/net.unix-wizards/iris%7Csort:date
Clicking on any entry in those search results prompts me to log in.
Suffice it to say it is not satisfactory; TUHS is not the place to debug
Google's software for them, so I'll drop the matter.
I am willing to help any effort to make this data available in less
painful formats or protocols. Feel free to reach out, anyone.
khm
On Fri, Nov 22, 2019 at 06:21:49PM -0500, Arthur Krewat wrote:
>
> On 11/22/2019 3:18 PM, Justin R. Andrusk wrote:
> > I'm half tempted to take the archive.org Usenet files and throw them
> > into Elasticsearch and create a web front end for searching. Storage
> > would be expensive, but search would rock!
>
> Can we run multiple nodes of Elastic, and replicate between each other?
>
> I just recently started playing with it, it's quite impressive. Except
> for that one logstash file "read" mode that by default deletes the file
> once it's done with it (a 4-year-long access.log that I wanted to read in).
>
> anyway.
>
> art k.
Yes, that's how the clustering works with Elasticsearch. You set up
multiple nodes that are part of a cluster and data is replicated across
all of them. If one goes down, you don't lose any data as the others
will reconstitute the data.
Going to look at adding the Usenet data to a Graylog instance as that
uses Elasticsearch as a backend and the front end UI is already there to
give you a GUI for searching and doing analytics on what you send to it.
Justin
On Fri, Nov 22, 2019 at 04:00:34PM -0800, Kurt H Maier wrote:
> On Fri, Nov 22, 2019 at 05:21:20PM -0500, Henry Bent wrote:
> >
> > Why do you say that? I used a browser that was fully logged out of Google
> > and in paranoid/private settings mode and I could browse newsgroups, do
> > basic searching, etc. eg:
> > https://groups.google.com/forum/#!searchin/net.unix-wizards/iris%7Csort:date
>
> Clicking on any entry in those search results prompts me to log in.
> Suffice it to say it is not satisfactory; TUHS is not the place to debug
> Google's software for them, so I'll drop the matter.
All I can say is that's not my experience. I just dropped the above
link into an incognito browser window (so no cookies, logins, etc.),
and I was able to click on any of those links and read them.
Cheers,
- Ted
I for one believe in duplication, and not relying on a single source.
All the artifacts survive today because they were scattered to the winds
and found again.

Plus when building a database having 10gb of human entered data is
invaluable.

I should add that the first public listing of Hack on Google is missing
the last part. But it's in the utzoo archive.

https://nethackwiki.com/wiki/Hack_1.0

So the google stuff is incomplete.

Besides it's fun to re-read the world when the rumours of Spock's
imminent death in the next movie circulated, and fans petitioned to save
him, or even the fallout of Tiananmen Square, and how it parallels in
reddit.

I should also add now that Intel is purging all their old drivers and
documents online, even a company with a vested interest in their own past
doesn't care. Google is sunsetting their cloud printing after being up
for a decade. It's only a matter of time before they find past free
speech inconvenient and problematic and terminate groups.

TLDR is that data needs to be shared, not made inaccessible by one
company, and the Google usenet thing is incomplete.

On Sat, Nov 23, 2019 at 5:33 AM +0800, "Larry McVoy" <lm@mcvoy.com> wrote:

On Fri, Nov 22, 2019 at 01:06:45PM -0800, Kurt H Maier wrote:
> On Fri, Nov 22, 2019 at 03:49:58PM -0500, Henry Bent wrote:
> >
> > Has anyone definitely proven that any of the contents of these files are
> > not in the searchable Google Groups interface? I don't really see any need
> > to duplicate their efforts.
>
> That data is essentially unavailable to people without Google accounts.
> Even back when it was, the search had degraded to the point where you
> could paste explicit quotes from messages and those messages would not
> be in the results page.
>
> I wholeheartedly see a need to duplicate (and surpass) their efforts. The
> Deja News service was wonderful; Google's implementation is not. Even if
> someone were to convince them to improve it, they've demonstrated they're
> not a good company to rely on for long-term availability of services
> that are not active surveillance vehicles.

Amen.
On Sat, Nov 23, 2019 at 01:48:07AM +0000, Jason Stevens wrote:
> Plus when building a database having 10gb of human entered data is invaluable.
I agree even though it is self serving. I'm the guy that posted something
on soc.singles and disappeared off usenet for three months (was installing
Unix on a supercomputer at the Tokyo Institute of Technology, no UUCP,
no Usenet).
I came back and people were *still* arguing about what I said. Huh.
I don't really care for me at this point, but I'd love for my kids
to learn about me through all those posts. soc.singles was a
distraction, comp.arch, comp.unix-wizards, there is a pretty big
window into who I am in those posts. I've been reading 30-35 year
old posts I made, and while I'm ashamed of how cocky I was, there
was some substance there.
So I'd love an interface like dejanews had. You could limit to a
set of groups (I don't remember how you did that but it was a thing)
and you could limit it over a date range, and of course you could
search by string. I think there were more ways to tailor the search.
If I can offer up some help getting this back, let me know.
--lm
All, netnews is a bit off-topic for TUHS so perhaps -> COFF?

Thanks, Warren
On 11/22/2019 8:32 PM, Justin R. Andrusk wrote:
>
> Yes, that's how the clustering works with Elasticsearch. You set up
> multiple nodes that are part of a cluster and data is replicated across
> all of them. If one goes down, you don't lose any data as the others
> will reconstitute the data.
>
Yes, I know, I was legitimately asking ;)
art .k
On 2019-11-20 21:14, Larry McVoy wrote:
> Yeah, I'd be super happy if he joined the list. I enjoyed reading
> those, wished he had gone into more detail.
>
> On the Usenet topic, does anyone remember dejanews? Searchable
> archive of all the posts to Usenet. Google bought them and then,
> so far as I know, the searchable part went away.

Deja News was a customer of the data center I worked at back in
'97-'99, smartnap.net. My usenet server fed directly into theirs,
which made all of my other customers (and several other people on the
net) want to peer with me, since one of the ways some people judged
how "good" a usenet feed was was how quickly a post could be viewed on
dejanews.com.

They took up about 1/4 the space of our ~12k sq ft facility. My
current boss was one of the news admins there.

As I understood it, they had pretty much everything from when they
started, plus had donated tapes of older stuff that they would
periodically load up and add to the online bits.

By the time Google bought them, they'd dropped the 'News' part of
their name and were focusing on being a product search engine under
the name "Deja." All Google was really interested in was the usenet
archives.

> If someone knows how to search back to the beginnings of Usenet,
> my early tech life is all there, I'd love to be able to show my kids
> that. Big arguing with Mash on comp.arch, following Guy Harris on
> comp.unix-wizards, etc.

Searching for an old username of mine on groups.google.com finds posts
I made in 1993, when I first started using usenet in college.

--
Michael Parson
Pflugerville, TX
KF5LGQ