To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Date: Thu, 30 Aug 2012 10:02:41 -0700
From: Bakul Shah
Subject: Re: [9fans] rc's shortcomings (new subject line)

On Thu, 30 Aug 2012 15:35:47 +0530 Dan Cross wrote:
> On Wed, Aug 29, 2012 at 7:27 PM, erik quanstrom wrote:
> >> > rc already has non-linear pipelines. but they're not very convenient.
> >>
> >> And somewhat limited. There's no real concept of 'fanout' of output,
> >> for instance (though that's a fairly trivial command, so probably
> >> doesn't count), or multiplexing input from various sources that would
> >> be needed to implement something like a shell-level data flow network.
> >>
> >> Muxing input from multiple sources is hard when the data isn't somehow
> >> self-delimited.
> >> [...]
> >> There may be other ways to achieve the same thing; I remember that the
> >> boundaries of individual writes used to be preserved on read, but I
> >> think that behavior changed somewhere along the way; maybe with the
> >> move away from streams? Or perhaps I'm misremembering?
> >
> > pipes still preserve write boundaries, as does il (even the 0-byte
> > write). but tcp of course by definition does not. but either way, the
> > protocol would need to be self-framed to be transported on tcp. and
> > even then, there are protocols that are essentially serial, like tls.
>
> Right. I think this is the reason for Bakul's question about
> s-expressions or JSON or a similar format; those formats are
> inherently self-delimiting.

Indeed.

> The problem with that is that, for passing those things around to work
> without some kind of reverse 'tee'-like intermediary, the system has
> to understand that the things being transferred are s-expressions or
> JSON records or whatever, not just streams of uninterpreted bytes.
> We've steadfastly rejected such system-imposed structure on files in
> Unix-y type environments since 1969.

I think it is time to try something new. A lot of things don't fit
into this model of byte pipes connecting processes. Even in this
model, a lot of commands use line "objects". But the kind of
composability one gets in Scheme, Go, functional languages, etc. is
missing. Even Go seems to follow the "one language for all" Lispy
model -- its typed channels work only within a single address space.
[Actually I would have much preferred it if they had just focused on
adding channels and parallel processes to a Scheme instead of
creating a whole new language, but that is a whole 'nother
discussion!]

> But conceptually, these IPC mechanisms are sort of similar to channels
> in CSP-style languages. A natural question then becomes, how do
> CSP-style languages handle the issue? Channels work around the muxing
> thing by being typed; elements placed onto a channel are indivisible
> objects of that type, so one doesn't need to worry about interference
> from other objects simultaneously placed onto the same channel in
> other threads of execution. Could we do something similar with pipes?
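To make that concrete, here is a minimal Go sketch (illustrative
only, not from the thread): several goroutines send on one shared
channel, and each receive yields a whole record, with no framing
needed in user code:

    package main

    import (
            "fmt"
            "sync"
    )

    // record is an arbitrary example type; a channel of records
    // delivers each one indivisibly, however many senders there are.
    type record struct {
            src  int
            body string
    }

    func main() {
            ch := make(chan record)
            var wg sync.WaitGroup

            // Three concurrent producers share one channel.
            for i := 0; i < 3; i++ {
                    wg.Add(1)
                    go func(id int) {
                            defer wg.Done()
                            ch <- record{src: id, body: "hello"}
                    }(i)
            }
            go func() { wg.Wait(); close(ch) }()

            // The consumer always sees whole records; nothing
            // interleaves, unlike bytes from multiple writers on a pipe.
            for r := range ch {
                    fmt.Printf("from %d: %s\n", r.src, r.body)
            }
    }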
> I don't know that anyone wants typed file descriptors; that would
> open a whole new can of worms.

I am suggesting channels of self-typed objects. The idea of
communicating self-identifying objects between loosely coupled
processes is blindingly obvious to me. And as long as you have one
producer/one consumer pair, there is no problem implementing this on
any system. But the Unix model of "inheriting" file descriptors (or
even passing them around to unrelated processes) and then letting
them all blab indiscriminately on the same channel just doesn't work.
So again Unix gets in the way.

> This sounds wonderful, of course, but in Lisp and Scheme, lists are
> built from cons cells, and even if I have some magic
> 'split-into-halves' function that satisfies the requirements of
> reduce, doing so is still necessarily linear, so I don't gain much.
> Besides, having to pass around the identity all the time is a bummer.

You can use cdr-coding, or just use arrays as APL/J/k do. But
knowledge of function properties is essential to implement them
efficiently, and to support the one-argument version (without an
initial value). In k, "/" is the reduce operator, so you can say

    +/!3      // !3 == 0 1 2
    0+/!3     // two-arg version of +/ (with initial value 0)

And */!0 and +/!0 do the right thing because the language knows the
identity elements for * and +.

> But in clojure, the Lisp concept of a list (composed of cons cells) is
> generalized into the concept of a 'seq'. A seq is just a sequence of
> things; it could be a list, a vector, some other container (say, a
> sequence of key/value pairs derived from some kind of associative
> structure), or a stream of data being read from a file or network
> connection.
>
> What's the *real* problem here? The issue is that reduce "knows" too
> much about the things it is reducing over. Doing things sequentially
> is easy, but slow; doing things in parallel requires that reduce know
> a lot about the type of thing it's reducing over (e.g., this magic
> 'split-into-halves' function). Further, that might not be appropriate
> for *all* sequence types; e.g., files or lists made from cons cells.

Even more common than reduce is map. There is no reason why you can't
parallelize

    8c *.c

> Your example of running multiple 'grep's in parallel sort of reminded
> me of this, though it occurs to me that this can probably be done with
> a command: a sort of 'parallel apply' thing that can run a command
> multiple times concurrently, each invocation on a range of the
> arguments. But making it simple and elegant is likely to be tricky.

This is a map, not a reduce. The "reduce" would be implicitly (and
often wrongly) done by fd inheritance.

I have a toy array-Scheme where all functions except array functions
are applied elementwise when given arrays:

    (+ #(1 2 3) #(4 5 6))    => #(5 7 9)
    (* 2 #(4 5 6))           => #(8 10 12)
    (* 2 #(4 #(5 5) 6))      => #(8 #(10 10) 12)

So I think map can be made implicit, but reduction has to be
explicit. This is what APLs do.

APL/k have peach for "parallel each" ("each" is their way of saying
map); a sketch of the same idea in Go appears below.
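Here is a rough Go sketch of such a parallel each (the helper and its
name are illustrative, not an existing library function): apply a
function to every element concurrently and collect the results in
input order:

    package main

    import (
            "fmt"
            "sync"
    )

    // peach applies f to each element of xs concurrently ("parallel
    // each"), preserving input order in the result. Needs Go 1.18+
    // for generics.
    func peach[T, U any](f func(T) U, xs []T) []U {
            out := make([]U, len(xs))
            var wg sync.WaitGroup
            for i, x := range xs {
                    wg.Add(1)
                    go func(i int, x T) {
                            defer wg.Done()
                            out[i] = f(x) // one goroutine per element
                    }(i, x)
            }
            wg.Wait()
            return out
    }

    func main() {
            // Squares of 1..4, computed in parallel: [1 4 9 16]
            fmt.Println(peach(func(n int) int { return n * n },
                    []int{1, 2, 3, 4}))
    }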