It's useful internally in protocol implementation, specifically to avoid copying in transport protocols (for later retransmission), and the modifications aren't vast.
A few changes were trickier, often because of small bugs in the original code. ICMP does some odd things, I think.

Btw, "zero copy" isn't the right term and I preferred another term that I've now forgotten. Minimal copying, perhaps.
For one thing, messages can eventually end up being copied to contiguous blocks for devices without decent scatter-gather DMA.

A message is a tuple: (mutable header stack, immutable slices of immutable data).
Originally the data was organised as a tree, but nemo suggested using just an array, so I changed it.
It's important that the data is (logically) immutable. Headers are pushed onto and popped from the header stack, and only the current stack top is mutable.
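
To make the shape concrete, here is a minimal sketch in C of a message of that form. It is not the code from the kernel work described above: every name in it (Data, Slice, Msg, pushhdr, pophdr, msgflatten and so on) and the fixed sizes are assumptions made up for illustration. It shows a header stack that grows downward, a flat array of slices that share reference-counted data, and the fallback copy into a contiguous buffer for a device without scatter-gather DMA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { HDRMAX = 128, NSLICE = 8 };	/* hypothetical limits */

/* Shared payload: the bytes are never modified after newdata returns. */
typedef struct Data Data;
struct Data {
	int ref;		/* reference count (not atomic in this sketch) */
	size_t len;
	unsigned char buf[];
};

/* Immutable view of part of a Data. */
typedef struct Slice Slice;
struct Slice {
	Data *d;
	size_t off, len;
};

typedef struct Msg Msg;
struct Msg {
	unsigned char hdr[HDRMAX];	/* header stack, grows downward */
	size_t hp;			/* index of stack top; HDRMAX when empty */
	Slice s[NSLICE];		/* flat array of slices, not a tree */
	int ns;
};

/* Copy n bytes once into a new shared payload; the caller holds one reference. */
Data *newdata(const void *p, size_t n)
{
	Data *d = malloc(sizeof *d + n);
	if(d == NULL)
		return NULL;
	d->ref = 1;
	d->len = n;
	memcpy(d->buf, p, n);
	return d;
}

void msginit(Msg *m)
{
	m->hp = HDRMAX;
	m->ns = 0;
}

/* Append a view of d[off, off+len): shares the bytes and takes a reference. */
int msgadd(Msg *m, Data *d, size_t off, size_t len)
{
	if(m->ns >= NSLICE || off > d->len || len > d->len - off)
		return -1;
	d->ref++;
	m->s[m->ns++] = (Slice){d, off, len};
	return 0;
}

/* Push an n-byte header; returns a writable pointer to the new stack top. */
unsigned char *pushhdr(Msg *m, size_t n)
{
	if(n > m->hp)
		return NULL;
	m->hp -= n;
	return m->hdr + m->hp;
}

/* Pop n bytes of header from the stack top. */
int pophdr(Msg *m, size_t n)
{
	if(n > HDRMAX - m->hp)
		return -1;
	m->hp += n;
	return 0;
}

size_t msglen(const Msg *m)
{
	size_t n = HDRMAX - m->hp;
	for(int i = 0; i < m->ns; i++)
		n += m->s[i].len;
	return n;
}

/* Copy headers, then slices, into dst; returns bytes written, 0 if dst is too small. */
size_t msgflatten(const Msg *m, unsigned char *dst, size_t cap)
{
	size_t o = HDRMAX - m->hp;
	if(msglen(m) > cap)
		return 0;
	memcpy(dst, m->hdr + m->hp, o);
	for(int i = 0; i < m->ns; i++){
		memcpy(dst + o, m->s[i].d->buf + m->s[i].off, m->s[i].len);
		o += m->s[i].len;
	}
	return o;
}

/* Drop the message's references; a payload is freed when its last reference goes. */
void msgfree(Msg *m)
{
	for(int i = 0; i < m->ns; i++)
		if(--m->s[i].d->ref == 0)
			free(m->s[i].d);
	msginit(m);
}
```

In a sketch like this, a transport protocol that wants to keep data for retransmission holds its own reference to the Data and builds a fresh Msg around it with new headers, so the payload bytes are only copied if msgflatten is needed for the device.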

There were new readmsg and writemsg system calls to carry message structures between kernel and user level.
The message was immutable on writemsg. Between processes in the same program, message transfers could be done by exchanging pointers into a shared region.

I'll see if I wrote up some of it. I think there were manual pages for the Messages that replaced Blocks.

My MCS lock implementation was probably more useful, and I use that in my copy of the kernel, known as 9k.

Also, NUMA effects are more important in practice on big multicores. Some of the off-chip delays are brutal.

On Sun, 14 Oct 2018 at 09:50, hiro <23hiro@gmail.com> wrote:
thanks, this will allow us to know where to look more closely.

On 10/14/18, Francisco J Ballesteros <nemo@lsub.org> wrote:
> Pure "producer/consumer" stuff, like sending things through a pipe as long as
> the source didn't need to touch the data ever more.
> Regarding bugs, I meant "producing bugs" not "fixing bugs", btw.
>
>> On 14 Oct 2018, at 09:34, hiro <23hiro@gmail.com> wrote:
>>
>> well, finding bugs is always good :)
>> but since i got curious could you also tell which things exactly got
>> much faster, so that we know what might be possible?
>>
>> On 10/14/18, FJ Ballesteros <nemo@lsub.org> wrote:
>>> yes. bugs, on my side at least.
>>> The copy isolates from others.
>>> But some experiments in nix and in a thing I wrote for leanxcale show
>>> that
>>> some things can be much faster.
>>> It’s fun either way.
>>>
>>>> On 13 Oct 2018, at 23:11, hiro <23hiro@gmail.com> wrote:
>>>>
>>>> and, did it improve anything noticeably?
>>>>
>>>>> On 10/13/18, Charles Forsyth <charles.forsyth@gmail.com> wrote:
>>>>> I did several versions of one part of zero copy, inspired by several
>>>>> things
>>>>> in x-kernel, replacing Blocks by another structure throughout the
>>>>> network
>>>>> stacks and kernel, then made messages visible to user level. Nemo did
>>>>> another part, on his way to Clive
>>>>>
>>>>>> On Fri, 12 Oct 2018, 07:05 Ori Bernstein, <ori@eigenstate.org> wrote:
>>>>>>
>>>>>> On Thu, 11 Oct 2018 13:43:00 -0700, Lyndon Nerenberg
>>>>>> <lyndon@orthanc.ca>
>>>>>> wrote:
>>>>>>
>>>>>>> Another case to ponder ...   We're handling the incoming I/Q data
>>>>>>> stream, but need to fan that out to many downstream consumers.  If
>>>>>>> we already read the data into a page, then flip it to the first
>>>>>>> consumer, is there a benefit to adding a reference counter to that
>>>>>>> read-only page and leaving the page live until the counter expires?
>>>>>>>
>>>>>>> Hiro clamours for benchmarks.  I agree.  Some basic searches I've
>>>>>>> done don't show anyone trying this out with P9 (and publishing
>>>>>>> their results).  Anybody have hints/references to prior work?
>>>>>>>
>>>>>>> --lyndon
>>>>>>>
>>>>>>
>>>>>> I don't believe anyone has done the work yet. I'd be interested
>>>>>> to see what you come up with.
>>>>>>
>>>>>>
>>>>>> --
>>>>>>   Ori Bernstein