The Unix Heritage Society mailing list
* [TUHS] Data structures in Unix editors
@ 2021-03-31 17:39 David C. Brock
  2021-03-31 18:07 ` arnold
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: David C. Brock @ 2021-03-31 17:39 UTC (permalink / raw)
  To: tuhs


All of the great discussion on this list about editors has made me curious about the data structures used in the various Unix editors.

I found a great discussion of this for sam in Rob Pike’s publication “The Text Editor sam.”

I’d like to read similar discussions of the data structures for ed, em, ex/vi. If anyone has suggestions of references, they would be very welcome!

Similarly, if there are any pointers to references on some other data structures in editors like TECO, QED and E, I’d welcome them as well.

All the best,

David
...........
David C. Brock
Director and Curator
Software History Center
Computer History Museum
http://computerhistory.org/softwarehistory
Email: dbrock@computerhistory.org
Twitter: @dcbrock
Skype: dcbrock
1401 N. Shoreline Blvd.
Mountain View, CA 94943
(650) 810-1010 main
(650) 810-1886 direct
Pronouns: he, him, his




* Re: [TUHS] Data structures in Unix editors
  2021-03-31 17:39 [TUHS] Data structures in Unix editors David C. Brock
@ 2021-03-31 18:07 ` arnold
  2021-03-31 18:23   ` Richard Salz
  2021-03-31 18:20 ` Lars Brinkhoff
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: arnold @ 2021-03-31 18:07 UTC (permalink / raw)
  To: tuhs, dbrock

The "Software Tools" books used very simple arrays of characters
to manage the text buffers for their version of 'ed'.  Those books
are worth reading in any case. :-)

Arnold

"David C. Brock" <dbrock@computerhistory.org> wrote:

> All of the great discussion on this list about editors has made me curious about the data structures used in the various Unix editors.
>
> I found a great discussion of this for sam in Rob Pike’s publication “The Text Editor sam.”
>
> I’d like to read similar discussions of the data structures for ed, em, ex/vi. If anyone has suggestions of references, they would be very welcome!
>
> Similarly, if there are any pointers to references on some other data structures in editors like TECO, QED and E, I’d welcome them as well.
>
> All the best,
>
> David


* Re: [TUHS] Data structures in Unix editors
  2021-03-31 17:39 [TUHS] Data structures in Unix editors David C. Brock
  2021-03-31 18:07 ` arnold
@ 2021-03-31 18:20 ` Lars Brinkhoff
  2021-03-31 18:49 ` Bakul Shah
  2021-04-01 12:56 ` Tony Finch
  3 siblings, 0 replies; 15+ messages in thread
From: Lars Brinkhoff @ 2021-03-31 18:20 UTC (permalink / raw)
  To: David C. Brock; +Cc: tuhs

David C. Brock wrote:
> Similarly, if there are any pointers to references on some other data
> structures in editors like TECO, QED and E, I’d welcome them as well.

http://web.mit.edu/~yandros/doc/craft-text-editing/Chapter-6.html


* Re: [TUHS] Data structures in Unix editors
  2021-03-31 18:07 ` arnold
@ 2021-03-31 18:23   ` Richard Salz
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Salz @ 2021-03-31 18:23 UTC (permalink / raw)
  To: arnold; +Cc: TUHS main list


> Similarly, if there are any pointers to references on some other data
> structures in editors like TECO, QED and E, I’d welcome them as well.

The classic is "cookbook for an emacs" but it's from 1980
https://dspace.mit.edu/handle/1721.1/15905

There's also this that's 25 years more recent but 16 years old:
https://ecc-comp.blogspot.com/2015/05/a-brief-glance-at-how-5-text-editors.html

And this: https://nullprogram.com/blog/2017/09/07/


* Re: [TUHS] Data structures in Unix editors
  2021-03-31 17:39 [TUHS] Data structures in Unix editors David C. Brock
  2021-03-31 18:07 ` arnold
  2021-03-31 18:20 ` Lars Brinkhoff
@ 2021-03-31 18:49 ` Bakul Shah
  2021-04-01 12:56 ` Tony Finch
  3 siblings, 0 replies; 15+ messages in thread
From: Bakul Shah @ 2021-03-31 18:49 UTC (permalink / raw)
  To: David C. Brock; +Cc: tuhs


On Mar 31, 2021, at 10:46 AM, David C. Brock <dbrock@computerhistory.org> wrote:
> 
> I’d like to read similar discussions of the data structures for ed, em, ex/vi. If anyone has suggestions of references, they would be very welcome!
> 
> Similarly, if there are any pointers to references on some other data structures in editors like TECO, QED and E, I’d welcome them as well.

Charles Crowley’s “Data Structures of Text Sequences”:
https://www.cs.unm.edu/~crowley/papers/sds.pdf




* Re: [TUHS] Data structures in Unix editors
  2021-03-31 17:39 [TUHS] Data structures in Unix editors David C. Brock
                   ` (2 preceding siblings ...)
  2021-03-31 18:49 ` Bakul Shah
@ 2021-04-01 12:56 ` Tony Finch
  2021-04-01 14:24   ` Richard Salz
                     ` (2 more replies)
  3 siblings, 3 replies; 15+ messages in thread
From: Tony Finch @ 2021-04-01 12:56 UTC (permalink / raw)
  To: David C. Brock; +Cc: tuhs


David C. Brock <dbrock@computerhistory.org> wrote:
>
> I’d like to read similar discussions of the data structures for ed, em,
> ex/vi. If anyone has suggestions of references, they would be very
> welcome!

A curious one is nvi, which uses the Berkeley DB RECNO interface to access
a text file as an array of lines (RECNO = record number).

Tony.
-- 
f.anthony.n.finch  <dot@dotat.at>  https://dotat.at/
Sole, Lundy, Fastnet, Irish Sea, Shannon: East or northeast 4 to 6,
occasionally 7 in Lundy. Rough at first in northwest Shannon,
otherwise slight or moderate, then occasionally rough later in east
Sole. Fog patches at first in Sole. Moderate or good, occasionally
very poor at first in Sole.


* Re: [TUHS] Data structures in Unix editors
  2021-04-01 12:56 ` Tony Finch
@ 2021-04-01 14:24   ` Richard Salz
  2021-04-01 21:25   ` Dave Horsfall
  2021-04-05 23:23   ` Mary Ann Horton
  2 siblings, 0 replies; 15+ messages in thread
From: Richard Salz @ 2021-04-01 14:24 UTC (permalink / raw)
  To: Tony Finch; +Cc: tuhs


>
> A curious one is nvi, which uses the Berkeley DB RECNO interface to access
> a text file as an array of lines (RECNO = record number).
>
>
Not so curious: Keith Bostic was the principal author of both nvi and
Berkeley DB.


* Re: [TUHS] Data structures in Unix editors
  2021-04-01 12:56 ` Tony Finch
  2021-04-01 14:24   ` Richard Salz
@ 2021-04-01 21:25   ` Dave Horsfall
  2021-04-01 21:32     ` John Cowan
  2021-04-05 23:23   ` Mary Ann Horton
  2 siblings, 1 reply; 15+ messages in thread
From: Dave Horsfall @ 2021-04-01 21:25 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society

On Thu, 1 Apr 2021, Tony Finch wrote:

> A curious one is nvi, which uses the Berkeley DB RECNO interface to 
> access a text file as an array of lines (RECNO = record number).

On FreeBSD at least, "nvi" is used instead of "vi" (and is linked).

And oddly enough, BDB is exactly how I would have implemented it...  The 
basic datum is a line after all (I have no idea about EMACS, and don't 
want to know) so it makes sense to use a structure that can rapidly access 
arbitrary line numbers.

-- Dave


* Re: [TUHS] Data structures in Unix editors
  2021-04-01 21:25   ` Dave Horsfall
@ 2021-04-01 21:32     ` John Cowan
  2021-04-02 22:40       ` Dave Horsfall
  0 siblings, 1 reply; 15+ messages in thread
From: John Cowan @ 2021-04-01 21:32 UTC (permalink / raw)
  To: Dave Horsfall; +Cc: The Eunuchs Hysterical Society


On Thu, Apr 1, 2021 at 5:26 PM Dave Horsfall <dave@horsfall.org> wrote:


> And oddly enough, BDB is exactly how I would have implemented it...  The
> basic datum is a line after all (I have no idea about EMACS, and don't
> want to know) so it makes sense to use a structure that can rapidly access
> arbitrary line numbers.
>

I'd use SQLite nowadays, because it takes extraordinary care to make sure
that no data is lost short of disk failure.  It is considerably more robust
than the underlying filesystem, and runs embedded in its process.  It also
means you can readily carry about arbitrary data in additional columns; for
example, you could make line marks persistent, including dot.
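A minimal sketch of that idea, assuming Python's standard sqlite3 module.
The table and column names (lines, pos, text, mark) are hypothetical, made
up for illustration, and not taken from any actual editor:

```python
import sqlite3

# An in-memory database keeps the sketch self-contained; an editor would
# presumably open a file next to the document instead (assumption).
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE lines (
        pos  INTEGER PRIMARY KEY,  -- line number
        text TEXT NOT NULL,        -- the line itself
        mark TEXT                  -- optional mark name, e.g. 'a', or '.' for dot
    )""")

with db:  # one transaction: committed as a whole or not at all
    db.executemany("INSERT INTO lines (pos, text) VALUES (?, ?)",
                   enumerate(["first", "second", "third"], start=1))
    db.execute("UPDATE lines SET mark = '.' WHERE pos = 2")  # dot on line 2
    db.execute("UPDATE lines SET mark = 'a' WHERE pos = 3")  # mark a on line 3

def line_at_mark(name):
    """Return (pos, text) for the line carrying the given mark, or None."""
    return db.execute("SELECT pos, text FROM lines WHERE mark = ?",
                      (name,)).fetchone()
```

Because the marks live in the same table as the text, they would survive
across sessions for free once the database is a file rather than :memory:.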



John Cowan          http://vrici.lojban.org/~cowan        cowan@ccil.org
Well, I have news for our current leaders and the leaders of tomorrow:
the Bill of Rights is not a frivolous luxury, in force only during
times of peace and prosperity.  We don't just push it to the side
when the going gets tough.  --Molly Ivins (pbuh)


* Re: [TUHS] Data structures in Unix editors
  2021-04-01 21:32     ` John Cowan
@ 2021-04-02 22:40       ` Dave Horsfall
  2021-04-02 23:20         ` Adam Thornton
  0 siblings, 1 reply; 15+ messages in thread
From: Dave Horsfall @ 2021-04-02 22:40 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society


On Thu, 1 Apr 2021, John Cowan wrote:

[ Me thinking of using BDB for editor data structures ]

> I'd use SQLite nowadays, because it takes extraordinary care to make 
> sure that no data is lost short of disk failure.  It is considerably 
> more robust than the underlying filesystem, and runs embedded in its 
> process.  It also means you can readily carry about arbitrary data in 
> additional columns; for example, you could make line marks persistent, 
> including dot.

Good point; thanks.  I'd forgotten about SQLite...  I doubt if I'll be 
writing a new editor any time soon though (VI works just fine) but was 
planning on incorporating it in a project I'm working on.

-- Dave


* Re: [TUHS] Data structures in Unix editors
  2021-04-02 22:40       ` Dave Horsfall
@ 2021-04-02 23:20         ` Adam Thornton
  2021-04-03  0:34           ` Jon Forrest
  0 siblings, 1 reply; 15+ messages in thread
From: Adam Thornton @ 2021-04-02 23:20 UTC (permalink / raw)
  To: The Eunuchs Hysterical Society


's' from Webb Miller's _A Software Tools Sampler_ is an exhaustively
documented sort-of-stripped-down vi.  Admittedly it's a little tricky to
get your hands on the source document (I got it from interlibrary loan back
at the end of the Before Times, and archive.org lets you check it out for a
limited time, but there may be a waitlist).

It's interesting in that it's not just studying the source to see how it
works, but an actual (large) chapter of a book where he steps through the
construction of the individual functions and their keybindings.

Adam

On Fri, Apr 2, 2021 at 3:42 PM Dave Horsfall <dave@horsfall.org> wrote:

> On Thu, 1 Apr 2021, John Cowan wrote:
>
> [ Me thinking of using BDB for editor data structures ]
>
> > I'd use SQLite nowadays, because it takes extraordinary care to make
> > sure that no data is lost short of disk failure.  It is considerably
> > more robust than the underlying filesystem, and runs embedded in its
> > process.  It also means you can readily carry about arbitrary data in
> > additional columns; for example, you could make line marks persistent,
> > including dot.
>
> Good point; thanks.  I'd forgotten about SQLite...  I doubt if I'll be
> writing a new editor any time soon though (VI works just fine) but was
> planning on incorporating it in a project I'm working on.
>
> -- Dave


* Re: [TUHS] Data structures in Unix editors
  2021-04-02 23:20         ` Adam Thornton
@ 2021-04-03  0:34           ` Jon Forrest
  0 siblings, 0 replies; 15+ messages in thread
From: Jon Forrest @ 2021-04-03  0:34 UTC (permalink / raw)
  To: tuhs



On 4/2/2021 4:20 PM, Adam Thornton wrote:
> 's' from Webb Miller's _A Software Tools Sampler_ is an exhaustively 
> documented sort-of-stripped-down vi.  Admittedly it's a little tricky to 
> get your hands on the source document (I got it from interlibrary loan 
> back at the end of the Before Times, and archive.org 
> <http://archive.org> lets you check it out for a limited time, but there 
> may be a waitlist).

I have a copy of this book for sale. It's in excellent
condition. It wasn't sold with the software that's printed in
the book so you'll have to somehow get the software yourself.

I'd like $45 plus shipping. (I'll only ship to the
US). If you're interested please contact me directly offlist.

Cordially,
Jon Forrest



* Re: [TUHS] Data structures in Unix editors
  2021-04-01 12:56 ` Tony Finch
  2021-04-01 14:24   ` Richard Salz
  2021-04-01 21:25   ` Dave Horsfall
@ 2021-04-05 23:23   ` Mary Ann Horton
  2 siblings, 0 replies; 15+ messages in thread
From: Mary Ann Horton @ 2021-04-05 23:23 UTC (permalink / raw)
  To: tuhs

This is actually what ed, ex, and vi do as well, albeit not with a 
standard interface.

The buffer is an array of integer offsets into a file. Line n in the 
buffer is pointed at by buffer[n], whose contents are an offset into the 
temp file. That offset points to a null terminated line of buffer text.

When a new line of text is created (or changed) the new line is appended 
to the temp file and the corresponding offset goes into the buffer array.
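A rough sketch of the scheme just described, in Python rather than the C of
the real editors. The class and method names (LineBuffer, insert, change,
delete, line) are made up for illustration, and lines are numbered from 0
here for simplicity:

```python
import tempfile

class LineBuffer:
    """Toy model: the buffer is an array of offsets; the text itself
    lives in an append-only temp file as NUL-terminated lines."""

    def __init__(self):
        self.tmp = tempfile.TemporaryFile()  # binary, read/write
        self.offsets = []                    # offsets[n] -> start of line n

    def _store(self, text):
        # Append the text to the end of the temp file; return its offset.
        self.tmp.seek(0, 2)
        off = self.tmp.tell()
        self.tmp.write(text.encode() + b"\0")
        return off

    def insert(self, n, text):
        self.offsets.insert(n, self._store(text))

    def change(self, n, text):
        # The old text stays in the temp file; only the offset is replaced.
        self.offsets[n] = self._store(text)

    def delete(self, n):
        # Only the reference goes away; the temp file is untouched.
        del self.offsets[n]

    def line(self, n):
        # Follow the offset and read up to the terminating NUL.
        self.tmp.seek(self.offsets[n])
        out = bytearray()
        while True:
            b = self.tmp.read(1)
            if b in (b"", b"\0"):
                break
            out += b
        return out.decode()
```

Note that nothing is ever rewritten in place: inserting, changing, or
deleting a line touches only the offsets array.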

One big reason for this was that ed was originally written for a 16 bit 
machine, and buffers didn't fit in memory.

This works very well for inserting or deleting lines: you only have to copy 
the line references in the buffer array. It's also convenient for undo, 
since putting things back just means restoring the original array. My 
enhanced ed (hed at Wisconsin, Portable ed at Bell Labs) had a second 
array for undo, and a full copy of the array was made before a change. 
Bill Joy wrote a fancier implementation in vi, where only lines added or 
deleted were saved, command-specific. Undo knew which command it had to 
undo and special-cased each one.
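The simpler of the two undo schemes (a full copy of the array before each
change) might look something like this. It is a sketch only: the names are
hypothetical, a bytearray stands in for the temp file, and it makes no
attempt to model vi's per-command approach:

```python
class UndoableBuffer:
    """Toy model of undo by keeping a second copy of the offsets array."""

    def __init__(self):
        self.store = bytearray()  # stand-in for the append-only temp file
        self.offsets = []         # current buffer: one offset per line
        self.saved = []           # second array: state before the last change

    def _append(self, text):
        off = len(self.store)
        self.store += text.encode() + b"\0"
        return off

    def _checkpoint(self):
        # Copy the whole array before any change, as described above.
        self.saved = list(self.offsets)

    def insert(self, n, text):
        self._checkpoint()
        self.offsets.insert(n, self._append(text))

    def delete(self, n):
        self._checkpoint()
        del self.offsets[n]

    def undo(self):
        # Putting things back is just restoring the original array; the
        # text is still in the store because nothing is ever overwritten.
        self.offsets = list(self.saved)

    def lines(self):
        out = []
        for off in self.offsets:
            end = self.store.index(b"\0", off)
            out.append(self.store[off:end].decode())
        return out
```

Undoing a delete works here only because the deleted line's text is still
sitting in the store; the array copy is all that has to be saved.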

The big disadvantage to this structure is that ends of lines are not 
just newline characters. You can't backspace over a newline, or change 
newlines to something else.

     Mary Ann

On 4/1/21 5:56 AM, Tony Finch wrote:
> David C. Brock <dbrock@computerhistory.org> wrote:
>> I’d like to read similar discussions of the data structures for ed, em,
>> ex/vi. If anyone has suggestions of references, they would be very
>> welcome!
> A curious one is nvi, which uses the Berkeley DB RECNO interface to access
> a text file as an array of lines (RECNO = record number).
>
> Tony.


* Re: [TUHS] Data structures in Unix editors
  2021-04-01 20:12 Noel Chiappa
@ 2021-04-01 20:57 ` John Cowan
  0 siblings, 0 replies; 15+ messages in thread
From: John Cowan @ 2021-04-01 20:57 UTC (permalink / raw)
  To: Noel Chiappa; +Cc: TUHS main list


On Thu, Apr 1, 2021 at 4:31 PM Noel Chiappa <jnc@mercury.lcs.mit.edu> wrote:


> The first is a TECO from the fourth floor V6 machine (DSSR/RTS) at Tech Sq
> at
> MIT:
>

I happen to remember how PDP-8 Teco stored its content.  In Teco, for those
who have never made its acquaintance, there is the current buffer (which
does not typically contain all of the current file) and a bunch of named
Q-registers that store blobs of text.  Since the PDP-8 is a 12-bit machine,
the normal way to store ASCII uses three characters in two words: one in
the low 8 bits of the first word, one in the low 8 bits of the second, and
one in the top four bits of both words, split big-endian.
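To make the packing concrete, here is a small sketch in Python. It assumes
that "split big-endian" means the high nibble of the third character goes in
the first word and the low nibble in the second; the function names are
invented for the example:

```python
def pack3(c1, c2, c3):
    """Pack three 8-bit character codes into two 12-bit words."""
    w1 = ((c3 >> 4) & 0xF) << 8 | (c1 & 0xFF)  # high nibble of c3 on top of c1
    w2 = (c3 & 0xF) << 8 | (c2 & 0xFF)         # low nibble of c3 on top of c2
    return w1, w2

def unpack3(w1, w2):
    """Recover the three characters from two 12-bit words."""
    c1 = w1 & 0xFF
    c2 = w2 & 0xFF
    c3 = ((w1 >> 8) & 0xF) << 4 | ((w2 >> 8) & 0xF)
    return c1, c2, c3
```

For example, pack3(ord("T"), ord("E"), ord("C")) gives the two words 0x454
and 0x345, and unpack3 turns them back into the same three codes.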

In Teco, however, the current buffer is stored in the bottom 8 bits of a
single 4KW memory field.  The Q-registers are stored in alphabetical order
by name (a single character) using the straddling top-4-bits method
described above.  Because the processor cycle time was the same as memory
access time anyway, this storage method was perfectly adequate.

You can also look at the source for Teco-C, which is available in several
places on the net, the last I looked.



John Cowan          http://vrici.lojban.org/~cowan        cowan@ccil.org
Heckler: "Go on, Al, tell 'em all you know.  It won't take long."
Al Smith: "I'll tell 'em all we *both* know.  It won't take any longer."


* Re: [TUHS] Data structures in Unix editors
@ 2021-04-01 20:12 Noel Chiappa
  2021-04-01 20:57 ` John Cowan
  0 siblings, 1 reply; 15+ messages in thread
From: Noel Chiappa @ 2021-04-01 20:12 UTC (permalink / raw)
  To: arnold, dbrock, tuhs; +Cc: jnc

    > From: David C. Brock

    > I'd like to read similar discussions of the data structures for ed, em,
    > ex/vi. ... Similarly, if there are any pointers to references on some
    > other data structures in editors like TECO, QED and E, I'd welcome them
    > as well.

I don't have any discussions I can point you at, but I do have source - for
two things which are somewhat older than most of the ones you mention
(ex/vi/etc).

The first is a TECO from the fourth floor V6 machine (DSSR/RTS) at Tech Sq at
MIT:

  http://ana-3.lcs.mit.edu/~jnc/tech/unix/teco

There's some rudimentary documentation in there, in teco.doc, but don't expect
too much. You'll have to rely on the source, which is in MACRO-11 - but it
seems to be reasonably well commented. This actually predates V6; it was
originally written for an MIT OS called DELPHI, which ran on an -11/45 which
was the main EECS undergrad machine. At some point (probably post the Unix
port), it was modified to have '^R mode', which was a WYSIWYG display mode a
lot like the one in the ITS TECO in which EMACS was first written.

I have also put up the Montgomery Emacs for Unix:

  http://ana-3.lcs.mit.edu/~jnc/tech/unix/emacs

This is the version we were running on the 5th floor MIT V6 machine (CSR),
which by that point had absorbed a few V7isms (e.g. some ioctl() stuff). So
don't expect to be able to compile and run it, without a fair amount of
work. (I vaguely recall that it needs I+D space, so maybe not on a /23 at
all.) But at least the source is in C, so you can read it. I don't think
there's an un-modified version online (i.e. the original Montgomery source),
alas.

	Noel
