All of the great discussion on this list about editors has made me curious about the data structures used in the various Unix editors.

I found a great discussion of this for sam in Rob Pike’s publication “The Text Editor sam.”

I’d like to read similar discussions of the data structures for ed, em, ex/vi. If anyone has suggestions of references, they would be very welcome!

Similarly, if there are any pointers to references on some other data structures in editors like TECO, QED and E, I’d welcome them as well.

All the best,

David
...........
David C. Brock
Director and Curator
Software History Center
Computer History Museum
computerhistory.org/softwarehistory<http://computerhistory.org/softwarehistory>
Email: dbrock@computerhistory.org
Twitter: @dcbrock
Skype: dcbrock
1401 N. Shoreline Blvd.
Mountain View, CA 94943
(650) 810-1010 main
(650) 810-1886 direct
Pronouns: he, him, his
The "Software Tools" books used very simple arrays of characters
to manage the text buffers for their version of 'ed'. Those books
are worth reading in any case. :-)
Arnold
"David C. Brock" <dbrock@computerhistory.org> wrote:
> All of the great discussion on this list about editors has made me curious about the data structures used in the various Unix editors.
>
> I found a great discussion of this for sam in Rob Pike’s publication “The Text Editor sam.”
>
> I’d like to read similar discussions of the data structures for ed, em, ex/vi. If anyone has suggestions of references, they would be very welcome!
>
> Similarly, if there are any pointers to references on some other data structures in editors like TECO, QED and E, I’d welcome them as well.
>
> All the best,
>
> David
> ...........
> David C. Brock
> Director and Curator
> Software History Center
> Computer History Museum
> computerhistory.org/softwarehistory<http://computerhistory.org/softwarehistory>
> Email: dbrock@computerhistory.org
> Twitter: @dcbrock
> Skype: dcbrock
> 1401 N. Shoreline Blvd.
> Mountain View, CA 94943
> (650) 810-1010 main
> (650) 810-1886 direct
> Pronouns: he, him, his
>
>
David C. Brock wrote:
> Similarly, if there are any pointers to references on some other data
> structures in editors like TECO, QED and E, I’d welcome them as well.

http://web.mit.edu/~yandros/doc/craft-text-editing/Chapter-6.html
> Similarly, if there are any pointers to references on some other data structures in editors like TECO, QED and E, I’d welcome them as well.

The classic is "cookbook for an emacs" but it's from 1980:
https://dspace.mit.edu/handle/1721.1/15905

There's also this, which is 35 years more recent but already 6 years old:
https://ecc-comp.blogspot.com/2015/05/a-brief-glance-at-how-5-text-editors.html

And this:
https://nullprogram.com/blog/2017/09/07/
On Mar 31, 2021, at 10:46 AM, David C. Brock <dbrock@computerhistory.org> wrote:
>
> I’d like to read similar discussions of the data structures for ed, em, ex/vi. If anyone has suggestions of references, they would be very welcome!
>
> Similarly, if there are any pointers to references on some other data structures in editors like TECO, QED and E, I’d welcome them as well.

Charles Crowley’s “Data Structures for Text Sequences”:
https://www.cs.unm.edu/~crowley/papers/sds.pdf
David C. Brock <dbrock@computerhistory.org> wrote:
>
> I’d like to read similar discussions of the data structures for ed, em,
> ex/vi. If anyone has suggestions of references, they would be very
> welcome!

A curious one is nvi, which uses the Berkeley DB RECNO interface to access
a text file as an array of lines (RECNO = record number).

Tony.
--
f.anthony.n.finch <dot@dotat.at> https://dotat.at/
Sole, Lundy, Fastnet, Irish Sea, Shannon: East or northeast 4 to 6,
occasionally 7 in Lundy. Rough at first in northwest Shannon, otherwise
slight or moderate, then occasionally rough later in east Sole. Fog patches
at first in Sole. Moderate or good, occasionally very poor at first in Sole.
> A curious one is nvi, which uses the Berkeley DB RECNO interface to access
> a text file as an array of lines (RECNO = record number).

Not so curious: Keith Bostic was the principal author of both nvi and
Berkeley DB.
> From: David C. Brock

> I'd like to read similar discussions of the data structures for ed, em,
> ex/vi. ... Similarly, if there are any pointers to references on some
> other data structures in editors like TECO, QED and E, I'd welcome them
> as well.

I don't have any discussions I can point you at, but I do have source - for
two things which are somewhat older than most of the ones you mention
(ex/vi/etc).

The first is a TECO from the fourth floor V6 machine (DSSR/RTS) at Tech Sq at
MIT:

  http://ana-3.lcs.mit.edu/~jnc/tech/unix/teco

There's some rudimentary documentation in there, in teco.doc, but don't expect
too much. You'll have to rely on the source, which is in MACRO-11 - but it
seems to be reasonably well commented.

This actually predates V6; it was originally written for an MIT OS called
DELPHI, which ran on an -11/45 which was the main EECS undergrad machine. At
some point (probably post the Unix port), it was modified to have '^R mode',
which was a WYSIWYG display mode a lot like the one in the ITS TECO in which
EMACS was first written.

I have also put up the Montgomery Emacs for Unix:

  http://ana-3.lcs.mit.edu/~jnc/tech/unix/emacs

This is the version we were running on the 5th floor MIT V6 machine (CSR),
which by that point had absorbed a few V7isms (e.g. some ioctl() stuff). So
don't expect to be able to compile and run it without a fair amount of work.
(I vaguely recall that it needs I+D space, so maybe not on a /23 at all.) But
at least the source is in C, so you can read it.

I don't think there's an un-modified version online (i.e. the original
Montgomery source), alas.

	Noel
On Thu, Apr 1, 2021 at 4:31 PM Noel Chiappa <jnc@mercury.lcs.mit.edu> wrote:

> The first is a TECO from the fourth floor V6 machine (DSSR/RTS) at Tech Sq
> at MIT:

I happen to remember how PDP-8 Teco stored its content. In Teco, for those
who have never made its acquaintance, there is the current buffer (which does
not typically contain all of the current file) and a bunch of named
Q-registers that store blobs of text.

Since the PDP-8 is a 12-bit machine, the normal way to store ASCII uses three
characters in two words: one in the low 8 bits of the first word, one in the
low 8 bits of the second, and one in the top four bits of both words, split
big-endian. In Teco, however, the current buffer is stored in the bottom 8
bits of a single 4KW memory field. The Q-registers are stored in alphabetical
order by name (a single character) using the straddling top-4-bits method
described above. Because the processor cycle time was the same as memory
access time anyway, this storage method was perfectly adequate.

You can also look at the source for Teco-C, which is available in several
places on the net, the last I looked.

John Cowan http://vrici.lojban.org/~cowan cowan@ccil.org
Heckler: "Go on, Al, tell 'em all you know. It won't take long."
Al Smith: "I'll tell 'em all we *both* know. It won't take any longer."
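The three-characters-in-two-words packing described above can be sketched in a few lines of Python. This is only an illustration of the bit layout as stated in the message (third character's high nibble in the first word, low nibble in the second); the function names pack3 and unpack3 are made up for this sketch and do not come from any PDP-8 source.

```python
def pack3(c1, c2, c3):
    """Pack three 8-bit characters into two 12-bit words.

    c1 and c2 occupy the low 8 bits of word 0 and word 1.
    c3 straddles the top 4 bits of both words, high nibble first.
    """
    w0 = (((c3 >> 4) & 0xF) << 8) | (c1 & 0xFF)
    w1 = ((c3 & 0xF) << 8) | (c2 & 0xFF)
    return w0, w1


def unpack3(w0, w1):
    """Recover the three characters from two 12-bit words."""
    c1 = w0 & 0xFF
    c2 = w1 & 0xFF
    c3 = (((w0 >> 8) & 0xF) << 4) | ((w1 >> 8) & 0xF)
    return c1, c2, c3


if __name__ == "__main__":
    words = pack3(ord("A"), ord("B"), ord("C"))
    print([oct(w) for w in words])
    print(bytes(unpack3(*words)))
```

The one-character-per-word layout used for the current buffer skips all of this shifting, at the cost of leaving the top four bits of every word unused.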
On Thu, 1 Apr 2021, Tony Finch wrote:
> A curious one is nvi, which uses the Berkeley DB RECNO interface to
> access a text file as an array of lines (RECNO = record number).
On FreeBSD at least, "nvi" is used instead of "vi" (and is linked).
And oddly enough, BDB is exactly how I would have implemented it... The
basic datum is a line after all (I have no idea about EMACS, and don't
want to know) so it makes sense to use a structure that can rapidly access
arbitrary line numbers.
-- Dave
On Thu, Apr 1, 2021 at 5:26 PM Dave Horsfall <dave@horsfall.org> wrote:

> And oddly enough, BDB is exactly how I would have implemented it... The
> basic datum is a line after all (I have no idea about EMACS, and don't
> want to know) so it makes sense to use a structure that can rapidly access
> arbitrary line numbers.

I'd use SQLite nowadays, because it takes extraordinary care to make sure
that no data is lost short of disk failure. It is considerably more robust
than the underlying filesystem, and runs embedded in its process. It also
means you can readily carry about arbitrary data in additional columns; for
example, you could make line marks persistent, including dot.

John Cowan http://vrici.lojban.org/~cowan cowan@ccil.org
Well, I have news for our current leaders and the leaders of tomorrow: the
Bill of Rights is not a frivolous luxury, in force only during times of
peace and prosperity. We don't just push it to the side when the going gets
tough. --Molly Ivins (pbuh)
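As a rough illustration of the SQLite idea, here is a minimal sketch using Python's standard sqlite3 module. The table layout (a lines table with lineno, text and mark columns) and the helper names are assumptions invented for this example, not anything taken from an existing editor. It stores one line per row and keeps marks, including dot as ".", in an extra column so that they persist with the text.

```python
import sqlite3


def open_buffer(path=":memory:"):
    """Open (or create) a line store; the schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS lines ("
        " lineno INTEGER PRIMARY KEY,"
        " text   TEXT NOT NULL,"
        " mark   TEXT)"  # at most one mark name per line in this sketch
    )
    return db


def load(db, text_lines):
    """Replace the buffer contents, numbering lines from 1."""
    with db:  # one transaction: all lines are stored or none are
        db.execute("DELETE FROM lines")
        db.executemany(
            "INSERT INTO lines (lineno, text) VALUES (?, ?)",
            enumerate(text_lines, start=1),
        )


def set_mark(db, name, lineno):
    """Move mark `name` (for example '.' for dot) to the given line."""
    with db:
        db.execute("UPDATE lines SET mark = NULL WHERE mark = ?", (name,))
        db.execute("UPDATE lines SET mark = ? WHERE lineno = ?", (name, lineno))


def line_at_mark(db, name):
    """Return (lineno, text) for the marked line, or None if unset."""
    return db.execute(
        "SELECT lineno, text FROM lines WHERE mark = ?", (name,)
    ).fetchone()


if __name__ == "__main__":
    db = open_buffer()
    load(db, ["first", "second", "third"])
    set_mark(db, ".", 2)
    print(line_at_mark(db, "."))
```

Using the line number as the primary key makes lookup by line number fast, but inserting in the middle of the buffer then means renumbering every following row; a real design would need some other ordering key.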
On Thu, 1 Apr 2021, John Cowan wrote:

[ Me thinking of using BDB for editor data structures ]

> I'd use SQLite nowadays, because it takes extraordinary care to make
> sure that no data is lost short of disk failure. It is considerably
> more robust than the underlying filesystem, and runs embedded in its
> process. It also means you can readily carry about arbitrary data in
> additional columns; for example, you could make line marks persistent,
> including dot.

Good point; thanks. I'd forgotten about SQLite... I doubt if I'll be
writing a new editor any time soon though (VI works just fine) but was
planning on incorporating it in a project I'm working on.

-- Dave
's' from Webb Miller's _A Software Tools Sampler_ is an exhaustively
documented sort-of-stripped-down vi. Admittedly it's a little tricky to
get your hands on the source document (I got it from interlibrary loan
back at the end of the Before Times, and archive.org lets you check it out
for a limited time, but there may be a waitlist).

It's interesting in that it's not just studying the source to see how it
works, but an actual (large) chapter of a book where he steps through the
construction of the individual functions and their keybindings.

Adam

On Fri, Apr 2, 2021 at 3:42 PM Dave Horsfall <dave@horsfall.org> wrote:

> On Thu, 1 Apr 2021, John Cowan wrote:
>
> [ Me thinking of using BDB for editor data structures ]
>
> > I'd use SQLite nowadays, because it takes extraordinary care to make
> > sure that no data is lost short of disk failure. It is considerably
> > more robust than the underlying filesystem, and runs embedded in its
> > process. It also means you can readily carry about arbitrary data in
> > additional columns; for example, you could make line marks persistent,
> > including dot.
>
> Good point; thanks. I'd forgotten about SQLite... I doubt if I'll be
> writing a new editor any time soon though (VI works just fine) but was
> planning on incorporating it in a project I'm working on.
>
> -- Dave
On 4/2/2021 4:20 PM, Adam Thornton wrote:
> 's' from Webb Miller's _A Software Tools Sampler_ is an exhaustively
> documented sort-of-stripped-down vi. Admittedly it's a little tricky to
> get your hands on the source document (I got it from interlibrary loan
> back at the end of the Before Times, and archive.org
> <http://archive.org> lets you check it out for a limited time, but there
> may be a waitlist).
I have a copy of this book for sale. It's in excellent
condition. It wasn't sold with the software that's printed in
the book so you'll have to somehow get the software yourself.
I'd like $45 plus shipping. (I'll only ship to the
US). If you're interested please contact me directly offlist.
Cordially,
Jon Forrest
This is actually what ed, ex, and vi do as well, albeit not with a
standard interface.
The buffer is an array of integer offsets into a file. Line n in the
buffer is pointed at by buffer[n], whose contents are an offset into the
temp file. That offset points to a null terminated line of buffer text.
When a new line of text is created (or changed) the new line is appended
to the temp file and the corresponding offset goes into the buffer array.
One big reason for this was that ed was originally written for a 16 bit
machine, and buffers didn't fit in memory.
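A minimal sketch of that layout in Python, for illustration only: an append-only temp file holding NUL-terminated lines, plus an in-memory array of offsets into it. The class name LineBuffer and its methods are invented for this example, and line numbers here are 0-based, unlike ed's.

```python
import os
import tempfile


class LineBuffer:
    """Array of offsets into an append-only scratch file."""

    def __init__(self):
        self.tmp = tempfile.TemporaryFile()  # binary scratch file
        self.offsets = []  # offsets[n] is where line n starts in tmp

    def _store(self, text):
        """Append one NUL-terminated line and return its offset."""
        self.tmp.seek(0, os.SEEK_END)
        offset = self.tmp.tell()
        self.tmp.write(text.encode() + b"\0")
        return offset

    def insert(self, n, text):
        """Insert a new line before position n."""
        self.offsets.insert(n, self._store(text))

    def change(self, n, text):
        """Replace line n; the old text stays in the file, unreferenced."""
        self.offsets[n] = self._store(text)

    def delete(self, n):
        """Delete line n by dropping its reference only."""
        del self.offsets[n]

    def get(self, n):
        """Read line n back from the scratch file."""
        self.tmp.seek(self.offsets[n])
        out = bytearray()
        while (b := self.tmp.read(1)) not in (b"\0", b""):
            out += b
        return out.decode()


if __name__ == "__main__":
    buf = LineBuffer()
    buf.insert(0, "one")
    buf.insert(1, "three")
    buf.insert(1, "two")
    print([buf.get(i) for i in range(len(buf.offsets))])
```

Only the offset array has to live in memory, so the size of the text is limited by the scratch file rather than by the address space.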
This works very well for inserting or deleting lines: you only have to copy
the line references in the buffer array. It's also convenient for undo,
since putting things back just means restoring the original array. My
enhanced ed (hed at Wisconsin, Portable ed at Bell Labs) had a second
array for undo, and a full copy of the array was made before a change.
Bill Joy wrote a fancier implementation in vi, where only lines added or
deleted were saved, command-specific. Undo knew which command it had to
undo and special-cased each one.
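The full-copy undo scheme can be sketched as follows. This is a hypothetical illustration, not code from hed or Portable ed: a Python list stands in for the temp file, and the class and method names are made up. Every modifying command first saves a copy of the line array, and undo swaps the saved array back in.

```python
class UndoableBuffer:
    """Line-reference array with a second array kept for undo."""

    def __init__(self):
        self.store = []    # append-only line store (stand-in for the temp file)
        self.lines = []    # current buffer: indices into store
        self.saved = None  # copy of `lines` taken before the last change

    def _snapshot(self):
        # Full copy of the reference array; the line text itself is not copied.
        self.saved = list(self.lines)

    def append(self, text):
        self._snapshot()
        self.store.append(text)
        self.lines.append(len(self.store) - 1)

    def delete(self, n):
        self._snapshot()
        del self.lines[n]

    def undo(self):
        # Swapping means a second undo redoes the change; that is a choice
        # made for this sketch.
        if self.saved is not None:
            self.lines, self.saved = self.saved, self.lines

    def text(self):
        return [self.store[i] for i in self.lines]


if __name__ == "__main__":
    b = UndoableBuffer()
    b.append("alpha")
    b.append("beta")
    b.delete(0)
    b.undo()
    print(b.text())
```

Because old lines are never overwritten in the store, restoring the array is enough to bring deleted or changed text back.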
The big disadvantage to this structure is that ends of lines are not
just newline characters. You can't backspace over a newline, or change
newlines to something else.
Mary Ann
On 4/1/21 5:56 AM, Tony Finch wrote:
> David C. Brock <dbrock@computerhistory.org> wrote:
>> I’d like to read similar discussions of the data structures for ed, em,
>> ex/vi. If anyone has suggestions of references, they would be very
>> welcome!
> A curious one is nvi, which uses the Berkeley DB RECNO interface to access
> a text file as an array of lines (RECNO = record number).
>
> Tony.