The Unix Heritage Society mailing list
* [TUHS] cut, paste, join, etc.
@ 2021-02-16 20:33 Will Senn
  2021-02-16 21:02 ` Dave Horsfall
  2021-02-16 21:06 ` Dennis Boone
  0 siblings, 2 replies; 18+ messages in thread
From: Will Senn @ 2021-02-16 20:33 UTC
  To: TUHS main list

All,

I'm tooling along during our newfangled rolling blackouts and frigid 
temperatures (in Texas!) and reading some good old unix books. I keep 
coming across the commands cut and paste and join and suchlike. I use 
cut all the time for stuff like:

ls -l | tr -s ' ' | cut -d ' ' -f1,4,9
...
-rw-r--r-- staff main.rs

and

who | grep wsenn | cut -c 1-8,10-17
wsenn   console
wsenn   ttys000

but that's just cuz it's convenient and useful.

To my knowledge, I've never used paste or join outside of initially 
coming across them. But, they seem to 'fit' with cut. My question for 
y'all is, was there a subset of related utilities that these were part 
of that served some common purpose? On a related note, join seems like 
part of an aborted (aka never fully realized) attempt at a text based 
rdb to me...
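
For anyone who has not tried them, here is a minimal sketch of how cut, paste and join fit together. The file names and contents are invented for illustration; join expects both inputs sorted on the join field (the first field by default).

```shell
# Hypothetical data: two small files sorted on a shared key in column 1.
tmp=$(mktemp -d)
printf '%s\n' '1 alice' '2 bob' '3 carol' > "$tmp/names"
printf '%s\n' '1 mh' '2 ih' '3 mh' > "$tmp/locs"

# join matches lines whose first fields are equal, much like a relational join.
joined=$(join "$tmp/names" "$tmp/locs")
printf '%s\n' "$joined"

# cut pulls single columns out of each file ...
cut -d ' ' -f 2 "$tmp/names" > "$tmp/col_a"
cut -d ' ' -f 2 "$tmp/locs" > "$tmp/col_b"

# ... and paste glues them back together side by side, line by line.
pasted=$(paste -d ' ' "$tmp/col_a" "$tmp/col_b")
printf '%s\n' "$pasted"

rm -r "$tmp"
```

Note that join pairs lines by key value, while paste pairs them purely by line position.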

What say you?

Will



* Re: [TUHS] cut, paste, join, etc.
@ 2021-02-17  1:08 M Douglas McIlroy
  2021-02-17  1:16 ` Will Senn
  0 siblings, 1 reply; 18+ messages in thread
From: M Douglas McIlroy @ 2021-02-17  1:08 UTC
  To: tuhs

Will Senn wrote,
> join seems like part of an aborted (aka never fully realized) attempt at a text based rdb to me

As the original author of join, I can attest that there was no thought
of parlaying join into a database system. It was inspired by
databases, but liberated from them, much as grep was liberated from an
editor.

Doug

* Re: [TUHS] cut, paste, join, etc.
@ 2021-02-18 20:20 Brian Walden
  2021-02-18 20:41 ` Anthony Martin
  0 siblings, 1 reply; 18+ messages in thread
From: Brian Walden @ 2021-02-18 20:20 UTC
  To: tuhs

The last group I was on before I left the Labs in 1992 was the
POST team.

pq stood for "post query," but POST consisted of -
- mailx: (from SVR3.1) as the mail user agent
- UPAS: (from research UNIX) as the mail delivery agent
- pq: the program to query the database
- EV: (pronounced like the biblical name) the database (and the
  genesis program to create indices)
- post: program to combine all the above to read email and to send mail via queries

pq by default would look up people
  pq lastname:     find all people with lastname, same as pq last=lastname
  pq first.last:   find all people with first last, same as pq first=first/last=last
  pq first.m.last: find all people with first m last, same as pq first=first/middle=m/last=last

This is how email to dennis.m.ritchie @ att.com got sent on to research!dmr

You could send mail to a whole department via /org=45267, or the whole division
via /org=45, or a whole location via /loc=mh, or just the two people in a specific
office via /loc=mh/room=2f-164.
These are "AND"s; an "OR" is just another query after it on the same line.

There were some special extensions -
- prefix, e.g.  pq mackin* got all mackin, mackintosh, mackinson, etc.
- soundex, e.g. pq mackin~ got everyone whose last name sounds like mackin,
    so names such as mackin, mckinney, mckinnie, mickin, mikami, etc.
    (mackintosh and mackinson did not match the soundex, so they were not included)

The EV database was general and fairly simple. It was a directory with
files called "Data" and "Proto" in it.
"Data" was plain text, with pipe-delimited fields and newline-separated records -

 123456|ritchie|dennis|m||r320|research!dmr|11273|mh|2c-517|908|582|3770

   (using data preserved at https://www.bell-labs.com/usr/dmr/www/)

"Proto" defined the fields in a record (I don't remember the exact syntax anymore) -

 id       n i
 last     a i
 first    a i
 middle   a -
 suffix   a -
 soundex  a i
 email    a i
 org      n i
 loc      a i
 room     a i
 area     n i
 exch     n i
 ext      n i

"n" means a number, so 00001 was the same as 1, and "a" means alpha; the "i" or "-"
told genesis whether an index should be generated or not. I think it had more, but
that has faded with the years.
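
A minimal sketch of how the two files relate, using only standard tools rather than pq or genesis: the first column of Proto supplies the field names, and paste lines them up with the fields of a Data record. The Proto text is the approximate recollection given above, and the temporary directory is an assumption of the example.

```shell
tmp=$(mktemp -d)

# Proto as recalled above (the exact original syntax is not known).
cat > "$tmp/Proto" <<'EOF'
 id       n i
 last     a i
 first    a i
 middle   a -
 suffix   a -
 soundex  a i
 email    a i
 org      n i
 loc      a i
 room     a i
 area     n i
 exch     n i
 ext      n i
EOF

# The sample record from the message.
printf '%s\n' '123456|ritchie|dennis|m||r320|research!dmr|11273|mh|2c-517|908|582|3770' > "$tmp/Data"

# Field names are the first column of Proto.
awk '{ print $1 }' "$tmp/Proto" > "$tmp/names"

# Split the first record into one field value per line.
head -n 1 "$tmp/Data" | tr '|' '\n' > "$tmp/values"

# paste pairs name N with value N; an empty field (suffix) stays empty.
labeled=$(paste -d '=' "$tmp/names" "$tmp/values")
printf '%s\n' "$labeled"

rm -r "$tmp"
```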

If indices were generated, they would point to the block number in Data, so an lseek(2)
could get to the record quickly. I believe there were two levels of block-pointing indices
(sort of like how inode block pointers had direct and indirect blocks).
So every time you added records to Data you had to regenerate all the indices, which was
very time consuming.
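
The idea can be sketched with a toy index, which is not the genesis format. This version stores a byte offset per key in a single level, where the real indices pointed at block numbers in two levels, and it uses tail -c in place of lseek(2). The cut-down id|last record layout and the file name last.idx are invented for the example.

```shell
tmp=$(mktemp -d)

# Hypothetical cut-down Data file: id|last
printf '%s\n' '1|mackin' '2|mackintosh' '3|mckinney' > "$tmp/Data"

# Build the index on "last": key, then the byte offset where its record starts.
# Each record occupies its length plus one byte for the newline.
awk -F'|' 'BEGIN { off = 0 }
           { print $2 "|" off; off += length($0) + 1 }' "$tmp/Data" > "$tmp/last.idx"

# Lookup: find the offset for a key in the (small) index ...
off=$(awk -F'|' -v key=mckinney '$1 == key { print $2 }' "$tmp/last.idx")

# ... then jump straight to that byte in Data. tail -c +N is 1-based.
record=$(tail -c +"$((off + 1))" "$tmp/Data" | head -n 1)
printf '%s\n' "$record"

rm -r "$tmp"
```

A record appended to Data after the index is built has no entry in last.idx, which is why the index has to be rebuilt after additions.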

The nice thing about text Data was that grep(1) worked just fine, as did cut -d'|' or awk -F'|',
but pq was much faster with a large number of records.
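
For illustration, here is a sketch of those three tools run against the sample record. Field positions follow the Proto list above (2=last, 3=first, 7=email, 9=loc, 10=room); the temporary file is an assumption of the example.

```shell
tmp=$(mktemp -d)
printf '%s\n' '123456|ritchie|dennis|m||r320|research!dmr|11273|mh|2c-517|908|582|3770' > "$tmp/Data"

# grep: count records containing a field that is exactly "ritchie".
by_grep=$(grep -c '|ritchie|' "$tmp/Data")

# cut: project last, first and email, keeping the '|' delimiter.
by_cut=$(cut -d'|' -f2,3,7 "$tmp/Data")

# awk: select on loc and print first, last and room (an "AND" would add && tests).
by_awk=$(awk -F'|' '$9 == "mh" { print $3, $2, $10 }' "$tmp/Data")

printf '%s\n' "$by_grep" "$by_cut" "$by_awk"

rm -r "$tmp"
```

Each of these scans the whole file, which fits the remark that pq, with its indices, was faster on large data.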


-Brian

Dan Cross <crossd at gmail.com> wrote:
> It seems that Andrew has addressed Daytona, but there was a small database
> package called `pq` that shipped with plan9 at one point that I believe
> started life on Unix. It was based on "flat" text files as the underlying
> data source, and one would describe relations internally using some
> mechanism (almost certainly another special file). An interesting feature
> was that it was "implicitly relational": you specified the data you wanted
> and it constructed and executed a query internally: no need to "JOIN"
> tables on attributes and so forth. I believe it supported indices that were
> created via a special command. I think it was used as the data source for
> the AT&T internal "POST" system. A big downside was that you could not add
> records to the database in real time.
>
> It was taken to Cibernet Inc (they did billing reconciliation for wireless
> carriers. That is, you have an AT&T phone but make a call that's picked up
> by T-Mobile's tower: T-Mobile lets you make the call but AT&T has to pay
> them for the service. I contracted for them for a short time when I got out
> of the Marine Corps---the first time) and enhanced and renamed "Eteron" and
> the record append issue was, I believe, solved. Sadly, I think that
> technology was lost when Cibernet was acquired. It was kind of cool.
>
>         - Dan C.
>


end of thread, other threads:[~2021-02-22  6:07 UTC | newest]

Thread overview: 18+ messages
2021-02-16 20:33 [TUHS] cut, paste, join, etc Will Senn
2021-02-16 21:02 ` Dave Horsfall
2021-02-16 21:15   ` Will Senn
2021-02-16 21:26     ` Dave Horsfall
2021-02-16 21:06 ` Dennis Boone
2021-02-17  1:08 M Douglas McIlroy
2021-02-17  1:16 ` Will Senn
2021-02-17  1:43   ` Grant Taylor via TUHS
2021-02-17  2:26     ` Will Senn
2021-02-17  4:08       ` Grant Taylor via TUHS
2021-02-17 10:14         ` John Gilmore
2021-02-17 14:52           ` Andrew Hume
2021-02-17 23:58           ` Dan Cross
2021-02-17 20:49         ` Dave Horsfall
2021-02-22  5:57       ` Tomasz Rola
2021-02-17  3:29   ` John Cowan
2021-02-18 20:20 Brian Walden
2021-02-18 20:41 ` Anthony Martin
