From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=0.2 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FROM,HTML_MESSAGE,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE,RDNS_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.2 Received: (qmail 25179 invoked from network); 11 Mar 2020 17:42:45 -0000 Received-SPF: pass (minnie.tuhs.org: domain of minnie.tuhs.org designates 45.79.103.53 as permitted sender) receiver=inbox.vuxu.org; client-ip=45.79.103.53 envelope-from= Received: from unknown (HELO minnie.tuhs.org) (45.79.103.53) by inbox.vuxu.org with ESMTP; 11 Mar 2020 17:42:45 -0000 Received: by minnie.tuhs.org (Postfix, from userid 112) id 0A09B9BB82; Thu, 12 Mar 2020 03:42:42 +1000 (AEST) Received: from minnie.tuhs.org (localhost [127.0.0.1]) by minnie.tuhs.org (Postfix) with ESMTP id CBB979BB47; Thu, 12 Mar 2020 03:42:06 +1000 (AEST) Authentication-Results: minnie.tuhs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="ZQopviTD"; dkim-atps=neutral Received: by minnie.tuhs.org (Postfix, from userid 112) id 44CA49BB47; Thu, 12 Mar 2020 03:42:04 +1000 (AEST) Received: from mail-vk1-f173.google.com (mail-vk1-f173.google.com [209.85.221.173]) by minnie.tuhs.org (Postfix) with ESMTPS id B33579BB46 for ; Thu, 12 Mar 2020 03:42:02 +1000 (AEST) Received: by mail-vk1-f173.google.com with SMTP id b187so768659vkh.12 for ; Wed, 11 Mar 2020 10:42:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=kCDO6Wr2tjbxYZEXxVK07Bc1UB+am6CMzz/S5MVsLd8=; b=ZQopviTDKUOHNu1R+q5S+WKm1G0hQXdubWeOHUaMfI8WAviKjHJ9At6dwdGjYvoaoZ IuJPB9HZVcjcPcFEGUYzmIowDpXn9Ufk5duXCYBg91+2a0JYNLsXHUbzqMbedEdTiXpz MzLzde/MdO0nKlWsC2TTAJ8ZY+7TeWndkRWkXSptYwpR5BzBtWg2seWy7LSa8sdqUElb 
9q+n+fRMX2Cl10BFBX33MxEQ758gwULReeSsKoItfaDSIzqBdm8bQkvTawtNo3bNbiUP iA4a3QcCr2lNcNkjskQhrzqijr+nNdLlO1KaJauDnzbQp5PCZPkily7Sm1z/+t20mxRV c/mA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=kCDO6Wr2tjbxYZEXxVK07Bc1UB+am6CMzz/S5MVsLd8=; b=hhiqlpxbkBitSbyzs48JGa9ktcIje3GP3+vV6aIR98yKlMwpMELBIqXad5z8TcMqkz lYw0BfmW6VNaBDovn6I/D7F0Xf2IucL+p6w/b2gYMFmXxAhCGxMyE6QDrf2qE55k6TNP JXQbWOZs4SrARrRfJlWUtOxv9PNb++FeMIRvG6Ktl98XXa/FCeDgQb5dRmntD+czTsQq zkTLp5AOdWOrp6l73ecZUl+KxShV51jfFhi8PXMbi/Nkmrt8V75XK1moXOC8fpMYhrxF +tTBe4RRuVyo5yn+5Ds1aPMwsdNMGXVVIomKwKZdXwquq0yxSCUkiDuHkKNULTMpNFph LrGQ== X-Gm-Message-State: ANhLgQ1RzlY5SQXG77muEUbovf3K9IKvXtrb0wW4ZBIYOwzRTI4LAc22 TbmBLkwI41UliFFARwuGE9WNDH35ST8d87Bn9sc= X-Google-Smtp-Source: ADFU+vuR39MHSjBTCez0g1PRY5eCD1qLkdBT2POCYAOa0Sy0Q4XqH/7zHo3Ppvt61qjSSQG6vd+QVdZFdw3Ye1UkOZ4= X-Received: by 2002:a1f:9646:: with SMTP id y67mr2839356vkd.21.1583948521422; Wed, 11 Mar 2020 10:42:01 -0700 (PDT) MIME-Version: 1.0 References: <20200305020517.GA13872@wopr> <20200308052632.GD20478@eureka.lemis.com> <202003080532.0285WcWn1544496@darkstar.fourwinds.com> <20200309212257.GB39634@wopr> In-Reply-To: <20200309212257.GB39634@wopr> From: "John P. Linderman" Date: Wed, 11 Mar 2020 13:41:48 -0400 Message-ID: To: "John P. Linderman" , Tyler Adams , The Unix Heritage Society Content-Type: multipart/alternative; boundary="000000000000f6436605a097c014" Subject: Re: [TUHS] Command line options and complexity X-BeenThere: tuhs@minnie.tuhs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: The Unix Heritage Society mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: tuhs-bounces@minnie.tuhs.org Sender: "TUHS" --000000000000f6436605a097c014 Content-Type: text/plain; charset="UTF-8" This is *great*, Kurt. 
The source in src/runtime/hrs/src for rsort.c is their version of my external sort, modified to be a subroutine. There are some lessons to be learned about "software hygiene". I was cavalier about freeing what I allocated dynamically. As a result, their version leaks like a sieve if the subroutine is called repeatedly. Apropos of which, they came to me having noted that only the first call was acting as expected. There's a wonderful irony (I'm big on irony). I had replaced my do-it-yourself argument processing with getopt. The code has the following comment

    ** Use getopt() for portability.

A few lines later, you see

    optind = 1;  /* reset after use in Hancock program */
    while ((c = getopt(argc, argv, "cCiIjmrsSuvb:f:D:o:p:T:x:y:z:")) != EOF) {

optind??? Seems getopt has an undocumented global flag to prevent reprocessing the arguments. How portable:-)

Anyway, it should be possible to turn rsort.c back into standalone code. I'd be the obvious person to do it, but that would probably be a violation of some agreement with AT&T. However, if somebody else wants to take on the task (it would make a great summer intern project), I'd be happy to share ideas I have had since retiring that would improve the code.

fc.c in the same directory is a library-ized version of a fixcut command I wrote as a fixed-length counterpart to the cut command, for fixed-length inputs (like native floats and integers, which can be tweaked to sort lexicographically). Unlike rsort, I practiced good hygiene and kept track of all allocated space so it could be freed. Too bad they didn't include the man pages for rsort and fixcut. They'd make it easier to understand them. Jon Bentley observed that "comments are love letters to your future self", and I feel a lot of love from the heavily commented rsort code.

This probably should move to coff, it's not really about UNIX history (although rsort has vestigial traces of ancient days, like the code to write checkpoint files after each output temp is closed...
sorting a million bytes once took hours, with slow processors and disks. It was painful to have to start from scratch if an overnight sort got interrupted. Now sorting a billion bytes is pretty quick, and the checkpoint stuff never gets used. It's one of the things that could profitably disappear.)

On Mon, Mar 9, 2020 at 5:22 PM Kurt H Maier wrote:

> On Mon, Mar 09, 2020 at 05:06:20PM -0400, John P. Linderman wrote:
> > but the page is gone. It probably didn't help that Wired titled the
> > article
> >
> > *AT&T Invents Programming Language for Mass Surveillance*
> >
> > That's horse-pucky, akin to "Pitchfork makers invent device for spearing
> > babies". I'm trying to track down a copy that was released publicly. I'm
> > not hopeful.
>
> There is a copy here: https://github.com/mqudsi/hancock
>
> Not sure what other conclusion Wired was supposed to come to, given that
> the provided "Hello World" programs in the paper were all mass
> surveillance examples (tracking international calls to given numbers,
> tracking data streams to given IP addresses, and tracking specific
> connections to a given ISP).
>
> The license in the linked repository is different than the old
> password-gated NSL that was applied on the research.att.com pages. I
> wonder how many licenses this code was released with, over the years.
>
>
> khm
>

--000000000000f6436605a097c014
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
This is great, Kurt. The source in src/runtime/hrs/src for rsort.c is their version of my external sort, modified to be a subroutine. There are some lessons to be learned about "software hygiene". I was cavalier about freeing what I allocated dynamically. As a result, their version leaks like a sieve if the subroutine is called repeatedly. Apropos of which, they came to me having noted that only the first call was acting as expected. There's a wonderful irony (I'm big on irony). I had replaced my do-it-yourself argument processing with getopt. The code has the following comment

** Use getopt() for portability.

A few lines later, you see

    optind = 1;  /* reset after use in Hancock program */
    while ((c = getopt(argc, argv, "cCiIjmrsSuvb:f:D:o:p:T:x:y:z:")) != EOF) {

optind??? Seems getopt has an undocumented global flag to prevent reprocessing the arguments. How portable:-)
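For anyone who has not run into this, here is a minimal sketch of the effect, not taken from rsort.c. The function count_flags, its reset parameter, and the "ru" option string are all made up for illustration. For what it is worth, POSIX does specify optind as the index of the next argv element to scan, starting at 1, so a library routine that may be called more than once has to put it back before rescanning.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical subroutine (an illustration, not the rsort code): count how
 * many -r and -u options appear in argv.  getopt() remembers its position
 * in the global optind, so a caller that wants to scan again has to reset
 * it.  The reset parameter exists only to show both behaviours.
 */
static int count_flags(int argc, char *argv[], int reset)
{
    int c;
    int seen = 0;

    assert(argc >= 1 && argv != NULL);

    if (reset)
        optind = 1;    /* POSIX: optind is 1 before the first scan */

    while ((c = getopt(argc, argv, "ru")) != -1) {
        if (c == 'r' || c == 'u')
            seen++;
    }
    return seen;
}
```

Called twice on the same vector without the reset, the second call should find nothing, because optind is already past the options; that would match the "only the first call works" symptom. Setting optind = 1 is the traditional fix when the previous scan ran to completion. Some libraries want more: glibc documents optind = 0 for a full reinitialization, and the BSDs provide an optreset variable.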

Anyway, it should be possible to turn rsort.c back into standalone code. I'd be the obvious person to do it, but that would probably be a violation of some agreement with AT&T. However, if somebody else wants to take on the task (it would make a great summer intern project), I'd be happy to share ideas I have had since retiring that would improve the code.

fc.c in the same directory is a library-ized version of a fixcut command I wrote as a fixed-length counterpart to the cut command, for fixed-length inputs (like native floats and integers, which can be tweaked to sort lexicographically). Unlike rsort, I practiced good hygiene and kept track of all allocated space so it could be freed. Too bad they didn't include the man pages for rsort and fixcut. They'd make it easier to understand them. Jon Bentley observed that "comments are love letters to your future self", and I feel a lot of love from the heavily commented rsort code.

This probably should move to coff, it's not really about UNIX history (although rsort has vestigial traces of ancient days, like the code to write checkpoint files after each output temp is closed... sorting a million bytes once took hours, with slow processors and disks. It was painful to have to start from scratch if an overnight sort got interrupted. Now sorting a billion bytes is pretty quick, and the checkpoint stuff never gets used. It's one of the things that could profitably disappear.)

On Mon, Mar 9, 2020 at 5:22 PM Kurt H Maier <khm@sciops.net> wrote:
On Mon, Mar 09, 2020 at 05:06:20PM -0400, John P. Linderman wrote:
> but the page is gone. It probably didn't help that Wired titled the article
>
> *AT&T Invents Programming Language for Mass Surveillance*
>
> That's horse-pucky, akin to "Pitchfork makers invent device for spearing
> babies". I'm trying to track down a copy that was released publicly. I'm
> not hopeful.

There is a copy here: https://github.com/mqudsi/hancock

Not sure what other conclusion Wired was supposed to come to, given that the provided "Hello World" programs in the paper were all mass surveillance examples (tracking international calls to given numbers,
tracking data streams to given IP addresses, and tracking specific
connections to a given ISP).

The license in the linked repository is different than the old
password-gated NSL that was applied on the research.att.com pages. I
wonder how many licenses this code was released with, over the years.


khm
--000000000000f6436605a097c014--