I wrote a letter to the ANSI C (1989) committee. Please allow malloc(0). Please allow zero-length arrays. I got two letters back, saying that malloc(0) is illegal because zero-length arrays are illegal, and the other vice versa. I fumed. -rob On Sun, Sep 29, 2024 at 8:08 AM Douglas McIlroy < douglas.mcilroy@dartmouth.edu> wrote: > I have to concede Branden's "gotcha". Struct copying is definitely not > O(1). > > A real-life hazard of non-O(1) operations. Vic Vyssotsky put bzero (I > forget what Vic called it) into Fortran II. Sometime later, he found that a > percolation simulation was running annoyingly slowly. It took some study to > discover that the real inner loop of the program was not the percolation, > which touched only a small fraction of a 2D field. More time went into the > innocuous bzero initializer. The fix was to ADD code to the inner loop to > remember what entries had been touched, and initialize only those for the > next round of the simulation. > > > for a while, every instruction has [sic] an exactly predictable and > constant cycle > > count. ...[Then] all of a sudden you had instructions with O(n) cycle > counts. > > O(n) cycle counts were nothing new. In the 1950s we had the IBM 1620 with > arbitrary-length arithmetic and the 709 with "convert" instructions whose > cycle count went up to 256. > > >> spinelessly buckled to allow malloc(0) to return 0, as some > >> implementations gratuitously did. > > > What was the alternative? There was no such thing as an exception, and > > if a pointer was an int and an int was as wide as a machine address, > > there'd be no way to indicate failure in-band, either. > > What makes you think allocating zero space should fail? If the size n of a > set is determined at run time, why should one have to special-case its > space allocation when n=0? Subsequent processing of the form for(i=0; i i++) {...} will handle it gracefully with no special code. Malloc should do > as it did in v7--return a non-null pointer different from any other active > malloc pointer, as Bakul stated. If worse comes to worst[1] this can be > done by padding up to the next feasible size. Regardless of how the pointer > is created, any access via it would of course be out of bounds and hence > wrong. > > > How does malloc(0) get this job done and what benefit does it bring? > > If I understand the "job" (about initializing structure members) > correctly, malloc(0) has no bearing on it. The benefit lies elsewhere. > > Apropos of tail calls, Rob Pike had a nice name for an explicit tail call, > "become". It's certainly reasonable, though, to make compilers recognize > tail calls implicitly. > > [1] Worse didn't come to worst in the original malloc. It attached > metadata to each block, so even blocks of size zero consumed some memory. > > Doug > > On Sat, Sep 28, 2024 at 1:59 PM Bakul Shah via TUHS wrote: > >> Just responding to random things that I noticed: >> >> You don't need special syntax for tail-call. It should be done >> transparently when a call is the last thing that gets executed. Special >> syntax will merely allow confused people to use it in the wrong place and >> get confused more. >> >> malloc(0) should return a unique ptr. So that "T* a = malloc(0); T* b = >> malloc(0); a != (T*)0 && a != b". Without this, malloc(0) acts differently >> from malloc(n) for n > 0. >> >> Note that except for arrays, function arguments & result are copied so >> copying a struct makes perfect sense. Passing arrays by reference may have >> been due to residual Fortran influence! [Just guessing] Also note: that one >> exception has been the cause of many problems. >> >> In any case you have not argued convincingly about why dynamic memory >> allocation should be in the language (runtime) :-) And adding that wouldn't >> have fixed any of the existing problems with the language. >> >> Bakul >> >> > On Sep 28, 2024, at 9:58 AM, G. Branden Robinson < >> g.branden.robinson@gmail.com> wrote: >> > >> > At 2024-09-28T09:34:14-0400, Douglas McIlroy wrote: >> >>> C's refusal to specify dynamic memory allocation in the language >> >>> runtime (as opposed to, eventually, the standard library) >> >> >> >> This complaint overlooks one tenet of C: every operation in what you >> >> call "language runtime" takes O(1) time. Dynamic memory allocation >> >> is not such an operation. >> > >> > A fair point. Let me argue about it anyway. ;-) >> > >> > I'd make three observations. First, K&R did _not_ tout this in their >> > book presenting ANSI C. I went back and checked the prefaces, >> > introduction, and the section presenting a simple malloc()/free() >> > implementation. The tenet you claim for the language is not explicitly >> > articulated and, if I squint really hard, I can only barely perceive >> > (imagine?) it deeply between the lines in some of the prefatory material >> > to which K&R mostly confine their efforts to promote the language. In >> > my view, a "tenet" is something more overt: the sort of thing U.S. >> > politicians try to get hung on the walls of every public school >> > classroom, like Henry Spencer's Ten Commandments of C[1] (which itself >> > does not mention this "core language has only O(1) features" principle). >> > >> > Second, in reviewing K&R I was reminded that structure copying is part >> > of the language. ("There are no operations that manipulate an entire >> > array or string, although structures may be copied as a unit."[2]) >> > Doesn't that break the tenet right there? >> > >> > Third, and following on from the foregoing, your point reminded me of my >> > youth programming non-pipelined machines with no caches. You could set >> > your watch by (just about) any instruction in the set--and often did, >> > because we penurious microcomputer guys often lacked hardware real-time >> > clocks, too. That is to say, for a while, every instruction has an >> > exactly predictable and constant cycle count. (The _value_ of that >> > constant might depend on the addressing mode, because that would have >> > consequences on memory fetches, but the principle stood.) When the Z80 >> > extended the 8080's instruction set, they ate from Tree of Knowledge >> > with block-copy instructions like LDIR and LDDR, and all of a sudden you >> > had instructions with O(n) cycle counts. But as a rule, programmers >> > seemed to welcome this instead of recognizing it as knowing sin, because >> > you generally knew worst-case how many bytes you'd be copying and take >> > that into account. (The worst worst case was a mere 64kB!) >> > >> > Further, Z80 home computers in low-end configurations (that is, no disk >> > drives) often did a shocking thing: they ran with all interrupts masked. >> > All the time. The one non-maskable interrupt was RESET, after which you >> > weren't going to be resuming execution of your program anyway. Not from >> > the same instruction, at least. As I recall the TRS-80 Model I/III/4 >> > didn't even have logic on the motherboard to decode the Z80's "interrupt >> > mode 2", which was vectored, I think. Even in the "high-end" >> > configurations of these tiny machines, you got a whopping ONE interrupt >> > to play with ("IM 1"). >> > >> > Later, when the Hitachi 6309 smuggled similar block-transfer decadence >> > into its extensions to the Motorola 6809 (to the excitement of we >> > semi-consciously Unix-adjacent OS-9 users) they faced a starker problem, >> > because the 6809 didn't wall off interrupts in the same way the 8080 and >> > Z80. They therefore presented the programmer with the novelty of the >> > restartable instruction, and a new generation of programmers became >> > acquainted with the hard lessons time-sharing minicomputer people were >> > familiar with. >> > >> > My point in this digression is that, in my opinion, it's tough to hold >> > fast to the O(1) tenet you claim for C's core language and to another at >> > the same time: the language's identity as a "portable assembly >> > language". Unless every programmer has control over the compiler--and >> > they don't--you can't predict when the compiler will emit an O(n) block >> > transfer instruction. You'll just have to look at the disassembly. >> > >> > _Maybe_ you can retain purity by...never copying structs. I don't think >> > lint or any other tool ever checked for this. Your advocacy of this >> > tenet is the first time I've heard it presented. >> > >> > If you were to suggest to me that most of the time I've spent in my life >> > arguing with C advocates was with rotten exemplars of the species and >> > therefore was time wasted, I would concede the point. >> > >> > There's just so danged _many_ of them... >> > >> >> Your hobbyhorse awakened one of mine. >> >> >> >> malloc was in v7, before the C standard was written. The standard >> >> spinelessly buckled to allow malloc(0) to return 0, as some >> >> implementations gratuitously did. >> > >> > What was the alternative? There was no such thing as an exception, and >> > if a pointer was an int and an int was as wide as a machine address, >> > there'd be no way to indicate failure in-band, either. >> > >> > If the choice was that or another instance of atoi()'s wincingly awful >> > "does this 0 represent an error or successful conversion of a zero >> > input?" land mine, ANSI might have made the right choice. >> > >> >> I can't imagine that any program ever actually wanted the feature. Now >> >> it's one more undefined behavior that lurks in thousands of programs. >> > >> > Hoare admitted to only one billion-dollar mistake. No one dares count >> > how many to write in C's ledger. This was predicted, wasn't it? >> > Everyone loved C because it was fast: it was performant, because it >> > never met a runtime check it didn't eschew--recall again Kernighan >> > punking Pascal on this exact point--and it was quick for the programmer >> > to write because it never met a _compile_-time check it didn't eschew. >> > C was born as a language for wizards who never made mistakes. >> > >> > The problem is that, like James Madison's fictive government of angels, >> > such entities don't exist. The staff of the CSRC itself may have been >> > overwhelmingly populated with frank, modest, and self-deprecating >> > people--and I'll emphasize here that I'm aware of no accounts that this >> > is anything but true--but C unfortunately played a part in stoking a >> > culture of pretension among software developers. "C is a language in >> > which wizards program. I program in C. Therefore I'm a wizard." is how >> > the syllogism (spot the fallacy) went. I don't know who does more >> > damage--the people who believe their own BS, or the ones who know >> > they're scamming their colleagues. >> > >> >> There are two arguments for malloc(0), Most importantly, it caters for >> >> a limiting case for aggregates generated at runtime--an instance of >> >> Kernighan's Law, "Do nothing gracefully". It also provides a way to >> >> create a distinctive pointer to impart some meta-information, e.g. >> >> "TBD" or "end of subgroup", distinct from the null pointer, which >> >> merely denotes absence. >> > >> > I think I might be confused now. I've frequently seen arrays of structs >> > initialized from lists of literals ending in a series of "NULL" >> > structure members, in code that antedates or ignores C99's wonderful >> > feature of designated initializers for aggregate types.[3] How does >> > malloc(0) get this job done and what benefit does it bring? >> > >> > Last time I posted to TUHS I mentioned a proposal for explicit tail-call >> > elimination in C. I got the syntax wrong. The proposal was "return >> > goto;". The WG14 document number is N2920 and it's by Alex Gilding. >> > Good stuff.[4] I hope we see it in C2y. >> > >> > Predictably, I must confess that I didn't make much headway on >> > Schiller's 1975 "secure kernel" paper. Maybe next time. >> > >> > Regards, >> > Branden >> > >> > [1] https://web.cs.dal.ca/~jamie/UWO/C/the10fromHenryS.html >> > >> > I can easily imagine that the tenet held at _some_ point in the >> > C's history. It's _got_ to be the reason that the language >> > relegates memset() and memcpy() to the standard library (or to the >> > programmer's own devise)! :-O >> > >> > [2] Kernighan & Ritchie, _The C Programming Language_, 2nd edition, p. 2 >> > >> > Having thus admitted the camel's nose to the tent, K&R would have >> > done the world a major service by making memset(), or at least >> > bzero(), a language feature, the latter perhaps by having "= 0" >> > validly apply to an lvalue of non-primitive type. Okay, >> > _potentially_ a major service. You'd still need the self-regarding >> > wizard programmers to bother coding it, which they wouldn't in many >> > cases "because speed". Move fast, break stuff. >> > >> > C++ screwed this up too, and stubbornly stuck by it for a long time. >> > >> > https://cplusplus.github.io/CWG/issues/178.html >> > >> > [3] https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html >> > [4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2920.pdf >> >>