From mboxrd@z Thu Jan 1 00:00:00 1970
From: jnc@mercury.lcs.mit.edu (Noel Chiappa)
Date: Wed, 16 Jul 2014 17:46:32 -0400 (EDT)
Subject: [TUHS] the sin of buffering [offshoot of excise process from a pipeline]
Message-ID: <20140716214632.6F3DB18C0A2@mercury.lcs.mit.edu>

    > From: Doug McIlroy

    > Process A spawns process B, which reads stdin with buffering. B gets
    > all it deserves from stdin and exits. What's left in the buffer,
    > intended for A, is lost.

Ah. Got it.

The problem is not with buffering as a generic approach; the problem is that
you're trying to use a buffering package intended for simple, straightforward
situations in one which doesn't fall into that category! :-)

Clearly, either B has to i) be able to put back data which was not for it
('ungets' as a system call), or ii) not read the data that's not for it - but
that may be incompatible with the concept of buffering the input (depending on
the syntax, and thus the ability to predict the approach of the data B wants,
the only way to avoid the need for ungetc() might be to read a byte at a
time).

If B and its upstream (U) are written together, that could be another way to
deal with it: if U knows where B's syntactic boundaries are, it can give it
advance warning, and B could then use a non-trivial buffering package to do
the right thing. E.g. if U emits 'records' with a header giving the record
length X, B could tell its buffering package 'don't read ahead more than X
bytes until I tell you to go ahead with the next record'.

Of course, that's not a general solution; it only works with prepared U's.
Really, the only general, efficient way to deal with that situation that I can
see is to add 'ungets' to the operating system...

	Noel
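
To make option ii) and the record-length idea concrete, here is a minimal sketch in Python, not anything from the original thread. It assumes a hypothetical framing in which U writes a decimal length, a newline, and then that many bytes of payload. The function names (`read_line_unbuffered`, `read_exactly`, `read_record`, `demo`) are made up for illustration. B reads the header one byte per read(2) call, then reads exactly X bytes in bulk, so nothing meant for A is ever pulled out of the kernel.

```python
import os


def read_line_unbuffered(fd):
    """Read up to and including the next newline, one byte per read(2) call.

    Nothing past the newline is consumed, so whatever follows stays in the
    kernel for the next reader of the same descriptor.
    """
    line = bytearray()
    while True:
        byte = os.read(fd, 1)
        if not byte:  # EOF before a newline was seen
            break
        line += byte
        if byte == b"\n":
            break
    return bytes(line)


def read_exactly(fd, count):
    """Read exactly `count` bytes (fewer only at EOF), never more."""
    data = bytearray()
    while len(data) < count:
        chunk = os.read(fd, count - len(data))
        if not chunk:  # EOF
            break
        data += chunk
    return bytes(data)


def read_record(fd):
    """Read one record framed as b"<length>\\n<payload>".

    The framing is an assumption made for this sketch; it stands in for
    the 'header giving the record length X' that U would emit.
    """
    header = read_line_unbuffered(fd)
    if not header:
        return b""
    length = int(header.decode("ascii").strip())
    return read_exactly(fd, length)


def demo():
    """Play B and then A on one pipe; return (B's data, A's data)."""
    r, w = os.pipe()
    os.write(w, b"5\nhelloREST FOR A\n")  # U's output: one record, then A's data
    os.close(w)
    for_b = read_record(r)    # B takes only its own record
    for_a = os.read(r, 4096)  # everything else is still there for A
    os.close(r)
    return for_b, for_a
```

Calling `demo()` returns B's payload and the untouched remainder that A would then read. The cost is one system call per header byte, which is the inefficiency that motivates wanting 'ungets' in the operating system instead.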