From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Bug in gets function?
Date: Tue, 12 Feb 2019 11:30:27 -0500
Message-ID: <20190212163027.GK23599@brightrain.aerifal.cx>
References: <20190212034838.GH23599@brightrain.aerifal.cx> <20190212035106.GI23599@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Tue, Feb 12, 2019 at 02:55:19PM +0000, Ponnuvel Palaniyappan wrote:
> > Is gets(s) equivalent to scanf("%[^\n]%*1[\n]",s)?
>
> I think it has at least one minor issue: it doesn't null-terminate the
> buffer on empty input i.e., just a newline as input.

Indeed, I omitted what the logic for handling the return value of scanf
would be. But it also seems more complicated than we might like.
If input begins with a newline, it would also fail to consume the
newline without an additional call, and the additional call would make
the operation as a whole non-atomic with respect to the FILE lock,
which is what I was trying to avoid.

Here's an alternate proposal via direct implementation:

char *gets(char *s)
{
	size_t i=0;
	int c;
	FLOCK(stdin);
	while ((c=getc_unlocked(stdin)) != EOF && c != '\n') s[i++] = c;
	s[i] = 0;
	if (c != '\n' && !feof(stdin)) s = 0;
	FUNLOCK(stdin);
	return s;
}

Does this look ok? Of course it's slow compared to a fgets-like
operation on the buffer, but gets is not a usable interface and I
don't see any reason to care whether it's fast.

Rich


> On Tue, Feb 12, 2019 at 2:42 PM James Larrowe wrote:
>
> > I could probably try patching it. That C99 specification seems descriptive
> > enough.
> >
> > On Mon, Feb 11, 2019 at 10:51 PM Rich Felker wrote:
> >
> >> On Mon, Feb 11, 2019 at 10:48:38PM -0500, Rich Felker wrote:
> >> > On Mon, Feb 11, 2019 at 06:55:24PM -0800, Keyhan Vakil wrote:
> >> > > Hi. It seems that the gets function does not follow the C99 spec. In
> >> > > particular, if the input contains a null byte in the middle of the
> >> > > input, then the new-line character is not discarded.
> >> > >
> >> > > For reference, here's the relevant part in the C99 standard
> >> > > (7.19.7.7):
> >> > >
> >> > > > The gets function reads characters from the input stream pointed to
> >> > > > by stdin, into the array pointed to by s, until end-of-file is
> >> > > > encountered or a new-line character is read. Any new-line character
> >> > > > is discarded, and a null character is written immediately after the
> >> > > > last character read into the array.
> >> > >
> >> > > Here is an example:
> >> > >
> >> > > #include <stdio.h>
> >> > > char s[8];
> >> > > int main() {
> >> > >   gets(s);
> >> > >   for (int i = 0; i < sizeof s; i++) {
> >> > >     printf("%02x ", s[i]);
> >> > >   }
> >> > >   printf("\n");
> >> > >   return 0;
> >> > > }
> >> > >
> >> > > When compiled against gcc:
> >> > >
> >> > > $ echo -e 'A\x00B' | ./a.out
> >> > > 41 00 42 00 00 00 00 00
> >> > >
> >> > > When compiled against musl:
> >> > >
> >> > > $ echo -e 'A\x00B' | ./a.out
> >> > > 41 00 42 0a 00 00 00 00
> >> > >
> >> > > Note the terminating newline, which contradicts the spec.
> >> >
> >> > I think this bug report is correct; however the gets function is
> >> > awful, removed in C11, and should never be used. :-)
> >> >
> >> > I will see what can be done to fix it though.
> >>
> >> Is gets(s) equivalent to scanf("%[^\n]%*1[\n]",s)? If so that would be
> >> an appropriately hideous way to implement it that avoids the current
> >> bug? :-)
> >>
> >> Rich
> >>
> >
>
> --
> Regards,
> Ponnuvel P
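
For anyone who wants to try the direct-implementation proposal from this
message outside of libc, here is a minimal sketch. It is not the musl
source. The name gets_sketch and the extra FILE * parameter are
hypothetical, added only so the logic can be pointed at a stream other
than stdin, and the POSIX flockfile/funlockfile calls stand in for the
musl-internal FLOCK/FUNLOCK macros. The loop and the return-value check
are copied from the proposal as posted.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the proposed gets(): same logic, but it
 * reads from the stream f instead of stdin, and it uses the public
 * POSIX locking calls rather than musl's internal FLOCK/FUNLOCK.
 * Like gets(), it performs no bounds checking on s. */
char *gets_sketch(char *s, FILE *f)
{
    size_t i = 0;
    int c;

    assert(s != NULL && f != NULL);

    /* Hold the stream lock across the whole read so that the
     * operation is atomic with respect to other users of f. */
    flockfile(f);
    while ((c = getc_unlocked(f)) != EOF && c != '\n')
        s[i++] = c;             /* null bytes are stored like any other byte */
    s[i] = 0;                   /* the newline, if any, was consumed but not stored */
    if (c != '\n' && !feof(f))  /* stopped on a read error, not on end-of-file */
        s = NULL;
    funlockfile(f);
    return s;
}
```

Fed the bytes 'A', 0, 'B', newline from the original report, this leaves
41 00 42 00 in the buffer and consumes the newline. Note that, as
posted, the logic returns s rather than a null pointer when end-of-file
is hit before any character has been read.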