From: James Larrowe
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Bug in gets function?
Date: Tue, 12 Feb 2019 09:41:40 -0500
References: <20190212034838.GH23599@brightrain.aerifal.cx> <20190212035106.GI23599@brightrain.aerifal.cx>
In-Reply-To: <20190212035106.GI23599@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

I could probably try patching it. That C99 specification seems descriptive
enough.

On Mon, Feb 11, 2019 at 10:51 PM Rich Felker <dalias@libc.org> wrote:

> On Mon, Feb 11, 2019 at 10:48:38PM -0500, Rich Felker wrote:
> > On Mon, Feb 11, 2019 at 06:55:24PM -0800, Keyhan Vakil wrote:
> > > Hi. It seems that the gets function does not follow the C99 spec. In
> > > particular, if the input contains a null byte in the middle of the
> > > input, then the new-line character is not discarded.
> > >
> > > For reference, here's the relevant part in the C99 standard
> > > (7.19.7.7):
> > >
> > > > The gets function reads characters from the input stream pointed to
> > > > by stdin, into the array pointed to by s, until end-of-file is
> > > > encountered or a new-line character is read. Any new-line character
> > > > is discarded, and a null character is written immediately after the
> > > > last character read into the array.
> > >
> > > Here is an example:
> > >
> > >     #include <stdio.h>
> > >     char s[8];
> > >     int main() {
> > >         gets(s);
> > >         for (int i = 0; i < sizeof s; i++) {
> > >             printf("%02x ", s[i]);
> > >         }
> > >         printf("\n");
> > >         return 0;
> > >     }
> > >
> > > When compiled against gcc:
> > >
> > >     $ echo -e 'A\x00B' | ./a.out
> > >     41 00 42 00 00 00 00 00
> > >
> > > When compiled against musl:
> > >
> > >     $ echo -e 'A\x00B' | ./a.out
> > >     41 00 42 0a 00 00 00 00
> > >
> > > Note the terminating newline, which contradicts the spec.
> >
> > I think this bug report is correct; however the gets function is
> > awful, removed in C11, and should never be used. :-)
> >
> > I will see what can be done to fix it though.
>
> Is gets(s) equivalent to scanf("%[^\n]%*1[\n]",s)? If so that would be
> an appropriately hideous way to implement it that avoids the current
> bug? :-)
>
> Rich
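
For illustration, the wording quoted from 7.19.7.7 can be followed almost literally with a getc loop. The sketch below is only that, a sketch: gets_sketch is a made-up name, it is not musl's code or any proposed patch, and it takes a FILE * (gets itself would use stdin) purely so it can be tried on a test file. Like gets, it places no bound on s.

```c
#include <stdio.h>

/* Hypothetical helper, not musl's implementation: reads one line from f
 * into s the way C99 7.19.7.7 describes gets (which always uses stdin).
 * Like gets, it cannot know the size of s, so it is illustration only. */
char *gets_sketch(FILE *f, char *s)
{
    size_t n = 0;
    int c;

    /* Store every byte, embedded null bytes included, until newline or EOF. */
    while ((c = getc(f)) != EOF && c != '\n')
        s[n++] = (char)c;

    /* EOF with nothing stored, or a read error: report failure. */
    if (c == EOF && (n == 0 || ferror(f)))
        return NULL;

    /* The newline was consumed but never stored; terminate right after
     * the last byte that was actually read into the array. */
    s[n] = '\0';
    return s;
}
```

Because the loop counts bytes itself instead of looking for the end of the string afterwards, a null byte in the middle of the line does not change where the terminator goes, so input such as 'A\x00B' followed by a newline would leave 41 00 42 00 in the buffer.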
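
On the scanf question, one difference that seems worth noting is the empty line: when the first byte is a newline, %[^\n] is a matching failure, so scanf returns 0, leaves s untouched, and does not consume the newline, whereas gets would store an empty string and discard it. The sketch below wraps the format string from the message and handles that case by hand. gets_via_scanf is a hypothetical name, fscanf on a FILE * stands in for scanf on stdin so it can be tested, and whether a given libc's %[ copes with embedded null bytes is something that would still need checking.

```c
#include <stdio.h>

/* Hypothetical wrapper around the scanf form from the message; f would be
 * stdin for a real gets. Unbounded like gets, so illustration only. */
char *gets_via_scanf(FILE *f, char *s)
{
    int r = fscanf(f, "%[^\n]%*1[\n]", s);

    if (r == 1)
        return s;   /* non-empty line stored; one newline skipped if present */

    if (r == 0) {
        /* Matching failure: the next byte is '\n', i.e. an empty line.
         * fscanf stopped in front of it, so consume it manually and
         * store the empty string that gets would have produced. */
        getc(f);
        s[0] = '\0';
        return s;
    }

    return NULL;    /* EOF or read error before anything was converted */
}
```

With the extra branch the wrapper returns an empty string for an empty line instead of getting stuck in front of the newline, which is presumably what a scanf-based gets would need.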