From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from MTA-12-4.privateemail.com ([198.54.127.107]) by ewsd; Fri Nov 20 02:11:20 -0500 2020
Received: from mta-12.privateemail.com (localhost [127.0.0.1]) by mta-12.privateemail.com (Postfix) with ESMTP id 159CF8005A for <9front@9front.org>; Fri, 20 Nov 2020 02:11:08 -0500 (EST)
Received: from localhost (unknown [10.20.151.202]) by mta-12.privateemail.com (Postfix) with ESMTPA id 052D380058 for <9front@9front.org>; Fri, 20 Nov 2020 07:11:06 +0000 (UTC)
Date: Thu, 19 Nov 2020 23:11:00 -0800
From: Anthony Martin
To: 9front@9front.org
Subject: Re: [9front] page epub support
Message-ID:
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To:
X-Virus-Scanned: ClamAV using ClamSMTP
Content-Transfer-Encoding: quoted-printable
List-ID: <9front.9front.org>
List-Help:
X-Glyph: ➈
X-Bullshit: stateless grid configuration framework

# HG changeset patch
# User Anthony Martin
# Date 1605855926 28800
#      Thu Nov 19 23:05:26 2020 -0800
# Node ID 3d602d959da7944e6844100b536f5442e0970370
# Parent  27e5e30d3c30b64a9fb6455d81f10083a526972f
awk: fix truncated input after fflush

Before the "native" awk work, a call to the fflush function
resulted in one or more calls to the APE fflush(2).

Calling fflush on a stream open for reading has different
behavior based on the environment: within APE, it's a no-op¹;
on OpenBSD, it's an error²; in musl, it depends on whether
or not the underlying file descriptor is seekable³; etc.
I'm sure glibc is subtly different.

Now that awk uses libbio, things are different: calling
Bflush(2) on a file open for reading simply discards any
data in the buffer. This explains why we're seeing truncated
input. When awk attempts to read in the next record, there's
nothing in the buffer and no more data to read so it gets
EOF and exits normally.

Note that this behavior is not documented in bio(2).
It was added in the second edition but I haven't figured out
why or what depends on it.

The simple fix is to have awk only call Bflush on files that
were opened for writing. You could argue that this is the only
correct behavior according to the awk(1) manual and it is, in
fact, how GNU awk behaves⁴.

1. /sys/src/ape/lib/ap/stdio/fflush.c
2. https://cvsweb.openbsd.org/src/lib/libc/stdio/fflush.c?rev=1.9
3. https://git.musl-libc.org/cgit/musl/tree/src/stdio/fflush.c
4. https://git.savannah.gnu.org/cgit/gawk.git/tree/io.c#n1492

diff --git a/sys/src/cmd/awk/run.c b/sys/src/cmd/awk/run.c
--- a/sys/src/cmd/awk/run.c
+++ b/sys/src/cmd/awk/run.c
@@ -1707,6 +1707,8 @@
 	files[2].fp = &stderr;
 }
 
+#define writing(m) ((m) != LT && (m) != LE)
+
 Biobuf *openfile(int a, char *us)
 {
 	char *s = us;
@@ -1719,8 +1721,11 @@
 		if (files[i].fname && strcmp(s, files[i].fname) == 0) {
 			if (a == files[i].mode || (a==APPEND && files[i].mode==GT))
 				return files[i].fp;
-			if (a == FFLUSH)
+			if (a == FFLUSH) {
+				if(!writing(files[i].mode))
+					return nil;
 				return files[i].fp;
+			}
 		}
 	if (a == FFLUSH)	/* didn't find it, so don't create it! */
 		return nil;
@@ -1815,7 +1820,7 @@
 	int i;
 
 	for (i = 0; i < FOPEN_MAX; i++)
-		if (files[i].fp)
+		if (files[i].fp && writing(files[i].mode))
 			Bflush(files[i].fp);
 }
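
For anyone who wants to see the failure mode without a Plan 9 box, here is a rough portable C model of what the commit message describes. It is not libbio and not the awk source: Toybuf, toyfill, toyrdline, toyflush, toywriting and records_seen are all made-up names, and the "file" is just a string. The only thing it is meant to show is that a flush which drops buffered input loses every record that was already read ahead, and that guarding the flush on the open mode avoids that.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical modes, standing in for awk's LT/LE/GT/... */
enum { TOY_READ, TOY_WRITE };

/* Toy buffered stream; an illustration only, not a Biobuf. */
typedef struct {
	int mode;
	const char *src;	/* data the "file" has not handed out yet */
	char buf[128];
	size_t n;		/* valid bytes in buf */
	size_t pos;		/* next byte to return */
} Toybuf;

/* Refill the buffer from the source; return 0 when the source is empty. */
static int toyfill(Toybuf *b)
{
	size_t len = strlen(b->src);

	if (len == 0)
		return 0;
	if (len > sizeof b->buf)
		len = sizeof b->buf;
	memcpy(b->buf, b->src, len);
	b->src += len;
	b->n = len;
	b->pos = 0;
	return 1;
}

/* Consume one newline-terminated record; return 0 at EOF. */
static int toyrdline(Toybuf *b)
{
	int got = 0;

	for (;;) {
		if (b->pos == b->n && !toyfill(b))
			return got;
		got = 1;
		if (b->buf[b->pos++] == '\n')
			return 1;
	}
}

/* A flush that discards buffered input, as described for read-mode files. */
static void toyflush(Toybuf *b)
{
	b->n = b->pos = 0;
}

#define toywriting(b) ((b)->mode == TOY_WRITE)

/*
 * Read all records from data, flushing once after the first record.
 * With guarded != 0 the flush is skipped for streams not open for writing.
 */
int records_seen(const char *data, int guarded)
{
	Toybuf b;
	int n = 0;

	assert(data != NULL);
	b.mode = TOY_READ;
	b.src = data;
	b.n = b.pos = 0;
	while (toyrdline(&b)) {
		n++;
		if (n == 1 && (!guarded || toywriting(&b)))
			toyflush(&b);
	}
	return n;
}
```

Calling records_seen("a\nb\nc\n", 0) models the unguarded flush: the whole input was pulled into the buffer by the first read, so the flush throws away the remaining two records. Passing 1 as the second argument models the guarded version, where the read-mode stream is left alone.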