From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/13488
Path: news.gmane.org!.POSTED!not-for-mail
From: Xan Phung <xan.phung@gmail.com>
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: stdio glitch & questions
Date: Sat, 1 Dec 2018 09:15:56 +1100
Message-ID: <CAO6moYvBB0Lb+g23hb0hc=WXsXoACtnSBjvQ7fSep-0Zky=W_g@mail.gmail.com>
References: <CAO6moYvRndzizm+7Q3TbyB_TYr6PatkkS9mw=OMszti5=iBB1w@mail.gmail.com>
 <20181130160951.GS23599@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000d99d25057be926be"
X-Trace: blaine.gmane.org 1543616081 14653 195.159.176.226 (30 Nov 2018 22:14:41 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Fri, 30 Nov 2018 22:14:41 +0000 (UTC)
To: musl@lists.openwall.com
Original-X-From: musl-return-13504-gllmg-musl=m.gmane.org@lists.openwall.com Fri Nov 30 23:14:37 2018
Return-path: <musl-return-13504-gllmg-musl=m.gmane.org@lists.openwall.com>
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by blaine.gmane.org with smtp (Exim 4.84_2)
	(envelope-from <musl-return-13504-gllmg-musl=m.gmane.org@lists.openwall.com>)
	id 1gSr3U-0003hV-QV
	for gllmg-musl@m.gmane.org; Fri, 30 Nov 2018 23:14:36 +0100
Original-Received: (qmail 30615 invoked by uid 550); 30 Nov 2018 22:16:45 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Original-Received: (qmail 30597 invoked from network); 30 Nov 2018 22:16:45 -0000
In-Reply-To: <20181130160951.GS23599@brightrain.aerifal.cx>
Xref: news.gmane.org gmane.linux.lib.musl.general:13488
Archived-At: <http://permalink.gmane.org/gmane.linux.lib.musl.general/13488>

--000000000000d99d25057be926be
Content-Type: text/plain; charset="UTF-8"

Thanks for the quick answer. I've taken a look at the pre-2011 fwrite.c
code, and using SYS_writev is indeed much cleaner!

See below for my proposed answer to your question about what the "cutoff"
for copying should be.  (SYS_writev is fully retained, but in this proposal
the 2nd iovec element will very often be zero length, which makes
emulation of SYS_writev much more efficient.)
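
To illustrate the point about emulation, here is a minimal sketch, assuming
a platform that has to emulate a two-element writev() on top of write().
The name emulated_writev2 and the 512-byte bounce buffer are hypothetical
and not taken from musl; the sketch only shows that a zero-length element
lets the emulation get away with a single write() and no copying.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical emulation of a two-element writev() using write() only. */
ssize_t emulated_writev2(int fd, const struct iovec iov[2])
{
	unsigned char tmp[512];
	size_t a = iov[0].iov_len, b = iov[1].iov_len;

	/* Fast paths: one element is empty, so a single write() of the
	 * other element is equivalent and nothing needs to be copied. */
	if (!b) return write(fd, iov[0].iov_base, a);
	if (!a) return write(fd, iov[1].iov_base, b);

	/* Both elements non-empty: gather them into a bounce buffer when
	 * they fit, so the data still goes out in one write() call. */
	if (a <= sizeof tmp && b <= sizeof tmp - a) {
		memcpy(tmp, iov[0].iov_base, a);
		memcpy(tmp + a, iov[1].iov_base, b);
		return write(fd, tmp, a + b);
	}

	/* Too large to gather: write the first element only and report a
	 * short count, which writev() callers have to handle anyway. */
	return write(fd, iov[0].iov_base, a);
}
```

With the proposed cutoff, short lines would take the first fast path,
because their bytes have already been copied into the stdio buffer.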

On Sat, 1 Dec 2018 at 03:10, Rich Felker <dalias@libc.org> wrote:

>
> It would probably be welcome to make __stdio_write make use of
> SYS_write when it would be expected to be faster (len very small), but
> I'm not sure what the exact cutoff should be.
>

My proposal is a cutoff of 5-8 bytes on 32-bit CPUs and 9-16 bytes on
64-bit CPUs.
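
As a rough sketch of where those ranges come from: the cutoff is the
distance from s to the last word-aligned address at or below
s + 2*sizeof(size_t). The helper name cutoff_for is hypothetical, and
rounding down is my assumed meaning of ALIGN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distance in bytes from s to the word-aligned cutoff address.
 * Assumes ALIGN means "round down to a multiple of sizeof(size_t)". */
size_t cutoff_for(const unsigned char *s)
{
	size_t w = sizeof(size_t);
	uintptr_t a = (uintptr_t)s + 2*w;

	a &= ~(uintptr_t)(w - 1);          /* round down to a word boundary */
	size_t c = (size_t)(a - (uintptr_t)s);

	/* Always more than one word and at most two words:
	 * 5..8 when w == 4, 9..16 when w == 8. */
	assert(c > w && c <= 2*w);
	return c;
}
```

For a word-aligned s this gives the maximum (8 or 16); for s one byte
before a word boundary it gives the minimum (5 or 9).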

The cutoffs are chosen so that the "no copy" loop (searching for '\n')
always ends on a word-aligned position, which opens the door to a future
optimisation: searching for '\n' a word at a time instead of a byte at a
time.  The "copy" branch is also guaranteed to span at most a double word,
with the cutoff never below a single word, so a two-word memcpy can be
done with just a 2x load/mask/store word sequence.  The example code in
this mail gives a general idea of word-aligning the cutoff amount (it does
not yet do word-at-a-time searching for '\n' or an optimised two-word
memcpy).
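
For reference, a word-at-a-time search could look roughly like the sketch
below. It is not part of the proposal: the names word_has_newline and
last_newline_end are hypothetical, and it uses the well-known "has zero
byte" bit trick, applied after XORing each byte with '\n'.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define ONES  ((size_t)-1 / UCHAR_MAX)        /* 0x0101...01 */
#define HIGHS (ONES * (UCHAR_MAX/2 + 1))      /* 0x8080...80 */

/* Returns 1 if any byte of w equals '\n', else 0. */
static int word_has_newline(size_t w)
{
	size_t x = w ^ (ONES * '\n');   /* bytes equal to '\n' become 0 */
	return ((x - ONES) & ~x & HIGHS) != 0;
}

/* Returns the largest i <= n with t[i-1] == '\n', or 0 if there is none.
 * t is meant to be the word-aligned cutoff pointer; memcpy is used for
 * the loads, so the code stays correct even if t is not aligned. */
size_t last_newline_end(const unsigned char *t, size_t n)
{
	size_t w;

	/* Byte-at-a-time until a whole number of words remains. */
	for (; n % sizeof w; n--)
		if (t[n-1] == '\n') return n;

	/* Word-at-a-time, moving backward from the end. */
	for (; n; n -= sizeof w) {
		memcpy(&w, t + n - sizeof w, sizeof w);
		if (word_has_newline(w)) break;
	}

	/* Pin down the exact byte inside the matching word (n is 0 if
	 * no word matched, so this loop is skipped in that case). */
	for (; n && t[n-1] != '\n'; n--);
	return n;
}
```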

*CURRENT __fwritex.c CODE (lines 12-20)*:


	if (f->lbf >= 0) {
		/* Match /^(.*\n|)/ */
		for (i=l; i && s[i-1] != '\n'; i--);
		if (i) {
			size_t n = f->write(f, s, i);
			if (n < i) return n;
			s += i;
			l -= i;
		}
	}
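
To make the effect of that loop concrete, here is the same search pulled
out into a standalone function (the name flush_prefix_len is hypothetical,
used only for illustration):

```c
#include <stddef.h>

/* Length of the longest prefix of s[0..l) that ends in '\n', or 0 if the
 * data contains no newline. This is the /^(.*\n|)/ match. */
size_t flush_prefix_len(const unsigned char *s, size_t l)
{
	size_t i;
	for (i = l; i && s[i-1] != '\n'; i--);
	return i;
}
```

For the 5 bytes "ab\ncd" it returns 3, so "ab\n" is handed to f->write
and "cd" stays in the buffer; for "abc" it returns 0 and nothing is
flushed.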


*PROPOSED*:

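	ptrdiff_t i;
	size_t j, len;
	if (f->lbf >= 0) {
		/* t is the word-aligned cutoff; ALIGN (not defined here) is
		 * taken to round down to a multiple of sizeof(size_t) */
		const unsigned char *t = ALIGN(s+sizeof(size_t)*2);
		for (i = s+l-t; ; i--) {
			if (i <= 0) {
				/* SHORT LINE - copy up to 16 bytes into f->wpos buffer and then flush line */
				/* i < 0 means the data ends before t, so search only l bytes */
				for (j = i<0 ? l : t-s; j && s[j-1] != '\n'; j--);
				if (j) {
					memcpy(f->wpos, s, j);
					f->wpos += j;
					f->write(f, t, 0);
					/* a zero-length write returns 0 on success too, so
					 * test f->wpos (cleared on error) as fflush does */
					if (!f->wpos) return 0;
					s += j;
					l -= j;
				}
				break;
			}
			if (t[i-1] == '\n') {
				size_t n = f->write(f, s, len = i+t-s);
				if (n < len) return n;
				s += len;
				l -= len;
				break;
			}
		}
	}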
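
The decision made by the proposed loop can also be modelled without any
FILE internals. This is only a sketch for reasoning about the branches:
plan_write and the enum are hypothetical names, and the cutoff again
assumes that ALIGN rounds down to a word boundary.

```c
#include <stddef.h>
#include <stdint.h>

enum plan { BUFFER_ONLY, COPY_THEN_FLUSH, WRITE_DIRECT };

/* Decide how a line-buffered write of s[0..l) would be handled, and store
 * in *n how many leading bytes would be flushed. */
enum plan plan_write(const unsigned char *s, size_t l, size_t *n)
{
	size_t w = sizeof(size_t);
	/* Bytes from s to the word-aligned cutoff (w+1 .. 2*w). */
	size_t cut = (size_t)((((uintptr_t)s + 2*w) & ~(uintptr_t)(w - 1))
	                      - (uintptr_t)s);
	size_t i;

	if (cut > l) cut = l;

	/* "No copy" search: backward from the end, stopping at the cutoff. */
	for (i = l; i > cut && s[i-1] != '\n'; i--);
	if (i > cut) {
		*n = i;
		return WRITE_DIRECT;
	}

	/* No newline beyond the cutoff: search the short prefix instead. */
	for (; i && s[i-1] != '\n'; i--);
	*n = i;
	return i ? COPY_THEN_FLUSH : BUFFER_ONLY;
}
```

A 3-byte "hi\n" ends up in the copy branch, while a line whose last
newline lies beyond the cutoff is written directly from the caller's
buffer.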

--000000000000d99d25057be926be--