From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/13490
Path: news.gmane.org!.POSTED!not-for-mail
From: Xan Phung <xan.phung@gmail.com>
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: stdio glitch & questions
Date: Sat, 1 Dec 2018 13:42:30 +1100
Message-ID: <CAO6moYsYGOhGTzdrqRLj=J6cxidjmid2YaSnAOP6Gz5zPK24fw@mail.gmail.com>
References: <CAO6moYvRndzizm+7Q3TbyB_TYr6PatkkS9mw=OMszti5=iBB1w@mail.gmail.com>
 <20181130160951.GS23599@brightrain.aerifal.cx> <CAO6moYvBB0Lb+g23hb0hc=WXsXoACtnSBjvQ7fSep-0Zky=W_g@mail.gmail.com>
 <20181201000229.GT23599@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="0000000000003203c4057bece0d6"
X-Trace: blaine.gmane.org 1543632078 9475 195.159.176.226 (1 Dec 2018 02:41:18 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Sat, 1 Dec 2018 02:41:18 +0000 (UTC)
To: musl@lists.openwall.com
Original-X-From: musl-return-13506-gllmg-musl=m.gmane.org@lists.openwall.com Sat Dec 01 03:41:13 2018
Return-path: <musl-return-13506-gllmg-musl=m.gmane.org@lists.openwall.com>
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by blaine.gmane.org with smtp (Exim 4.84_2)
	(envelope-from <musl-return-13506-gllmg-musl=m.gmane.org@lists.openwall.com>)
	id 1gSvDT-0002M0-Hc
	for gllmg-musl@m.gmane.org; Sat, 01 Dec 2018 03:41:11 +0100
Original-Received: (qmail 25721 invoked by uid 550); 1 Dec 2018 02:43:20 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Original-Received: (qmail 25700 invoked from network); 1 Dec 2018 02:43:19 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20161025;
        h=mime-version:references:in-reply-to:from:date:message-id:subject:to;
        bh=5LXQxbZPQcKlZWd9Q50XUXUiV9BrwRlFmTHz/QGd+To=;
        b=o9xZEKfXKWS3ZJlRcHt9EUCSHjymoeMPl9oDeWuDHvtYdb1vjC8vxBOuo+tv+p1y90
         gnFW+srzMcmPU90kqqkZgQtI6UoAmnSEQw0m0qAI7wiTnUWGOAry1PewXiJDDkIvjJMd
         6sZrijT9ftppp9WsyFIry3sX9iHvT2UlolvyWSpnV23o8RLskGEhzyPzrBxFV9ck+bDR
         oy+5G6SFKpiAXjOhB2AinQFASs1sMUdeKXsfsfZJsgM1UqVOAO4dI5z9irUKMWvVSrUE
         qZFvOVdoaMcxAQmyYKQgaQVuYUlBmpvbZcYx88G0l8AG7TjmBPF4A/4picz7pRUw9744
         x3IA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:mime-version:references:in-reply-to:from:date
         :message-id:subject:to;
        bh=5LXQxbZPQcKlZWd9Q50XUXUiV9BrwRlFmTHz/QGd+To=;
        b=JbFAM4jcBxP91PR6RRPaPNbEYIEbCG2FA9vU0Wg3lXC3AvIRoD6xAPcq509xAK1uob
         5Yx7/Fw6h9BaCN6tLnl9GgM3ogdCxPH+Kw2yb/ydjhLUGuGY3yYB38xzR9nnvSrK33fz
         ZJAanH0Dopxjf4b26j203nQP6njAVEanH2Cldt7No4qWws53Wuv1DNLD89pCL3l3PevA
         jo942rhA/T1I2pXYMxkUGFhfIgQ6HdqaeRkxe17sGMl/3aihZMAx/AnzzL6eZS9u0+Em
         c9RSrmqZVdZsk1aGZjOrbfDqW9FhzCE5YLraeZVBwp7GsLerljfKIbfjBvKYuOwnxvzM
         YYfg==
X-Gm-Message-State: AA+aEWZHWGoFW6L/L4LeFXqSvCP/xaPUtEIcwCwWo+dTvNoV6u3Rhl3H
	txkkOeIOG5smGSebAyGehY9vCeX+JIkcUebjuxgDXZ9k
X-Google-Smtp-Source: AFSGD/U/yEN4S4nfoWkGfvYU2XD128purLscEvDA5tXFFMGa//Tlcb6+9fMKucxpMEadYEqPei8sdNpRqY/yghU+Yoo=
X-Received: by 2002:a9f:314c:: with SMTP id n12mr3697392uab.33.1543632187424;
 Fri, 30 Nov 2018 18:43:07 -0800 (PST)
In-Reply-To: <20181201000229.GT23599@brightrain.aerifal.cx>
Xref: news.gmane.org gmane.linux.lib.musl.general:13490
Archived-At: <http://permalink.gmane.org/gmane.linux.lib.musl.general/13490>

--0000000000003203c4057bece0d6
Content-Type: text/plain; charset="UTF-8"

On Sat, 1 Dec 2018 at 11:02, Rich Felker <dalias@libc.org> wrote:

>
> I've been trying to understand what you're trying to do. It seems you
> chose to work at the point of line-buffered flush logic, since that
> happens to be the only case where f->write is called with an argument
> that might fit in the remaining buffer space.
>
>
Yes, that's correct. The other reason I chose __fwrite.c is that it's the
only place where the search for '\n' is done.

An additional objective I had was to split the loop searching for '\n' into
two stages:
(i) Stage 1: search for '\n' word by word, i.e. 8 bytes at a time. If '\n' is
found more than ~16 bytes from the start of the "s" array, go straight to
f->write without copying any bytes.
(I don't show it in my code, but the algorithm for stage 1 would be
similar to memchr.c.)
(ii) Stage 2: in the final 9~16 bytes, drop down to a byte-by-byte search
for '\n' (also copying the residual bytes into the buffer when '\n' is found).
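
To make the two stages concrete, here is a rough sketch of the backwards
search. The helper name last_newline_end is hypothetical, and the sketch
uses memcpy loads instead of an aligned base pointer, so it is only an
illustration of the idea and not the code from my patch. The zero-byte
test is the usual bit trick, in the same spirit as memchr.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  ((uint64_t)-1 / 0xFF)   /* 0x0101010101010101 */
#define HIGHS (ONES * 0x80)           /* 0x8080808080808080 */
#define HASZERO(x) (((x) - ONES) & ~(x) & HIGHS)

/* Hypothetical helper: returns the index just past the last '\n' in
 * s[0..l), or 0 if there is none (the same meaning as "i" above). */
static size_t last_newline_end(const unsigned char *s, size_t l)
{
    const uint64_t nl = ONES * '\n';  /* '\n' replicated into every byte */
    size_t i = l;

    /* Stage 1: scan backwards 8 bytes at a time while more than 16
     * bytes remain between the scan position and the start of s. */
    while (i > 16) {
        uint64_t w;
        memcpy(&w, s + i - 8, 8);     /* alignment-safe 8-byte load */
        if (HASZERO(w ^ nl)) break;   /* some byte in this word is '\n' */
        i -= 8;
    }

    /* Stage 2: byte by byte over the remaining 9~16 bytes (this loop
     * also pinpoints the exact position after a stage 1 hit). */
    while (i && s[i-1] != '\n') i--;
    return i;
}
```

A caller would compare the returned index against the threshold to decide
between copying into the buffer and calling f->write directly.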


> As written the alignment logic and pointer arithmetic is invalid; the
> sums/differences are out of bounds of the array, and i<=0 is not
> meaningful since i has an unsigned type (and so does l+s-t). But even
> if it could be made correct, it's all completely unnecessary and just
> making the code slower and less readable. If __fwritex were the right
> place for this code, all you would need to do is check whether i<16
> (or whatever threshold) before calling f->write, and if so, memcpy'ing
> it to the buffer then calling f->write with a length of 0.


I agree, the check for i <= X is a simpler way of expressing the algorithm.
[X is a value between 9 and 16 that guarantees (s + X) will be word aligned]

In my version, I introduced a new array base "t" and rebased i to index into
"t", because "t" is guaranteed to be a word-aligned pointer.
However, this word alignment of "t" is not exploited by the current
byte-at-a-time search for '\n', so yes, at present it's unnecessary.
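
For reference, the simpler form of the check could look roughly like the
sketch below. It is written against a hypothetical struct mock_file with a
mock write callback, not musl's real FILE, and the threshold of 16 is only
an assumed value. It also assumes the caller has already verified that the
i bytes fit in the remaining buffer space.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the few FILE fields discussed here. */
struct mock_file {
    unsigned char *wbase, *wpos, *wend;
    size_t (*write)(struct mock_file *, const unsigned char *, size_t);
    unsigned char buf[64];                 /* stdio buffer */
    unsigned char out[256]; size_t outlen; /* records what was "written" */
};

/* Mock write: emits the buffered bytes first, then the caller's bytes. */
static size_t mock_write(struct mock_file *f, const unsigned char *s, size_t n)
{
    size_t pending = (size_t)(f->wpos - f->wbase);
    memcpy(f->out + f->outlen, f->wbase, pending);
    f->outlen += pending;
    if (n) memcpy(f->out + f->outlen, s, n);
    f->outlen += n;
    f->wpos = f->wbase;                    /* buffer is now empty */
    return n;
}

#define SMALL_LINE 16                      /* assumed threshold */

/* Flush the first i bytes of s (up to and including the last '\n').
 * Returns nonzero on success; the mock above never fails. */
static int flush_line(struct mock_file *f, const unsigned char *s, size_t i)
{
    if (i < SMALL_LINE) {
        memcpy(f->wpos, s, i);             /* append the short line to the buffer */
        f->wpos += i;
        f->write(f, NULL, 0);              /* flush the buffer, no extra data */
        return 1;
    }
    return f->write(f, s, i) == i;         /* long line: pass it straight through */
}
```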


> However, then you could not use the return value
> of f->write to determine if it succeeded (see how fflush and fseek
> have to deal with this case). Contrary to what your code assumes,
> f->write does not (and cannot, since the type is unsigned) return a
> negative value on error.
>
>
OK, noted: the expression to check for an error should be (!f->wpos), not
(n < 0).
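
As I understand the convention, a failed write clears the write pointers,
so after a zero-length call the pointer is the only usable signal. A small
sketch with a hypothetical struct mock_file and a mock write callback that
imitates this behaviour (the "fail" field is purely a test knob, not
something in musl):

```c
#include <stddef.h>

/* Hypothetical stand-in, not musl's FILE. */
struct mock_file {
    unsigned char *wbase, *wpos, *wend;
    int fail;   /* test knob: make the next write fail */
    size_t (*write)(struct mock_file *, const unsigned char *, size_t);
};

/* Mock of the error convention: on failure the write pointers are
 * cleared. The return type is unsigned, so no negative value exists. */
static size_t mock_write(struct mock_file *f, const unsigned char *s, size_t n)
{
    (void)s;
    if (f->fail) {
        f->wbase = f->wpos = f->wend = 0;
        return 0;
    }
    f->wpos = f->wbase;   /* pretend everything was flushed */
    return n;
}

/* Zero-length flush: returns 0 on success, -1 on failure. */
static int flush_buffered(struct mock_file *f)
{
    f->write(f, NULL, 0);     /* returns 0 whether or not it succeeded */
    return f->wpos ? 0 : -1;  /* so test the pointer instead */
}
```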

> Instead, I think it probably makes more sense to put the logic in
> __stdio_write, but this will also be somewhat nontrivial to work in.
> At least the "iovcnt == 2 ? ..." logic needs to be adapted to
> something like "rem > len ? ...". Before the loop should probably be
> something like "if (len < f->wend-f->wpos && len <= 16) ..." to
> conditionally copy the new data into the buffer.
>
> Do you see any reason to prefer doing it in __fwritex?
>

Wouldn't putting the check for i < X in __fwrite.c be better, because it:
(a) facilitates the word-by-word searching objective outlined above
(b) reuses the check for space in the buffer already done in line 10 of
__fwrite.c, hence avoiding the need for any more buffer length checks
(c) ?? speeds up writes which don't use __stdio_write

--0000000000003203c4057bece0d6--