On Sat, 1 Dec 2018 at 11:02, Rich Felker <dalias@libc.org> wrote:

I've been trying to understand what you're trying to do. It seems you
chose to work at the point of line-buffered flush logic, since that
happens to be the only case where f->write is called with an argument
that might fit in the remaining buffer space.


Yes that's correct... Also, the other reason I chose __fwrite.c is it's the only place where the search for '\n' is done.

An additional objective I had was to split the loop searching for '\n' into two stages:
(i) Stage 1: search for '\n' word by word ie: 8 bytes at a time ... if '\n' found at >~16 bytes before start of "s" array then go straight to f->write without copying bytes
(I don't show it in my code, but the algorithm to do stage 1 would be similar to memchr.c)
(ii) Stage 2: in final 9~16 bytes, drop down into byte-by-byte searching for  '\n' (also doing copy of residual bytes into buffer when '\n' found)
 
As written the alignment logic and pointer arithmetic is invalid; the
sums/differences are out of bounds of the array, and i<=0 is not
meaningful since i has an unsigned type (and so does l+s-t). But even
if it could be made correct, it's all completely unnecessary and just
making the code slower and less readable. If __fwritex were the right place for this code, all you would need to
do is check whether i<16 (or whatever threshold) before calling
f->write, and if so, memcpy'ing it to the buffer then calling f->write
with a length of 0.

I agree, the check for i <= X is a simpler way of expressing the algorithm.
[X is a value between 9 and 16 that guarantees (s + X) will be word aligned]

In my version, I introduced a new array base "t" and rebase i to index into "t" because "t" is a guaranteed word aligned pointer.
However, this word alignment of "t" is not exploited in the current  byte-at-a-time searching for '\n', so yes at present it's unnecessary.
 
However, then you could not use the return value
of f->write to determine if it succeeded (see how fflush and fseek
have to deal with this case). Contrary to what your code assumes,
f->write does not (and cannot, since the type is unsigned) return a
negative value on error.


Ok, noted, the expression to check for error should be (!f->wpos) and not n < 0

Instead, I think it probably makes more sense to put the logic in
__stdio_write, but this will also be somewhat nontrivial to work in.
At least the "iovcnt == 2 ? ..." logic needs to be adapted to
something like "rem > len ? ...". Before the loop should probably be
something like "if (len < f->wend-f->wpos && len <= 16) ..." to
conditionally copy the new data into the buffer.

Do you see any reason to prefer doing it in __fwritex?

Wouldn't putting check for i < X in __fwrite.c better because it:
(a) facilitates word-by-word searching objective outlined above
(b) check for space in buffer is already done in line 10 of __fwrite.c, hence avoids need for any more buffer length checks
(c) ?? speeds up writes which doesn't use __stdio_write