stdio review

mailing list of musl libc
 help / color / mirror / code / Atom feed

* stdio review
@ 2018-02-13 16:19 Dale Weiler
  2018-02-13 16:59 ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: Dale Weiler @ 2018-02-13 16:19 UTC (permalink / raw)
  To: musl

The following has been a several work week code review of musl's stdio
code and supplementary internal code for stdio. While I would supply
the review as is inline with this email I'm concerned about the email
RFC line width for plain text emails and gmail's handling of plain
text so instead I've hosted it on gist.github.com and can be seen
here.

https://gist.githubusercontent.com/graphitemaster/927316c85587d4ad7b6870857ff681d9/raw/0ec1a90f788d3188ae14b40347af69eeca745a23/musl%2520stdio%2520review.txt

Thanks

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: stdio review
  2018-02-13 16:19 stdio review Dale Weiler
@ 2018-02-13 16:59 ` Rich Felker
  2018-02-13 19:21   ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2018-02-13 16:59 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 657 bytes --]

On Tue, Feb 13, 2018 at 11:19:26AM -0500, Dale Weiler wrote:
> The following has been a several work week code review of musl's stdio
> code and supplementary internal code for stdio. While I would supply
> the review as is inline with this email I'm concerned about the email
> RFC line width for plain text emails and gmail's handling of plain
> text so instead I've hosted it on gist.github.com and can be seen
> here.
> 
> https://gist.githubusercontent.com/graphitemaster/927316c85587d4ad7b6870857ff681d9/raw/0ec1a90f788d3188ae14b40347af69eeca745a23/musl%2520stdio%2520review.txt

I'm moving it to email so it's available for comment/discussion. Here:

[-- Attachment #2: musl_stdio_review.txt --]
[-- Type: text/plain, Size: 28404 bytes --]

musl stdio review

All work presented in here was based off the last effective stdio
related commit b64539ae06aa91a407359238f4e909adb9bfab3d

Unless otherwise stated, the approach taken in the review was to view
each stdio related file in alphabeticalized order and setup test
harnesses for each with no pre-conditions or invariants until a unit
made explicit reference to it, for instance if foo.c had function foo
taking pointer to FILE*, and did not check for FILE* to be NULL, but
references to foo did, then those were noted and the harness was
updated with that invariant. This approach, while tedious ensures that
all references maintain the same invariants throughout.

The review results are layed out in a tree like structure where the
first most column line is the file and each indentation level within
it is a function within that translation unit proceeded with a status
that is right justified on the 80th column, of either [nothing],
indicating that nothing of merit was discovered, [question] for
something I'm unsure about, [style] for tricky statements or expressions
that made it difficult to understand what was happening, [bug] for
things that I precieved as potential bugs, [comment] for things that
should be commented and [optimize] for things that can be optimized.

Similarly, if a function has multiple tags like [bug] and [comment] then
sections pertaining to each are in separate sections delimited by a new
line. Functions that are aliases, or wrappers for existing functions are
treated together as two cases with an implied fallthrough much like
cases in a switch statement.

There is two parts to this review. The initial stdio infastructure that
is in src/stdio which encompasses the first part of the review and which
is the longest. Then there is the supplemental part for the undocumented
named "scan helper" interface and stdio implementation details which
can be found in src/internal.

========================================================================
    src/stdio
========================================================================

__fclose_ca.c:
    __fclose_ca:                                               [nothing]

__fdopen.c:
    __fdopen:                                                  [nothing]
    fdopen:                                                    [nothing]

__fmodeflags.c:
    __fmodeflags:                                              [nothing]

__fopen_rb_ca.c:
    __fopen_rb_ca:
                                                              [question]
        Is it possible for this function to be called with a length less
        than UNGET?
                                                               [comment]
        I was unsure what the ca meant until finding a reference in a
        deeply nested internal header that it was "caller allocated",
        maybe emphasis that in the source file to make review easier.

__lockfile.c:
    __lockfile:                                                [nothing]
    __unlockfile:                                              [nothing]

__overflow.c:
    __overflow:                                               [question]
        The interface contract of this function is it takes int and
        returns int, yet the passed in character is truncated through
        narrowing conversion to unsigned char for the purposes of
        stdio, should that same narrowed conversion be returned or
        should the unnarrowed result be returned?

__stdio_close.c:
    __dummy:                                                   [nothing]
    __aio_close:                                               [nothing]
    __stdio_close:                                             [nothing]

__stdio_exit.c:
    close_file:                                               [question]
        Function acquires lock on FILE but never unlocks it, this looks
        intended since we're exiting stdio? If that's the case this
        should probably be commented.
    __stdio_exit:                                             [question]
    __stdio_exit_needed:                                      [question]
                                                               [comment]
        Function acquires lock via __ofl_lock but never calls
        __ofl_unlock, this looks intended since we're exiting stdio?
        If that's the case this should probably be commented.

__stdio_read.c:
    __stdio_read:                                                [style]
        Tricky expression involving F_EOF/F_ERR and xor masks.

__stdio_seek.c:
    __stdio_seek:                                              [nothing]

__stdio_write.c:
    __stdio_write:                                            [question]
        Is it possible for len to be less than f->wpos-f->wbase?

__stdout_write.c:
    __stdout_write:                                           [question]
        Is it possible for len to be less than f->wpos-f->wbase?

__string_read.c:
    __string_read:                                             [nothing]

__toread.c:
    __toread:                                                  [comment]
        It's sister function __towrite comments its' intent, maybe
        this should as well for consistency.

__towrite.c:
    __towrite:                                                   [style]
        What is the purpose of parenthesis around F_NOWR? does it imply
        a missing flag in the list, looks suspicious.

__uflow.c:
    __uflow:                                                   [nothing]

asprintf.c:
    asprintf:                                                  [nothing]

clearerr.c:
    clearerr:                                                  [nothing]
    clearerr_unlocked:                                        [question]
        An alias for clearerr but clearerr locks, is this correct?
        or should clearerr_unlocked just remove EOF/ERR flags, and
        clearerr hold a lock calling clearerr_unlocked?

dprintf.c:
    dprintf:                                                   [nothing]

ext.c:
    __flushlbf:                                                [nothing]
    __fsetlocking:                                             [nothing]
    __fwriting:                                                [nothing]
    __freading:                                                [nothing]
    __freadable:                                               [nothing]
    __fwritable:                                               [nothing]
    __flbf:                                                    [nothing]
    __fbufsize:                                                [nothing]
    __fpending:                                                [nothing]
    __fpurge:                                                      [bug]
    fpurge:                                                        [bug]
        The interface contract for this function is that it returns void
        and not int, whereas fpurge returns int, the latter does not
        even appear to be part of Linux yet musl aliases them to the
        same. Should musl's return void?

ext2.c:
    __freadahead:                                              [nothing]
    __freadptr:                                                [nothing]
    __freadptrinc:                                             [nothing]
    __fseterr:                                                 [nothing]

fclose.c:
    fclose:                                                      [style]
        Checking for null pointer before calling free for f->getln_buf

feof.c:                                                          [style]
    There's two instances of feof involved, one provided by stdio_impl
    that just checks flags without a lock held and this one which means
    this file has to #undef. That just seems tricky to me, you could see
    an feof somewhere that is using the macro definition where others
    could be using a function (depending on includes and include order.)

    feof:                                                      [nothing]
    feof_unlocked:                                            [question]
    _IO_feof_unlocked:                                        [question]
        Similar to clearerr_unlocked, is this correct? The unlocked
        variants are actually locked.

ferror.c:                                                        [style]
    There's two instances of ferror involved, one provided by stdio_impl
    that just checks flags without a lock held and this one which means
    this file has to #undef. That just seems tricky to me, you could see
    an ferror somewhere that is using the macro definition where others
    could be using a function (depending on includes and include oder.)

    ferror:                                                    [nothing]
    ferror_unlocked:                                          [question]
    _IO_ferror_unlocked:                                      [question]
        Similar to clearerr_unlocked and feof_unlocked, is this correct?
        The unlocked variants are actually locked.

fflush.c:
    fflush:                                                    [nothing]

fgetc.c:
    fgetc:                                                     [nothing]

fgetln.c:
    fgetln:                                                    [nothing]

fgetpos.c:
    fgetpos:                                                       [bug]
        using *(off_t*) to write _Int64 data into fpos_t structure when
        memcpy should be used, this breaks strict aliasing. Maybe add
        an off_t to the fpos_t union?

                                                               [comment]
        POSIX states this function shall fail if the data cannot be
        stored in the fpos_t structure but the structure is 16-byte in
        size and this is a narrowing conversion. Maybe just comment that
        it can never fail. This came up in the test harness several
        times.

fgets.c:
    fgets:                                                     [nothing]

fgetwc.c:
    __fgetwc_unlocked_internal:                               [question]
        I don't understand the conditional if (l+1 >= 1), is that not
        the same as just if (l) since the only time when l+1 >=1 is
        if l==0, even for overflow. In which case the l += !l part
        in the body is even more concerning because !l will always
        be false?
    __fgetwc_unlocked:                                         [nothing]
    fgetwc:                                                    [nothing]
    fgetwc_unlocked:                                           [nothing]
    getwc_unlocked:                                            [nothing]

fgetws.c:
    fgetws:                                                    [nothing]
    __fgetws_unlocked:                                        [question]
        Similar to clearerr_unlocked, feof_unlocked and ferror_unlocked
        is this correct? The unlocked variants are actually locked.

fileno.c:
    fileno:                                                    [nothing]
    fileno_unlocked:                                          [question]
        Similar to clearerr_unlocked, feof_unlocked, ferror_unlocked
        and __fetws_unlocked is this correct? The unlocked variants are
        actually locked.

flockfile.c:
    flockfile:                                                 [nothing]

fmemopen.c:
    mseek:                                                       [style]
        It does goto upwards.

        Compound literal table to reference whence as a lookup table
        as a single expression.

        The function would be less tricky if just rewritten.
    mread:                                                     [nothing]
    mwrite:                                                    [nothing]
    fmemopen:                                                      [bug]
        The size check for fmemopen cookie to report ENOMEM is incorrect
        with the calloc below.

                                                                 [style]
        The cookie allocation handling can be greatly simplified with a
        nother structure that embeds everything and using the sizeof
        that structure instead.

fopen.c:
    fopen:                                                     [nothing]

fopencookie.c:
    fopencookie:                                                   [bug]
        Several other functions which utilize __ofl_add do the following
        if (!libc.threaded) f->lock = -1; However this one does not.

fprintf.c:
    fprintf:                                                   [nothing]

fput.c:
    fput:                                                      [nothing]

fputs.c:
    fputs:                                                     [nothing]
    fputs_unlocked:                                           [question]
        See all other instances of this question.

fputwc.c:
    __fputwc_unlocked:                                         [nothing]
    fputwc:                                                    [nothing]
    fputwc_unlocked:                                           [nothing]
    putwc_unlocked:                                            [nothing]

fputws.c:
    fputws:                                                    [nothing]
    fputws_unlocked:                                          [question]
        See all other instances of this question.

fread.c:
    fread:                                                     [nothing]
    fread_unlocked:                                           [question]
        See all other instances of this question.

freopen.c:
    freopen:                                                       [bug]
        In the case of no filename and the F_SETFL system call failing,
        and also __dup3 failing, the file is still locked.

fscanf.c:
    fscanf:                                                    [nothing]
    __isoc99_fscanf:                                           [nothing]

fseek.c:
    __fseeko_unlocked:                                         [nothing]
    __fseeko:                                                  [nothing]
    fseek:                                                     [nothing]
    fseeko:                                                    [nothing]

fsetpos.c:
    fsetpos:                                                       [bug]
        off_t is written into here incorrectly and also read incorrectly
        as this breaks strict aliasing. Maybe consider putting off_t in
        the fpos_t union and referencing it that way to avoid memcpy?

ftell.c:
    __ftello_unlocked:                                           [style]
        The whole thing is formatted in a confusing to read manner.
    __ftello:                                                  [nothing]
    ftell:                                                     [nothing]
    ftello:                                                    [nothing]

ftrylockfile.c:
    __do_orphaned_stdio_locks:                                 [nothing]
    __unlocked_locked_file:                                    [nothing]
    ftrylockfile:                                              [nothing]

funlockfile.c:
    funlockfile:                                               [nothing]

fwide.c:
    fwide:                                                     [nothing]

fwprintf.c:
    fwprintf:                                                  [nothing]

fwrite.c:
    fwrite:                                                   [question]
        Should there be a check for size*nmemb overflow?
    fwrite_unlocked:                                          [question]
        See all other instances of this question.

fwscanf.c:
    fwscanf:                                                   [nothing]
    __isoc99_fwscanf:                                          [nothing]

getc.c:
    getc:                                                      [nothing]
    _IO_getc:                                                  [nothing]

getc_unlocked.c:
    getc_unlocked:                                             [nothing]
    fgetc_unlocked:                                            [nothing]
    _IO_getc_unlocked:                                         [nothing]

getchar.c:
    getchar:                                                   [nothing]

getchar_unlocked.c:
    getchar_unlocked:                                          [nothing]

getdelim.c:                                                      [style]
    The MIN macro is never used in this translation unit.

    getdelim:                                                  [nothing]
    __getdelim:                                                [nothing]

getline.c:
    getline:                                                   [nothing]

gets.c:
    gets:                                                     [optimize]
        The length of the string is calculated twice to strip the
        newline character off it. Why not rewrite it as:
        if (ret) { size_t i = strlen(s)-1; if (s[i] == '\n') s[i] = 0; }

getw.c:
    getw:                                                      [nothing]

getwc.c:
    getwc:                                                     [nothing]

getwchar.c:
    getwchar:                                                  [nothing]
    getwchar_unlocked:                                        [question]
        This isn't actually unlocked.

ofl.c:
    __ofl_lock:                                                [nothing]
    __ofl_unlock:                                              [nothing]

ofl_add.c:
    __ofl_add:                                                 [nothing]

open_memstream.c:
    ms_seek:                                                     [style]
        It does goto upwards.

        Compound literal table to reference whence as a lookup table
        as a single expression.

        The function would be less tricky if just rewritten.
    ms_write:                                                  [nothing]
    ms_close:                                                  [nothing]
    open_memstream:                                           [question]
        Why does this allocate a 1 byte buffer? That seems small.

                                                                 [style]
        This is also doing the weird arithmetic as well for the cookie
        This function can be rewritten much nicer.

open_wmemstream.c:
    This has all the same problems as openmem_stream.c

pclose.c:
    pclose:                                                    [nothing]

perror.c:
    perror:                                                    [nothing]

popen.c:
    popen:                                                     [nothing]

printf.c:
    printf:                                                    [nothing]

putc.c:
    putc:                                                      [nothing]
    _IO_putc:                                                  [nothing]

putc_unlocked.c:
    putc_unlocked:                                             [nothing]
    fputc_unlocked:                                            [nothing]
    _IO_putc_unlocked:                                         [nothing]

putchar.c
    putchar:                                                   [nothing]

putchar_unlocked.c
    putchar_unlocked:                                          [nothing]

puts.c:
    puts:                                                      [nothing]

putw.c:
    putw:                                                      [nothing]

putwc.c:
    putwc:                                                     [nothing]

putwchar.c:
    putwchar:                                                  [nothing]
    putwchar_unlocked:                                        [question]
        This isn't actually unlocked.

remove.c:
    remove:                                                    [nothing]

rename.c:
    rename:                                                    [nothing]

rewind.c:
    rewind:                                                    [nothing]

scanf.c:
    scanf:                                                     [nothing]

setbuf.c:
    setbuf:                                                    [nothing]

setbuffer.c:
    setbuffer:                                                 [nothing]

setlinebuf.c:
    setlinebuf:                                                [nothing]

setvbuf.c:
    setvbuf:                                                       [bug]
        No locks are ever held on the file when changing fields of it.

snprintf.c:
    snprintf:                                                  [nothing]

sprintf.c:
    sprintf:                                                   [nothing]

sscanf.c:
    sscanf:                                                    [nothing]

stdderr.c:                                                     [nothing]

stdin.c:                                                       [nothing]

stdout.c:                                                      [nothing]

swprintf.c:
    swprintf:                                                  [nothing]

swscanf.c:
    swscanf:                                                   [nothing]
    __isoc99_swscanf:                                          [nothing]

tempnam.c:
    tempnam:                                                   [nothing]

tmpfile.c:
    tmpfile:                                                   [nothing]

tmpnam.c;
    tmpnam:                                                    [nothing]

ungetc.c:
    ungetc:                                                    [nothing]

ungetwc.c:
    ungetwc:                                                     [style]
        Ignoring the overly long if statement, the non-ascii route below
        does memcpy(f->rpos -= l, mbc, l) and that is just odd since I
        read that as f->rpos - l initially which almost looks correct
        but failing to update rpos which is when I noticed it actually
        does -=.

vasprintf.c:
    vasprintf:                                                 [nothing]

vdprintf.c:
    wrap_write:                                                [nothing]
    vdprintf:                                                  [nothing]

vfprintf.c:
    out:                                                       [nothing]
    pad:                                                       [nothing]
    fmt_x:                                                     [nothing]
    fmt_o:                                                     [nothing]
    fmt_u:                                                     [nothing]
    fmt_fp:                                                   [question]
    getint:                                                        [bug]
        This function overflows on INT_MIN for the else case.
    printf_core:                                                 [style]
        if (0) cutting into switch statements specified in a very smart
        order to make for really tiny code. This whole function is
        impossible to audit. I tried, I really did. Things like this:
        *(a=z-(p=1))=arg.i; are really tricky.
    vfprintf:                                                  [nothing]

vfscanf.c:
    store_int:                                                 [nothing]
    arg_n:                                                     [nothing]
    vfscanf:                                                       [bug]
    __isoc99_vfscanf:
        Calculating width using the same 10*n+*c-'0' trick can overflow
        on INT_MIN.
                                                                 [style]
        This function gets really difficult to audit further down. It's
        just very dense and should be split up somehow.

vfwprintf.c:
    Has all the same problems as vfprintf.c

vfwscanf.c:
    Has all the same problems as vfscanf.c

vprintf.c:
    vprintf:                                                   [nothing]

vscanf.c:
    vscanf:                                                    [nothing]
    __isoc99_vscanf:                                           [nothing]

vsnprintf.c:
    sn_write:                                                  [nothing]
    vsnprintf:                                                 [nothing]

vsprintf.c:
    vsprintf:                                                  [nothing]

vsscanf.c:
    do_read:                                                   [nothing]
    vsscanf:                                                   [nothing]
    __isoc99_vsscanf:                                          [nothing]

vswprintf.c:
    sw_write:                                                  [nothing]
    vswprintf:                                                   [style]
        sizeof(FILE) should be sizeof f
        compound initializer should be used for file instead of what
        ever is done here, every other function that creates local FILE
        struct does it as FILE f = { .member = value, ... }, is this
        different because of the memset? is the memset even needed?

vswscanf.c:
    wstring_read:                                              [nothing]
    vswscanf:                                                  [nothing]
    __isoc_vswscanf:                                           [nothing]

vwprintf.c:
    vwprintf:                                                  [nothing]

wvscanf.c:
    vwscanf:                                                   [nothing]
    __isoc99_vwscanf:                                          [nothing]

wprintf.c:
    wprintf:                                                   [nothing]

wscanf.c:
    wscanf:                                                    [nothing]

========================================================================
    src/internal
========================================================================

stdio_impl.h:                                                    [style]
    FUNLOCK macro has a trailing else which prompted me to look at every
    single instance of FUNLOCK to make sure all of them contained a
    semicolon. This is just dangerous, why not use the more common
    idiom of do { } while (0).

                                                               [comment]
    The _IO_FILE is ABI compat with glibc but this isn't documented any
    where in the header which makes members like mustbezero_{1,2} really
    confusion and the layout of members within the structure as well,
    since the size of a FILE can be made much smaller just by sorting
    the members which is something someone may want to do.

                                                                 [style]
    feof/ferror macros which are unlocked, this requires that code
    inside src/stdio actually #undef them when implementing and anything
    inside there that requires these _must_ avoid including stdio.h over
    this to ensure it uses the right ones. I would just call these
    something else like __feof_ui/__ferror_ui (unlocked internal) then
    no weird #undef is needed and the intent is clear.

shgetc.h:                                                      [comment]
    Maybe document this is the "scan helper" interface because it was
    not obvious what the naming was.

shgetc.c:
    __shlim:                                                   [nothing]
    __shgetc:                                                  [nothing]

intscan.h:                                                       [style]
    It isn't apparent for why <stdio.h> needs to be included. Should
    just forward declare struct FILE; here instead.

intscan.c:
    __intscan:                                                 [nothing]

floatscan.h:                                                     [style]
    It isn't apparent for why <stdio.h> needs to be included. Should
    just forward declare struct FILE; here instead.

floatscan.c:
    scanexp:                                                   [nothing]
    decfloat:                                                  [nothing]
    hexfloat:                                                  [nothing]
    __floatscan:                                               [nothing]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: stdio review
  2018-02-13 16:59 ` Rich Felker
@ 2018-02-13 19:21   ` Rich Felker
  2018-02-14  3:49     ` Dale Weiler
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2018-02-13 19:21 UTC (permalink / raw)
  To: musl

On Tue, Feb 13, 2018 at 11:59:53AM -0500, Rich Felker wrote:
> __fopen_rb_ca.c:
>     __fopen_rb_ca:
>                                                               [question]
>         Is it possible for this function to be called with a length less
>         than UNGET?

No. Perhaps it should be checked, or perhaps we should just drop this
function. Obviously it doesn't give new no-fail guarantees since open
itself can fail for resource reasons. I think the original intent was
just to make it possible to do DNS lookups without linking free; if
that still works maybe it's worth keeping.

>                                                                [comment]
>         I was unsure what the ca meant until finding a reference in a
>         deeply nested internal header that it was "caller allocated",
>         maybe emphasis that in the source file to make review easier.

Yes, sounds good.

> __lockfile.c:
>     __lockfile:                                                [nothing]
>     __unlockfile:                                              [nothing]
> 
> __overflow.c:
>     __overflow:                                               [question]
>         The interface contract of this function is it takes int and
>         returns int, yet the passed in character is truncated through
>         narrowing conversion to unsigned char for the purposes of
>         stdio, should that same narrowed conversion be returned or
>         should the unnarrowed result be returned?

This function is actually part of glibc ABI-compat; using it
internally too was probably a mistake, and will probably be changed in
the future -- I think we can make getc/putc slightly more efficient by
doing things in a different way. In any case since the argument _c is
being used as a character unconditionally (not special-casing EOF or
anything) I think it's clear that returning the value interpreted as a
character (unsigned char) is the right thing.

> __stdio_exit.c:
>     close_file:                                               [question]
>         Function acquires lock on FILE but never unlocks it, this looks
>         intended since we're exiting stdio? If that's the case this
>         should probably be commented.

It's documented by the macro name FFINALLOCK() instead of FLOCK().

>     __stdio_exit:                                             [question]
>     __stdio_exit_needed:                                      [question]
>                                                                [comment]
>         Function acquires lock via __ofl_lock but never calls
>         __ofl_unlock, this looks intended since we're exiting stdio?
>         If that's the case this should probably be commented.

Yes it's intended, but I'm not sure why it's unclear. Of course if you
ever unlock it the state could become inconsistent (new files needing
flushing) after you already finished flushing, resulting in
nonconforming behavior (their not being flushed).

> __stdio_read.c:
>     __stdio_read:                                                [style]
>         Tricky expression involving F_EOF/F_ERR and xor masks.

Yes that's silly.

> __stdio_write.c:
>     __stdio_write:                                            [question]
>         Is it possible for len to be less than f->wpos-f->wbase?
> 
> __stdout_write.c:
>     __stdout_write:                                           [question]
>         Is it possible for len to be less than f->wpos-f->wbase?

This question isn't meaningful; they're not comparable. I think this
was a misunderstanding you raised on irc before due to overlooking the
iov++.

> __toread.c:
>     __toread:                                                  [comment]
>         It's sister function __towrite comments its' intent, maybe
>         this should as well for consistency.
> 
> __towrite.c:
>     __towrite:                                                   [style]
>         What is the purpose of parenthesis around F_NOWR? does it imply
>         a missing flag in the list, looks suspicious.

Good question. I think it comes from commit
e3cd6c5c265cd481db6e0c5b529855d99f0bda30 where this code was derived
from old code in __overflow that did:

	if (f->flags & (F_ERR|F_NOWR)) return EOF;

> clearerr.c:
>     clearerr:                                                  [nothing]
>     clearerr_unlocked:                                        [question]
>         An alias for clearerr but clearerr locks, is this correct?
>         or should clearerr_unlocked just remove EOF/ERR flags, and
>         clearerr hold a lock calling clearerr_unlocked?

It's intentional; the nonstandard *_unlocked functions are just
provided for compat purposes, and there is no reason that they have to
omit locking to behave as specified.

> dprintf.c:
>     dprintf:                                                   [nothing]
> 
> ext.c:
>     __flushlbf:                                                [nothing]
>     __fsetlocking:                                             [nothing]
>     __fwriting:                                                [nothing]
>     __freading:                                                [nothing]
>     __freadable:                                               [nothing]
>     __fwritable:                                               [nothing]
>     __flbf:                                                    [nothing]
>     __fbufsize:                                                [nothing]
>     __fpending:                                                [nothing]
>     __fpurge:                                                      [bug]
>     fpurge:                                                        [bug]
>         The interface contract for this function is that it returns void
>         and not int, whereas fpurge returns int, the latter does not
>         even appear to be part of Linux yet musl aliases them to the
>         same. Should musl's return void?

I'm not sure why we even have fpurge. I agree that these are mistakes
but I'm not sure whether fixing them or leaving them is better. They
don't seem to hurt anything as-is.

> fclose.c:
>     fclose:                                                      [style]
>         Checking for null pointer before calling free for f->getln_buf

Agree, the free should be unconditional.

> feof.c:                                                          [style]
>     There's two instances of feof involved, one provided by stdio_impl
>     that just checks flags without a lock held and this one which means
>     this file has to #undef. That just seems tricky to me, you could see
>     an feof somewhere that is using the macro definition where others
>     could be using a function (depending on includes and include order.)

Include order is not an issue (stdio_impl.h includes stdio.h so there
is only one possible order) but I agree this is problematic and could
introduce bugs into files where stdio_impl.h has been included for an
unrelated reason and a libc function calls feof() without the FILE
lock already being held.

It should be researched whether there are places that actually rely on
having the macros to get reasonable/efficient code. If so we should
probably rename the macros to feof_unlocked() and ferror_unlocked().

>     feof:                                                      [nothing]
>     feof_unlocked:                                            [question]
>     _IO_feof_unlocked:                                        [question]
>         Similar to clearerr_unlocked, is this correct? The unlocked
>         variants are actually locked.

Yes, same reason. If this one actually affects performance somewhere
that matters we could change it. clearerr certainly should not.

> ferror.c:                                                        [style]
>     There's two instances of ferror involved, one provided by stdio_impl
>     that just checks flags without a lock held and this one which means
>     this file has to #undef. That just seems tricky to me, you could see
>     an ferror somewhere that is using the macro definition where others
>     could be using a function (depending on includes and include oder.)

Likewise.

>     ferror:                                                    [nothing]
>     ferror_unlocked:                                          [question]
>     _IO_ferror_unlocked:                                      [question]
>         Similar to clearerr_unlocked and feof_unlocked, is this correct?
>         The unlocked variants are actually locked.

Likewise.

> fgetpos.c:
>     fgetpos:                                                       [bug]
>         using *(off_t*) to write _Int64 data into fpos_t structure when
>         memcpy should be used, this breaks strict aliasing. Maybe add
>         an off_t to the fpos_t union?

My leaning would be to add the off_t, but the type might not be
exposed and thus we would need to find a matching type that is
exposed. memcpy would be the nicest solution, but only if we had a way
of allowing the compiler to use builtin memcpy; otherwise it's a
gratuitous call.

>                                                                [comment]
>         POSIX states this function shall fail if the data cannot be
>         stored in the fpos_t structure but the structure is 16-byte in
>         size and this is a narrowing conversion. Maybe just comment that
>         it can never fail. This came up in the test harness several
>         times.

This is purely a matter of implementations that support 32-bit-off_t
environments where a 32-bit process's fpos_t might be too small to
store a large file offset. musl always uses 64-bit offsets and never
has that type of error condition.

> fgetwc.c:
>     __fgetwc_unlocked_internal:                               [question]
>         I don't understand the conditional if (l+1 >= 1), is that not
>         the same as just if (l) since the only time when l+1 >=1 is

I don't see how you think it's the same as l, since 0 is false but
0+1>=1 is clearly true. But the real point is that l could be
(size_t)-1, in which case l+1>=1 is false. It would be equivalent to
if(l+1) or if(l!=(size_t)-1), though.

>         if l==0, even for overflow. In which case the l += !l part
>         in the body is even more concerning because !l will always
>         be false?

No, l==0 is a true case; see above.

Also see commit 4000b0107ddd7fe733fa31d4f078c6fcd35851d6. The old code
used mbrtowc where both (size_t)-2 and (size_t)-1 were possibilities,
so the unsigned change check here made more sense. Now there's only
one possible exceptional value, not two, so it's not really needed.

> fgetws.c:
>     fgetws:                                                    [nothing]
>     __fgetws_unlocked:                                        [question]
>         Similar to clearerr_unlocked, feof_unlocked and ferror_unlocked
>         is this correct? The unlocked variants are actually locked.

Again, it's intentional. In the past we got burned by factoring
unlocked versions of complex functions -- see commits
c002668eb0352e619ea7064e4940b397b4a6e68d and
670d6d01f53b4e85be6b333bf8a137e2be6d3fc3. So I've tried to avoid
actually making separate unlocked entry points except for the
functions where it could actually matter.

> fileno.c:
>     fileno:                                                    [nothing]
>     fileno_unlocked:                                          [question]
>         Similar to clearerr_unlocked, feof_unlocked, ferror_unlocked
>         and __fetws_unlocked is this correct? The unlocked variants are
>         actually locked.

Similar.

> fmemopen.c:
>     mseek:                                                       [style]
>         It does goto upwards.

I guess you could call it that, but it's into a block with no path
out, so I don't think I would.

>         Compound literal table to reference whence as a lookup table
>         as a single expression.

I thought this was cute.

>     mread:                                                     [nothing]
>     mwrite:                                                    [nothing]
>     fmemopen:                                                      [bug]
>         The size check for fmemopen cookie to report ENOMEM is incorrect
>         with the calloc below.

Yes, already fixed in a commit pending push.

>                                                                  [style]
>         The cookie allocation handling can be greatly simplified with a
>         nother structure that embeds everything and using the sizeof
>         that structure instead.

Agreed. I want to convert it to that style later.

> fopencookie.c:
>     fopencookie:                                                   [bug]
>         Several other functions which utilize __ofl_add do the following
>         if (!libc.threaded) f->lock = -1; However this one does not.

This was discussed when it was implemented. I think I might have
requested a comment added there that got overlooked. The issue is that
it's not clear that was can safely omit locks for "cookie" FILEs in a
single-threaded process, since it's not clear that the process will
still be single-threaded when the cookie callback returns.

> freopen.c:
>     freopen:                                                       [bug]
>         In the case of no filename and the F_SETFL system call failing,
>         and also __dup3 failing, the file is still locked.

On failure, freopen closes the FILE. Yes this an awful design flaw
that makes freopen essentially always-unsafe to use, and especially
unsafe if there may be concurrent operations from other threads.
Anyway, yes, the file is still locked going into fclose, and then its
lifetime ends in fclose.

> fsetpos.c:
>     fsetpos:                                                       [bug]
>         off_t is written into here incorrectly and also read incorrectly
>         as this breaks strict aliasing. Maybe consider putting off_t in
>         the fpos_t union and referencing it that way to avoid memcpy?

Yes this needs to be fixed. Not sure of the best way; see above with
fgetpos.

> ftell.c:
>     __ftello_unlocked:                                           [style]
>         The whole thing is formatted in a confusing to read manner.

Would you have preferred the whence expression be written out in a
temp var? Or something else?

> fwrite.c:
>     fwrite:                                                   [question]
>         Should there be a check for size*nmemb overflow?

This is actually a complicated topic. Formally, I think the C standard
reads such that it would be valid for size*nmemb to exceed the size of
the data object to be written if you somehow know you'll hit a write
error before that happens. However real world implementations don't
work like that. In particular, the kernel will error out with EFAULT
if the buffer length extends past the valid userspace address range,
even if the writes would never happen; the only way to avoid this
would be to break longer-than-page writes down into separate
page-sized writes. So I think for practical purposes, we have to
interpret the standard as requiring that size*nmemb actually reflect
the size of the object passed in, and in that case, the multiplication
necessarily does not overflow. If there's an interpretation from WG14
contrary to this, we'll have to revisit it.

See also https://sourceware.org/bugzilla/show_bug.cgi?id=19165

> getdelim.c:                                                      [style]
>     The MIN macro is never used in this translation unit.

Good catch. It was actually never used, even in old revisions, so its
use must predate the original check-in...

> gets.c:
>     gets:                                                     [optimize]
>         The length of the string is calculated twice to strip the
>         newline character off it. Why not rewrite it as:
>         if (ret) { size_t i = strlen(s)-1; if (s[i] == '\n') s[i] = 0; }

Seriously, this is gets. It's always unsafe, deprecated, removed from
the current C standard. If it's gratuitously slow too, great. :-)

> open_memstream.c:
>     ms_seek:                                                     [style]
>         It does goto upwards.
> 
>         Compound literal table to reference whence as a lookup table
>         as a single expression.
> 
>         The function would be less tricky if just rewritten.

See replies on fmemopen.

>     ms_write:                                                  [nothing]
>     ms_close:                                                  [nothing]
>     open_memstream:                                           [question]
>         Why does this allocate a 1 byte buffer? That seems small.

This?

	if (!(buf=malloc(sizeof *buf))) {

It needs to be at least 1 to be valid. The way size tracking is done,
I'm not sure it would help to allocate larger now, since realloc will
get called anyway on the first write.

>                                                                  [style]
>         This is also doing the weird arithmetic as well for the cookie
>         This function can be rewritten much nicer.

There are a lot of awful requirements on open_memstream, and we meet
most but not all of them. It probably does need to be rewritten, but
I'm doubtful that the result will be very simple.

> setvbuf.c:
>     setvbuf:                                                       [bug]
>         No locks are ever held on the file when changing fields of it.

	"The setvbuf function may be used only after the stream
	pointed to by stream has been associated with an open file and
	before any other operation (other than an unsuccessful call to
	setvbuf) is performed on the stream." (7.21.5.6, ¶ 2)

If there are other concurrent calls on the FILE possible, the above
requirement has been violated and the program has undefined behavior.

> ungetwc.c:
>     ungetwc:                                                     [style]
>         Ignoring the overly long if statement, the non-ascii route below
>         does memcpy(f->rpos -= l, mbc, l) and that is just odd since I
>         read that as f->rpos - l initially which almost looks correct
>         but failing to update rpos which is when I noticed it actually
>         does -=.

It's analogous to the *-- in the line immediately above.

> vfprintf.c:
>     out:                                                       [nothing]
>     pad:                                                       [nothing]
>     fmt_x:                                                     [nothing]
>     fmt_o:                                                     [nothing]
>     fmt_u:                                                     [nothing]
>     fmt_fp:                                                   [question]
>     getint:                                                        [bug]
>         This function overflows on INT_MIN for the else case.

I don't understand. Where does INT_MIN come from? It's reading
nonnegative values.

>     printf_core:                                                 [style]
>         if (0) cutting into switch statements specified in a very smart
>         order to make for really tiny code. This whole function is
>         impossible to audit. I tried, I really did. Things like this:
>         *(a=z-(p=1))=arg.i; are really tricky.
>     vfprintf:                                                  [nothing]

Yes. It's known that there's a good deal of ugly style, but also that
simple changes to it just make the code gratuitously larger without
making it perform better or have any other better properties aside
from "less ugly". So it's hard to justify spending effort making and
testing changes that end up outwardly looking like "pure size
regressions".

> vfscanf.c:
>     store_int:                                                 [nothing]
>     arg_n:                                                     [nothing]
>     vfscanf:                                                       [bug]
>     __isoc99_vfscanf:
>         Calculating width using the same 10*n+*c-'0' trick can overflow
>         on INT_MIN.

I don't understand this either.

>                                                                  [style]
>         This function gets really difficult to audit further down. It's
>         just very dense and should be split up somehow.

Yes, maybe so. Similar issues to printf.

> vswprintf.c:
>     sw_write:                                                  [nothing]
>     vswprintf:                                                   [style]
>         sizeof(FILE) should be sizeof f

Rather memset shouldn't be used here anyway.

>         compound initializer should be used for file instead of what
>         ever is done here, every other function that creates local FILE
>         struct does it as FILE f = { .member = value, ... }, is this
>         different because of the memset? is the memset even needed?

Exactly.

> ========================================================================
>     src/internal
> ========================================================================
> 
> stdio_impl.h:                                                    [style]
>     FUNLOCK macro has a trailing else which prompted me to look at every
>     single instance of FUNLOCK to make sure all of them contained a
>     semicolon. This is just dangerous, why not use the more common
>     idiom of do { } while (0).

Indeed that should be fine.

>                                                                [comment]
>     The _IO_FILE is ABI compat with glibc but this isn't documented any
>     where in the header which makes members like mustbezero_{1,2} really
>     confusion and the layout of members within the structure as well,
>     since the size of a FILE can be made much smaller just by sorting
>     the members which is something someone may want to do.

I don't see how it would be smaller by sorting. There are only a few
gaps. int fields come in pairs (relevant on 64-bit) and the signed
char fields have some padding around them but not much. Of course it
still would be nice to add a comment that the layout is ABI if glibc
compat is desired.

>                                                                  [style]
>     feof/ferror macros which are unlocked, this requires that code
>     inside src/stdio actually #undef them when implementing and anything
>     inside there that requires these _must_ avoid including stdio.h over
>     this to ensure it uses the right ones. I would just call these
>     something else like __feof_ui/__ferror_ui (unlocked internal) then
>     no weird #undef is needed and the intent is clear.

See comments on them above.

> shgetc.h:                                                      [comment]
>     Maybe document this is the "scan helper" interface because it was
>     not obvious what the naming was.

Agreed.

> intscan.h:                                                       [style]
>     It isn't apparent for why <stdio.h> needs to be included. Should
>     just forward declare struct FILE; here instead.

That would not work, because it's *not* struct FILE, it's FILE, which
happens to be defined as "struct _IO_FILE", but that's an
implementation detail. Including <stdio.h> is the clean way to have
that.

> floatscan.h:                                                     [style]
>     It isn't apparent for why <stdio.h> needs to be included. Should
>     just forward declare struct FILE; here instead.

Same.

I think that covers almost everything; I omitted all the [nothing]
findings to make this email less dense. Some of the above are things I
can start improving right away after getting a release out; others
require more thought on how to do them without making things worse in
some aspect or another.

Thanks for taking the time to do this and posting your findings.

Rich

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: stdio review
  2018-02-13 19:21   ` Rich Felker
@ 2018-02-14  3:49     ` Dale Weiler
  2018-02-14  4:33       ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: Dale Weiler @ 2018-02-14  3:49 UTC (permalink / raw)
  To: musl

>> fgetpos.c:
>>     fgetpos:                                                       [bug]
>>         using *(off_t*) to write _Int64 data into fpos_t structure when
>>         memcpy should be used, this breaks strict aliasing. Maybe add
>>         an off_t to the fpos_t union?

> My leaning would be to add the off_t, but the type might not be
> exposed and thus we would need to find a matching type that is
> exposed. memcpy would be the nicest solution, but only if we had a way
> of allowing the compiler to use builtin memcpy; otherwise it's a
> gratuitous call.

Seeing as the type _is_ exposed, adding the off_t to the union is likewise
the nicest solution. Getting the compiler to use it's builtin memcpy, while
using things like -fno-builtin seems more challenging here. If the type did
need to be hidden, there's always the possibility of using the __may_alias__
stuff that memcpy/memset do but that seems more gratuitous to me.

>> fmemopen.c:
>>     mseek:                                                       [style]
>>         It does goto upwards.

> I guess you could call it that, but it's into a block with no path
> out, so I don't think I would.

It's one of the rarer instances where goto is used unconventionally. I say
that because most uses of goto, especially in the case where they're
in response to an error go down.

>         Compound literal table to reference whence as a lookup table
>         as a single expression.

> I thought this was cute.

It's definitely cute, but it does depend on the seek argument being
one of the macro definitions in the [0, 2] range which I had to check,
obviously those have no reason to ever be anything but those values;
ABI and all but it's just additional mental load to ensure they weren't
hence why I brought it up.

>> fwrite.c:
>>     fwrite:                                                   [question]
>>         Should there be a check for size*nmemb overflow?

> This is actually a complicated topic. Formally, I think the C standard
> reads such that it would be valid for size*nmemb to exceed the size of
> the data object to be written if you somehow know you'll hit a write
> error before that happens. However real world implementations don't
> work like that. In particular, the kernel will error out with EFAULT
> if the buffer length extends past the valid userspace address range,
> even if the writes would never happen; the only way to avoid this
> would be to break longer-than-page writes down into separate
> page-sized writes. So I think for practical purposes, we have to
> interpret the standard as requiring that size*nmemb actually reflect
> the size of the object passed in, and in that case, the multiplication
> necessarily does not overflow. If there's an interpretation from WG14
> contrary to this, we'll have to revisit it.

> See also https://sourceware.org/bugzilla/show_bug.cgi?id=19165

That is an interesting and somewhat odd edge case. Maybe for the time
being a comment within here w.r.t it maybe needing to be revisited
wouldn't hurt. In either case it doesn't appear to be harming anything.

>> gets.c:
>>     gets:                                                     [optimize]
>>         The length of the string is calculated twice to strip the
>>         newline character off it. Why not rewrite it as:
>>         if (ret) { size_t i = strlen(s)-1; if (s[i] == '\n') s[i] = 0; }

> Seriously, this is gets. It's always unsafe, deprecated, removed from
> the current C standard. If it's gratuitously slow too, great. :-)

Yes it's gets, but fixing it for O(n) instead of O(n*2) does make the musl
static set slightly smaller, also makes programs using it crash twice as
fast ;-)

>> stdio_impl.h:                                                    [style]
>>     FUNLOCK macro has a trailing else which prompted me to look at every
>>     single instance of FUNLOCK to make sure all of them contained a
>>     semicolon. This is just dangerous, why not use the more common
>>     idiom of do { } while (0).

> Indeed that should be fine.

I think it's better understood by most folks as well, glad we're on the same
page w.r.t this one. At least then you cannot fail to forget the semicolon.

>> intscan.h:                                                       [style]
>>     It isn't apparent for why <stdio.h> needs to be included. Should
>>     just forward declare struct FILE; here instead.

> That would not work, because it's *not* struct FILE, it's FILE, which
> happens to be defined as "struct _IO_FILE", but that's an
> implementation detail. Including <stdio.h> is the clean way to have
> that.

I don't understand why you couldn't replicate that behavior. It's what
stdio.h already _does_ and seeing as the associated translation unit
already includes stdio.h it seems gratuitously excessive. It's just an
opaque pointer type being passed, how is a forward declaration
incorrect. Does C distinguish between "opaque T" and "opaque T"
with different underlying struct? If so I have many of code that needs
to be changed on my end.

>> floatscan.h:                                                     [style]
>>     It isn't apparent for why <stdio.h> needs to be included. Should
>>     just forward declare struct FILE; here instead.

> Same.

Same


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: stdio review
  2018-02-14  3:49     ` Dale Weiler
@ 2018-02-14  4:33       ` Rich Felker
  0 siblings, 0 replies; 5+ messages in thread
From: Rich Felker @ 2018-02-14  4:33 UTC (permalink / raw)
  To: musl

On Tue, Feb 13, 2018 at 10:49:04PM -0500, Dale Weiler wrote:
> >> fgetpos.c:
> >>     fgetpos:                                                       [bug]
> >>         using *(off_t*) to write _Int64 data into fpos_t structure when
> >>         memcpy should be used, this breaks strict aliasing. Maybe add
> >>         an off_t to the fpos_t union?
> 
> > My leaning would be to add the off_t, but the type might not be
> > exposed and thus we would need to find a matching type that is
> > exposed. memcpy would be the nicest solution, but only if we had a way
> > of allowing the compiler to use builtin memcpy; otherwise it's a
> > gratuitous call.
> 
> Seeing as the type _is_ exposed, adding the off_t to the union is likewise

It's not exposed in plain C profile, only with POSIX or higher feature
test macros. We could just use long long though since the actual type
doesn't need to match; it only needs to be an integer type capable of
holding the range of off_t.

> >         Compound literal table to reference whence as a lookup table
> >         as a single expression.
> 
> > I thought this was cute.
> 
> It's definitely cute, but it does depend on the seek argument being
> one of the macro definitions in the [0, 2] range which I had to check,
> obviously those have no reason to ever be anything but those values;
> ABI and all but it's just additional mental load to ensure they weren't
> hence why I brought it up.

The check right above demonstrates the assumption of those
definitions. I suppose we could use designated initializers to make it
so the order is clear without changing the generated code...

> >> fwrite.c:
> >>     fwrite:                                                   [question]
> >>         Should there be a check for size*nmemb overflow?
> 
> > This is actually a complicated topic. Formally, I think the C standard
> > reads such that it would be valid for size*nmemb to exceed the size of
> > the data object to be written if you somehow know you'll hit a write
> > error before that happens. However real world implementations don't
> > work like that. In particular, the kernel will error out with EFAULT
> > if the buffer length extends past the valid userspace address range,
> > even if the writes would never happen; the only way to avoid this
> > would be to break longer-than-page writes down into separate
> > page-sized writes. So I think for practical purposes, we have to
> > interpret the standard as requiring that size*nmemb actually reflect
> > the size of the object passed in, and in that case, the multiplication
> > necessarily does not overflow. If there's an interpretation from WG14
> > contrary to this, we'll have to revisit it.
> 
> > See also https://sourceware.org/bugzilla/show_bug.cgi?id=19165
> 
> That is an interesting and somewhat odd edge case. Maybe for the time
> being a comment within here w.r.t it maybe needing to be revisited
> wouldn't hurt. In either case it doesn't appear to be harming anything.

Yes, a short comment here (and in fread.c) might be nice.

> >> gets.c:
> >>     gets:                                                     [optimize]
> >>         The length of the string is calculated twice to strip the
> >>         newline character off it. Why not rewrite it as:
> >>         if (ret) { size_t i = strlen(s)-1; if (s[i] == '\n') s[i] = 0; }
> 
> > Seriously, this is gets. It's always unsafe, deprecated, removed from
> > the current C standard. If it's gratuitously slow too, great. :-)
> 
> Yes it's gets, but fixing it for O(n) instead of O(n*2) does make the musl
> static set slightly smaller, also makes programs using it crash twice as
> fast ;-)

:-)

> >> stdio_impl.h:                                                    [style]
> >>     FUNLOCK macro has a trailing else which prompted me to look at every
> >>     single instance of FUNLOCK to make sure all of them contained a
> >>     semicolon. This is just dangerous, why not use the more common
> >>     idiom of do { } while (0).
> 
> > Indeed that should be fine.
> 
> I think it's better understood by most folks as well, glad we're on the same
> page w.r.t this one. At least then you cannot fail to forget the semicolon.

Yes, exactly.

> >> intscan.h:                                                       [style]
> >>     It isn't apparent for why <stdio.h> needs to be included. Should
> >>     just forward declare struct FILE; here instead.
> 
> > That would not work, because it's *not* struct FILE, it's FILE, which
> > happens to be defined as "struct _IO_FILE", but that's an
> > implementation detail. Including <stdio.h> is the clean way to have
> > that.
> 
> I don't understand why you couldn't replicate that behavior. It's what
> stdio.h already _does_ and seeing as the associated translation unit
> already includes stdio.h it seems gratuitously excessive. It's just an
> opaque pointer type being passed, how is a forward declaration
> incorrect. Does C distinguish between "opaque T" and "opaque T"
> with different underlying struct? If so I have many of code that needs
> to be changed on my end.

See 6.2.7 Compatible type and composite type ¶1:

	"Moreover, two structure, union, or enumerated types declared
	in separate translation units are compatible if their tags and
	members satisfy the following requirements: If one is declared
	with a tag, the other shall be declared with the same tag. If
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	both are completed anywhere within their respective
	translation units, then the following additional requirements
	apply: there shall be a one-to-one correspondence between
	their members such that each pair of corresponding members are
	declared with compatible types; if one member of the pair is
	declared with an alignment specifier, the other is declared
	with an equivalent alignment specifier; and if one member of
	the pair is declared with a name, the other is declared with
	the same name. For two structures, corresponding members shall
	be declared in the same order. For two structures or unions,
	corresponding bit-fields shall have the same widths. For two
	enumerations, corresponding members shall have the same
	values."

Without fancy linker-level checks for UB or LTO, you might say the UB
from declarations with incompatible types is purely theoretical, but
the bigger issue is that, when you did include both intscan.h with
"struct FILE *" and stdio_impl.h in the implementation file intscan.c,
the conflicting definitions would be visible and you'd get an error.

Yes of course we could declare it explicitly using "struct _IO_FILE *"
in intscan.h, but that would violate DRY and would break if the tag
were ever changed (e.g. in a future ABI not using glibc compat cruft).
Including a header that defines FILE really is the right way to do
this.

Rich


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-02-14  4:33 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-13 16:19 stdio review Dale Weiler
2018-02-13 16:59 ` Rich Felker
2018-02-13 19:21   ` Rich Felker
2018-02-14  3:49     ` Dale Weiler
2018-02-14  4:33       ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).