mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Dale Weiler <weilercdale@gmail.com>
To: musl@lists.openwall.com
Subject: Re: stdio review
Date: Tue, 13 Feb 2018 22:49:04 -0500	[thread overview]
Message-ID: <CAE5zrZm-nVZVRs5nNOOEiw=247+AFXtJ6u20vj2mxnrv4B4bEw@mail.gmail.com> (raw)
In-Reply-To: <20180213192131.GO1627@brightrain.aerifal.cx>

>> fgetpos.c:
>>     fgetpos:                                                       [bug]
>>         using *(off_t*) to write _Int64 data into fpos_t structure when
>>         memcpy should be used, this breaks strict aliasing. Maybe add
>>         an off_t to the fpos_t union?

> My leaning would be to add the off_t, but the type might not be
> exposed and thus we would need to find a matching type that is
> exposed. memcpy would be the nicest solution, but only if we had a way
> of allowing the compiler to use builtin memcpy; otherwise it's a
> gratuitous call.

Seeing as the type _is_ exposed, adding the off_t to the union is likewise
the nicest solution. Getting the compiler to use it's builtin memcpy, while
using things like -fno-builtin seems more challenging here. If the type did
need to be hidden, there's always the possibility of using the __may_alias__
stuff that memcpy/memset do but that seems more gratuitous to me.

>> fmemopen.c:
>>     mseek:                                                       [style]
>>         It does goto upwards.

> I guess you could call it that, but it's into a block with no path
> out, so I don't think I would.

It's one of the rarer instances where goto is used unconventionally. I say
that because most uses of goto, especially in the case where they're
in response to an error go down.

>         Compound literal table to reference whence as a lookup table
>         as a single expression.

> I thought this was cute.

It's definitely cute, but it does depend on the seek argument being
one of the macro definitions in the [0, 2] range which I had to check,
obviously those have no reason to ever be anything but those values;
ABI and all but it's just additional mental load to ensure they weren't
hence why I brought it up.

>> fwrite.c:
>>     fwrite:                                                   [question]
>>         Should there be a check for size*nmemb overflow?

> This is actually a complicated topic. Formally, I think the C standard
> reads such that it would be valid for size*nmemb to exceed the size of
> the data object to be written if you somehow know you'll hit a write
> error before that happens. However real world implementations don't
> work like that. In particular, the kernel will error out with EFAULT
> if the buffer length extends past the valid userspace address range,
> even if the writes would never happen; the only way to avoid this
> would be to break longer-than-page writes down into separate
> page-sized writes. So I think for practical purposes, we have to
> interpret the standard as requiring that size*nmemb actually reflect
> the size of the object passed in, and in that case, the multiplication
> necessarily does not overflow. If there's an interpretation from WG14
> contrary to this, we'll have to revisit it.

> See also https://sourceware.org/bugzilla/show_bug.cgi?id=19165

That is an interesting and somewhat odd edge case. Maybe for the time
being a comment within here w.r.t it maybe needing to be revisited
wouldn't hurt. In either case it doesn't appear to be harming anything.

>> gets.c:
>>     gets:                                                     [optimize]
>>         The length of the string is calculated twice to strip the
>>         newline character off it. Why not rewrite it as:
>>         if (ret) { size_t i = strlen(s)-1; if (s[i] == '\n') s[i] = 0; }

> Seriously, this is gets. It's always unsafe, deprecated, removed from
> the current C standard. If it's gratuitously slow too, great. :-)

Yes it's gets, but fixing it for O(n) instead of O(n*2) does make the musl
static set slightly smaller, also makes programs using it crash twice as
fast ;-)

>> stdio_impl.h:                                                    [style]
>>     FUNLOCK macro has a trailing else which prompted me to look at every
>>     single instance of FUNLOCK to make sure all of them contained a
>>     semicolon. This is just dangerous, why not use the more common
>>     idiom of do { } while (0).

> Indeed that should be fine.

I think it's better understood by most folks as well, glad we're on the same
page w.r.t this one. At least then you cannot fail to forget the semicolon.

>> intscan.h:                                                       [style]
>>     It isn't apparent for why <stdio.h> needs to be included. Should
>>     just forward declare struct FILE; here instead.

> That would not work, because it's *not* struct FILE, it's FILE, which
> happens to be defined as "struct _IO_FILE", but that's an
> implementation detail. Including <stdio.h> is the clean way to have
> that.

I don't understand why you couldn't replicate that behavior. It's what
stdio.h already _does_ and seeing as the associated translation unit
already includes stdio.h it seems gratuitously excessive. It's just an
opaque pointer type being passed, how is a forward declaration
incorrect. Does C distinguish between "opaque T" and "opaque T"
with different underlying struct? If so I have many of code that needs
to be changed on my end.

>> floatscan.h:                                                     [style]
>>     It isn't apparent for why <stdio.h> needs to be included. Should
>>     just forward declare struct FILE; here instead.

> Same.

Same


  reply	other threads:[~2018-02-14  3:49 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-13 16:19 Dale Weiler
2018-02-13 16:59 ` Rich Felker
2018-02-13 19:21   ` Rich Felker
2018-02-14  3:49     ` Dale Weiler [this message]
2018-02-14  4:33       ` Rich Felker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAE5zrZm-nVZVRs5nNOOEiw=247+AFXtJ6u20vj2mxnrv4B4bEw@mail.gmail.com' \
    --to=weilercdale@gmail.com \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).