From: Waldek Kozaczuk
Date: Mon, 13 Jun 2022 14:38:45 -0400
To: Rich Felker
Cc: musl@lists.openwall.com
Subject: Re: [musl] netlink.c: missing handling of EAGAIN and EWOULDBLOCK
In-Reply-To: <20220613170849.GG7074@brightrain.aerifal.cx>

Hi,

I appreciate your reply.

Unfortunately, I am not an expert on netlink, but another thing I have
found is that, for comparison, Golang passes 0 as the flags argument
when reading from a netlink socket:
https://github.com/golang/go/blob/master/src/syscall/netlink_linux.go#L79

On a side note, I could not replicate the EAGAIN errors on Linux, so you
may be right that Linux has some special handling of recv() for netlink
sockets which OSv might be missing. But I could not find anything about
it in the official netlink docs:
https://man7.org/linux/man-pages/man7/netlink.7.html
https://man7.org/linux/man-pages/man7/rtnetlink.7.html

And if that behavior is undocumented, should we rely on it implicitly?

Waldek

On Mon, Jun 13, 2022 at 1:08 PM Rich Felker wrote:

> On Mon, Jun 13, 2022 at 11:41:57AM -0400, Waldek Kozaczuk wrote:
> > Hi,
> >
> > Very recently we implemented minimal rtnetlink support on the OSv side,
> > which allowed us to finally switch to the netlink-based implementation
> > of getifaddrs() and if_nameindex().
> >
> > However, I noticed that the function __netlink_enumerate() in
> > https://github.com/ifduyue/musl/blob/master/src/network/netlink.c uses
> > the MSG_DONTWAIT flag when calling recv(), which may fail with EAGAIN or
> > EWOULDBLOCK, and there is no error/retry handling for that. I actually
> > saw both functions fail occasionally on OSv.
> >
> > One way to fix this is to add the missing error handling. But another,
> > simpler solution is to stop using MSG_DONTWAIT altogether and force
> > recv() to block. In other words, the line:
> >
> > r = recv(fd, u.buf, sizeof(u.buf), MSG_DONTWAIT);
> >
> > should change to:
> >
> > r = recv(fd, u.buf, sizeof(u.buf), 0);
> >
> > For the time being we are applying a header trick on the OSv side to
> > re-define MSG_DONTWAIT as 0 when compiling those specific musl sources.
>
> Thanks! I'll try to track this down. One concern is that I'm not sure
> how MSG_DONTWAIT is supposed to interact with "short reads" -- is it
> needed (for netlink) to prevent blocking when some data has been read
> but there is still buffer space for more?
>
> On a related issue, I'm pretty sure the netlink API doesn't allow for
> partial reads with some data remaining buffered on the kernel side,
> but we should probably verify that too.
>
> Rich
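
For reference, the first option mentioned in the quoted message (keeping MSG_DONTWAIT and adding the missing retry handling) could look roughly like the sketch below. This is only an illustration, not musl code: recv_retry() and its timeout_ms parameter are hypothetical names, and using poll() to wait for readability is an assumption about how one might handle EAGAIN/EWOULDBLOCK, not something the thread settled on.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of musl): a non-blocking recv() that
 * waits for the socket to become readable and retries, instead of
 * failing immediately on EAGAIN/EWOULDBLOCK.
 * timeout_ms is handed to poll(); -1 means wait indefinitely. */
static ssize_t recv_retry(int fd, void *buf, size_t len, int timeout_ms)
{
	for (;;) {
		ssize_t r = recv(fd, buf, len, MSG_DONTWAIT);
		if (r >= 0)
			return r;               /* data (or EOF/empty datagram) */
		if (errno == EINTR)
			continue;               /* interrupted: just retry */
		if (errno != EAGAIN && errno != EWOULDBLOCK)
			return -1;              /* real error, errno is set */

		/* Nothing queued yet: wait until the socket is readable. */
		struct pollfd pfd = { .fd = fd, .events = POLLIN };
		int n = poll(&pfd, 1, timeout_ms);
		if (n == 0) {
			errno = EAGAIN;         /* timed out, still no data */
			return -1;
		}
		if (n < 0 && errno != EINTR)
			return -1;              /* poll() itself failed */
		/* readable (or EINTR): loop and call recv() again */
	}
}
```

With timeout_ms set to -1 this behaves much like the blocking recv(fd, u.buf, sizeof(u.buf), 0) variant for a single call, so whether it is preferable depends on the open question above about how MSG_DONTWAIT interacts with short reads on netlink sockets.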