Hi,

I appreciate your reply.

Unfortunately, I am not a netlink expert. For comparison, though, Go passes 0 as the flags argument when reading from a netlink socket: https://github.com/golang/go/blob/master/src/syscall/netlink_linux.go#L79

On a side note, I could not reproduce the EAGAIN errors on Linux, so you may be right that Linux handles recv() on netlink sockets specially in a way that OSv is missing. However, I could not find anything about this in the official netlink documentation: https://man7.org/linux/man-pages/man7/netlink.7.html and https://man7.org/linux/man-pages/man7/rtnetlink.7.html. And if that behavior is undocumented, should we rely on it implicitly?
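
For illustration only, the "add the missing error handling" option from my original message might look roughly like the sketch below. It is not a patch against musl: recv_retry() and its timeout_ms parameter are hypothetical names I made up, and the idea of waiting in poll() before retrying is my assumption about what reasonable handling could be, not something the netlink docs prescribe.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of musl): try a non-blocking recv() and,
 * if no data is queued yet, wait for the socket to become readable
 * before trying again.
 *
 * Returns the number of bytes received, 0 on orderly shutdown, or -1
 * with errno set. If nothing arrives within timeout_ms, it returns -1
 * with errno == EAGAIN. Note that the timeout restarts on every retry.
 */
ssize_t recv_retry(int fd, void *buf, size_t len, int timeout_ms)
{
    for (;;) {
        ssize_t r = recv(fd, buf, len, MSG_DONTWAIT);
        if (r >= 0)
            return r;                 /* got data (or EOF) */
        if (errno == EINTR)
            continue;                 /* interrupted by a signal: retry */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                /* real error: report it to the caller */

        /* Nothing queued yet: wait until the socket is readable. */
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int n = poll(&pfd, 1, timeout_ms);
        if (n < 0 && errno == EINTR)
            continue;                 /* interrupted wait: retry */
        if (n < 0)
            return -1;                /* poll() itself failed */
        if (n == 0) {
            errno = EAGAIN;           /* timed out without any data */
            return -1;
        }
        /* Readable (or error/hangup pending): loop and let recv() decide. */
    }
}
```

In __netlink_enumerate() the call would then be something like r = recv_retry(fd, u.buf, sizeof(u.buf), 1000), with the timeout value picked arbitrarily here. Compared to simply passing 0 as flags, this keeps MSG_DONTWAIT but costs an extra loop, which is why I still lean towards the simpler change.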

Waldek

On Mon, Jun 13, 2022 at 1:08 PM Rich Felker <dalias@libc.org> wrote:
On Mon, Jun 13, 2022 at 11:41:57AM -0400, Waldek Kozaczuk wrote:
> Hi,
>
> Very recently we implemented minimal rtnetlink support on the OSv side, which
> allowed us to finally switch to the netlink-based implementation of
> getifaddrs() and if_nameindex().
>
> However, I noticed that the function __netlink_enumerate() in
> https://github.com/ifduyue/musl/blob/master/src/network/netlink.c passes
> the MSG_DONTWAIT flag to recv(), which may then fail with EAGAIN or
> EWOULDBLOCK, and there is no error or retry handling for that case. I
> actually saw both functions fail occasionally on OSv.
>
> One way to fix this is to add the missing error handling. But another, simpler
> solution is to stop using MSG_DONTWAIT altogether and force recv() to
> block. In other words, the line:
>
> r = recv(fd, u.buf, sizeof(u.buf), MSG_DONTWAIT);
>
> should change to:
>
> r = recv(fd, u.buf, sizeof(u.buf), 0);
>
> For the time being, we are applying a header trick on the OSv side to
> redefine MSG_DONTWAIT as 0 when compiling those specific musl sources.

Thanks! I'll try to track this down. One concern is that I'm not sure
how MSG_DONTWAIT is supposed to interact with "short reads" -- is it
needed (for netlink) to prevent blocking when some data has been read
but there is still buffer space for more?

On a related issue, I'm pretty sure the netlink API doesn't allow for
partial reads with some data remaining buffered on the kernel side,
but we should probably verify that too.

Rich