Hi,

I appreciate your reply. Unfortunately, I am not an expert on netlink, but
another thing I have found is that, for comparison, Golang passes 0 as the
flags argument when reading from a netlink socket:
https://github.com/golang/go/blob/master/src/syscall/netlink_linux.go#L79

On a side note, I could not reproduce the EAGAIN errors on Linux, so you may
be right that Linux has some special handling of recv() for netlink sockets
which OSv might be missing. But I could not find anything about it in the
official netlink docs:
https://man7.org/linux/man-pages/man7/netlink.7.html
https://man7.org/linux/man-pages/man7/rtnetlink.7.html

And if that behavior is undocumented, should we rely on it implicitly?

Waldek

On Mon, Jun 13, 2022 at 1:08 PM Rich Felker wrote:

> On Mon, Jun 13, 2022 at 11:41:57AM -0400, Waldek Kozaczuk wrote:
> > Hi,
> >
> > Very recently we implemented minimal rtnetlink support on the OSv side,
> > which allowed us to finally switch to the netlink-based implementation
> > of getifaddrs() and if_nameindex().
> >
> > However, I noticed that the function __netlink_enumerate() in
> > https://github.com/ifduyue/musl/blob/master/src/network/netlink.c uses
> > the MSG_DONTWAIT flag when calling recv(), which may fail with EAGAIN or
> > EWOULDBLOCK, and there is no error/retry handling for that. I actually
> > saw both functions fail occasionally on OSv.
> >
> > One way to fix this is to add the missing error handling. But another,
> > simpler solution is to stop using MSG_DONTWAIT altogether and force
> > recv() to block. In other words, the line:
> >
> > r = recv(fd, u.buf, sizeof(u.buf), MSG_DONTWAIT);
> >
> > should change to:
> >
> > r = recv(fd, u.buf, sizeof(u.buf), 0);
> >
> > For the time being we are applying a header trick on the OSv side to
> > re-define MSG_DONTWAIT as 0 when compiling those specific musl sources.
>
> Thanks! I'll try to track this down. One concern is that I'm not sure
> how MSG_DONTWAIT is supposed to interact with "short reads" -- is it
> needed (for netlink) to prevent blocking when some data has been read
> but there is still buffer space for more?
>
> On a related issue, I'm pretty sure the netlink API doesn't allow for
> partial reads with some data remaining buffered on the kernel side,
> but we should probably verify that too.
>
> Rich
>
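For illustration, here is a rough sketch of the first option mentioned in the quoted message, i.e. keeping MSG_DONTWAIT but adding the missing EAGAIN/EWOULDBLOCK handling. This is not what musl does and not a proposed patch: the helper name recv_retry(), the timeout_ms parameter, and the choice of poll() as the way to wait are all assumptions made for the example.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of musl): try a non-blocking recv() and,
 * if no data is ready yet, wait for readability with poll() and retry.
 * timeout_ms < 0 waits indefinitely. On timeout it returns -1 with
 * errno set to EAGAIN. */
static ssize_t recv_retry(int fd, void *buf, size_t len, int timeout_ms)
{
    for (;;) {
        ssize_t r = recv(fd, buf, len, MSG_DONTWAIT);
        if (r >= 0)
            return r;                 /* got a message (or EOF) */
        if (errno == EINTR)
            continue;                 /* interrupted, just retry */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                /* real error, errno is set */

        /* Nothing queued yet: wait until the socket becomes readable. */
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int n = poll(&pfd, 1, timeout_ms);
        if (n == 0) {
            errno = EAGAIN;           /* timed out without data */
            return -1;
        }
        if (n < 0 && errno != EINTR)
            return -1;                /* poll() itself failed */
    }
}
```

With such a helper, the call in question would become something like r = recv_retry(fd, u.buf, sizeof(u.buf), -1), which ends up behaving much like the blocking recv(fd, u.buf, sizeof(u.buf), 0) variant, so the simpler change may still be preferable if blocking is acceptable.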