mailing list of musl libc
* Reviewing if_nameindex and getifaddrs patch
@ 2014-07-28  0:49 Rich Felker
  2014-07-28  8:13 ` Timo Teras
  0 siblings, 1 reply; 6+ messages in thread
From: Rich Felker @ 2014-07-28  0:49 UTC
  To: musl

In regards to:

http://git.alpinelinux.org/cgit/aports/plain/main/musl/1002-reimplement-if_nameindex-and-getifaddrs-using-netlin.patch?id=3227b4ad816f850f655b6f44dc497926cb2cdcd1

I don't see any remaining major issues except that it would obviously
be nice if this could be smaller. A few minor details:

For __netlink.h, I'd been asking whether everything in this file is
used. I did some tests just commenting things out and only found a few
items that could be removed:

//#define NLM_F_MULTI   2
//#define NLM_F_ACK     4
//#define NLM_F_ATOMIC  0x400
//#define NLMSG_NOOP    0x1
//#define NLMSG_OVERRUN 0x4
//#define RTM_NEWADDR   20

I don't think it makes sense to selectively omit such a small set of
lines, so I'm fine with just leaving the unneeded ones.

One thing that seems rather out-of-place is:

#define IFADDRS_HASH_SIZE 64

This seems to be an implementation detail of the two functions using
netlink and not part of the netlink API (either the kernel's API or
musl's internal __rtnetlink_enumerate API for it). So I think it would
make sense to move this into the files that use it.

In general, static functions need not/should not have __-prefixed
names. It's not really a problem but it makes it less obvious that
they're static. It would be nice to change these and perhaps give them
slightly more descriptive names (although admittedly that hasn't been
done elsewhere in musl) just so they make sense in a debugger, etc.

I really don't understand the 'hash' logic for getifaddrs yet, but the
function seems to work. Some general description of what data the
callback receives and what it's doing with it could be helpful for
reviewing this part of the patch.

Finally, I've been trying to reduce the unnecessary usage of
__-prefixed filenames for implementation-internal purposes, except
when they implement a function whose name is __-prefixed. So perhaps
netlink.h and netlink.c, or netlink.h and __rtnetlink_enumerate.c,
would be better names for these files (I tend to prefer netlink.c I
think).

For if_nameindex, we discussed the 'hash' buckets on IRC and whether
they are necessary. I noted that it would be nice if there were a way
to build the result array directly in the callback, but this is
difficult without breaking up the allocation into lots of small pieces
since there are strings that need to be located outside of the array,
and "relocating" them on realloc requires nontrivial code. So I'm not
sure if there's anything to improve here, but it's an idea someone
else could look at.
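
To illustrate the layout problem, here is a minimal sketch, assuming
the (index, name) pairs have already been collected elsewhere.
struct name_pair and pack_nameindex() are hypothetical names used only
for this example; they are not part of the patch. Once the entry count
and total string length are known, the array and its strings can share
one allocation, so a single free() releases everything:

```c
#include <stdlib.h>
#include <string.h>
#include <net/if.h>

/* Hypothetical input record: one collected ifindex<->ifname pair. */
struct name_pair {
	unsigned index;
	const char *name;
};

/* Pack n pairs into one malloc'd block laid out as
 *   [ if_nameindex[0..n-1] | zero terminator entry | string bytes ]
 * Returns 0 on allocation failure. */
static struct if_nameindex *pack_nameindex(const struct name_pair *src, size_t n)
{
	size_t i, str_bytes = 0;
	struct if_nameindex *ifs;
	char *p;

	/* First pass: total space needed for the NUL-terminated names. */
	for (i = 0; i < n; i++) str_bytes += strlen(src[i].name) + 1;

	ifs = malloc((n + 1) * sizeof *ifs + str_bytes);
	if (!ifs) return 0;

	/* Strings start right after the terminating array entry. */
	p = (char *)(ifs + n + 1);
	for (i = 0; i < n; i++) {
		size_t len = strlen(src[i].name) + 1;
		ifs[i].if_index = src[i].index;
		ifs[i].if_name = p;
		memcpy(p, src[i].name, len);
		p += len;
	}
	ifs[n].if_index = 0;
	ifs[n].if_name = 0;
	return ifs;
}
```

Growing such a block with realloc while entries are still arriving
would invalidate every if_name pointer, which is the "relocation"
cost mentioned above.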

Rich



* Re: Reviewing if_nameindex and getifaddrs patch
  2014-07-28  0:49 Reviewing if_nameindex and getifaddrs patch Rich Felker
@ 2014-07-28  8:13 ` Timo Teras
  2014-07-29 14:34   ` Rich Felker
  2014-07-29 14:49   ` Rich Felker
  0 siblings, 2 replies; 6+ messages in thread
From: Timo Teras @ 2014-07-28  8:13 UTC
  To: Rich Felker; +Cc: musl

On Sun, 27 Jul 2014 20:49:06 -0400
Rich Felker <dalias@libc.org> wrote:

> In regards to:
> 
> http://git.alpinelinux.org/cgit/aports/plain/main/musl/1002-reimplement-if_nameindex-and-getifaddrs-using-netlin.patch?id=3227b4ad816f850f655b6f44dc497926cb2cdcd1

Updated patch at the end.

> I don't see any remaining major issues except that it would obviously
> be nice if this could be smaller. A few minor details:
> 
> For __netlink.h, I'd been asking whether everything in this file is
> used. I did some tests just commenting things out and only found a few
> items that could be removed:
> 
> //#define NLM_F_MULTI   2
> //#define NLM_F_ACK     4
> //#define NLM_F_ATOMIC  0x400
> //#define NLMSG_NOOP    0x1
> //#define NLMSG_OVERRUN 0x4
> //#define RTM_NEWADDR   20
> 
> I don't think it makes sense to selectively omit such a small set of
> lines, so I'm fine with just leaving the unneeded ones.

Yes, I wanted to add full "groups" of defines in those cases.

> One thing that seems rather out-of-place is:
> 
> #define IFADDRS_HASH_SIZE 64
> 
> This seems to be an implementation detail of the two functions using
> netlink and not part of the netlink API (either the kernel's API or
> musl's internal __rtnetlink_enumerate API for it). So I think it would
> make sense to move this into the files that use it.

Done. The only reason I put it into the header is that it affects the
internal netlink parsing code in two .c files, so I would have
preferred one #define instead of two.

> In general, static functions need not/should not have __-prefixed
> names. It's not really a problem but it makes it less obvious that
> they're static. It would be nice to change these and perhaps give them
> slightly more descriptive names (although admittedly that hasn't been
> done elsewhere in musl) just so they make sense in a debugger, etc.

Names updated.

> I really don't understand the 'hash' logic for getifaddrs yet, but the
> function seems to work. Some general description of what data the
> callback receives and what it's doing with it could be helpful for
> reviewing this part of the patch.

The rtnetlink enumeration does two netlink 'dumps': one of 'link' type,
which dumps the physical interfaces, and one of 'addr' type, which lists
all network addresses.

The 'link' dump is used to get the PF_PACKET ifaddrs with MAC
addresses, as well as the ifindex<->ifname mappings and the logical
flags of each physical interface (SIOCGIFFLAGS equivalent). These are
put into the hash, keyed by ifindex.

The 'addr' dump then provides the IPv4/IPv6 addresses (easy to extend
if we need to support something new in the future). Each of these looks
up the physical 'link' info using ifindex. This is needed to fill
ifa_flags with valid data. It is also used to look up the interface
name in the non-IPv4 case. (In the IPv4 case it seems that the kernel
always sends IFA_LABEL containing the real or aliased interface name,
though this might change if the alias interface system is removed at
some point.)
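
A minimal sketch of the ifindex-keyed lookup described above. The
names struct link_info, struct link_table, link_insert() and
link_lookup() are hypothetical, trimmed-down stand-ins for
illustration, not the structures used in the patch:

```c
#include <stddef.h>

#define HASH_SIZE 64	/* same value as IFADDRS_HASH_SIZE */

/* Hypothetical record kept for each 'link' message. */
struct link_info {
	struct link_info *hash_next;	/* next entry in the same bucket */
	unsigned index;			/* ifindex */
	unsigned flags;			/* interface flags */
	const char *name;		/* interface name */
};

struct link_table {
	struct link_info *hash[HASH_SIZE];
};

/* Called while processing the 'link' dump: chain into the bucket. */
static void link_insert(struct link_table *t, struct link_info *li)
{
	unsigned b = li->index % HASH_SIZE;
	li->hash_next = t->hash[b];
	t->hash[b] = li;
}

/* Called while processing the 'addr' dump: find the owning link,
 * or return 0 if no link with that ifindex was seen. */
static struct link_info *link_lookup(const struct link_table *t, unsigned index)
{
	struct link_info *li;
	for (li = t->hash[index % HASH_SIZE]; li; li = li->hash_next)
		if (li->index == index) return li;
	return 0;
}
```

An 'addr' entry would then copy the flags and, when no IFA_LABEL is
present, the name from the record that link_lookup() returns.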

> Finally, I've been trying to reduce the unnecessary usage of
> __-prefixed filenames for implementation-internal purposes, except
> when they implement a function whose name is __-prefixed. So perhaps
> netlink.h and netlink.c, or netlink.h and __rtnetlink_enumerate.c,
> would be better names for these files (I tend to prefer netlink.c I
> think).

Done.

> For if_nameindex, we discussed the 'hash' buckets on IRC and whether
> they are necessary. I noted that it would be nice if there were a way
> to build the result array directly in the callback, but this is
> difficult without breaking up the allocation into lots of small pieces
> since there are strings that need to be located outside of the array,
> and "relocating" them on realloc requires nontrivial code. So I'm not
> sure if there's anything to improve here, but it's an idea someone
> else could look at.

The hash in if_nameindex() is used to suppress duplicate
ifindex<->ifname mappings. These can occur for IPv4 addresses because
the kernel always reports IFA_LABEL. Even if we parse only AF_INET
'addr' messages, this is still the case, because an interface can have
multiple addresses ('ip addr add' or 'ifconfig ethX add') that are not
on an alias interface; in that case IFA_LABEL is the interface name.
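
A minimal sketch of this duplicate suppression, assuming a growable
array whose hash chains store 1-based array positions rather than
pointers, so the chains stay valid when realloc moves the array.
struct name_entry, struct name_set and name_set_add() are hypothetical
names for this example only:

```c
#include <stdlib.h>
#include <string.h>

#define HASH_SIZE 64
#define NAME_MAX_LEN 16	/* same value as IFNAMSIZ on Linux */

struct name_entry {
	unsigned hash_next;	/* 1-based position of next entry, 0 = end */
	unsigned index;		/* ifindex */
	unsigned char namelen;
	char name[NAME_MAX_LEN];
};

struct name_set {
	unsigned num, allocated;
	struct name_entry *list;
	unsigned hash[HASH_SIZE];	/* 1-based position of bucket head */
};

/* Returns 1 if the pair was added, 0 if it was a duplicate or the
 * name was too long, -1 on allocation failure. */
static int name_set_add(struct name_set *s, unsigned index,
	const char *name, size_t namelen)
{
	struct name_entry *e;
	unsigned bucket = index % HASH_SIZE, i;

	if (namelen > NAME_MAX_LEN) return 0;

	/* Walk the bucket; an equal (index, name) pair is a duplicate. */
	for (i = s->hash[bucket]; i; i = e->hash_next) {
		e = &s->list[i-1];
		if (e->index == index && e->namelen == namelen
		    && !memcmp(e->name, name, namelen))
			return 0;
	}

	if (s->num >= s->allocated) {
		size_t a = s->allocated ? s->allocated * 2 + 1 : 8;
		e = realloc(s->list, a * sizeof *e);
		if (!e) return -1;
		s->list = e;
		s->allocated = a;
	}
	e = &s->list[s->num++];
	e->index = index;
	e->namelen = namelen;
	memcpy(e->name, name, namelen);
	e->hash_next = s->hash[bucket];
	s->hash[bucket] = s->num;	/* position of the new entry, 1-based */
	return 1;
}
```

With this, a second "eth0" label for the same ifindex is dropped,
while an alias such as "eth0:1" on that ifindex is kept as a separate
entry.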

-Timo

Updated patch:

From 4788fd92ed8c83bae2f1b429c6ea370fdf557dff Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timo=20Ter=C3=A4s?= <timo.teras@iki.fi>
Date: Tue, 8 Apr 2014 14:03:16 +0000
Subject: [PATCH] reimplement if_nameindex and getifaddrs using netlink

---
 src/network/getifaddrs.c   | 314 ++++++++++++++++++++++++---------------------
 src/network/if_nameindex.c | 132 +++++++++++++------
 src/network/netlink.c      |  52 ++++++++
 src/network/netlink.h      |  94 ++++++++++++++
 4 files changed, 407 insertions(+), 185 deletions(-)
 create mode 100644 src/network/netlink.c
 create mode 100644 src/network/netlink.h

diff --git a/src/network/getifaddrs.c b/src/network/getifaddrs.c
index 5a94cc7..89a8f72 100644
--- a/src/network/getifaddrs.c
+++ b/src/network/getifaddrs.c
@@ -1,181 +1,203 @@
-/* (C) 2013 John Spencer. released under musl's standard MIT license. */
-#undef _GNU_SOURCE
 #define _GNU_SOURCE
-#include <ifaddrs.h>
-#include <stdlib.h>
-#include <net/if.h> /* IFNAMSIZ, ifreq, ifconf */
-#include <stdio.h>
-#include <ctype.h>
-#include <string.h>
 #include <errno.h>
-#include <arpa/inet.h> /* inet_pton */
+#include <string.h>
+#include <stdlib.h>
 #include <unistd.h>
-#include <sys/ioctl.h>
-#include <sys/socket.h>
+#include <ifaddrs.h>
+#include <syscall.h>
+#include <net/if.h>
+#include <netinet/in.h>
+#include "netlink.h"
 
-typedef union {
-	struct sockaddr_in6 v6;
+#define IFADDRS_HASH_SIZE 64
+
+/* getifaddrs() reports hardware addresses with PF_PACKET that implies
+ * struct sockaddr_ll.  But e.g. Infiniband socket address length is
+ * longer than sockaddr_ll.sll_addr[8] can hold. Use this hack struct
+ * to extend sll_addr - callers should be able to still use it. */
+struct sockaddr_ll_hack {
+	unsigned short sll_family, sll_protocol;
+	int sll_ifindex;
+	unsigned short sll_hatype;
+	unsigned char sll_pkttype, sll_halen;
+	unsigned char sll_addr[24];
+};
+
+union sockany {
+	struct sockaddr sa;
+	struct sockaddr_ll_hack ll;
 	struct sockaddr_in v4;
-} soa;
+	struct sockaddr_in6 v6;
+};
 
-typedef struct ifaddrs_storage {
+struct ifaddrs_storage {
 	struct ifaddrs ifa;
-	soa addr;
-	soa netmask;
-	soa dst;
+	struct ifaddrs_storage *hash_next;
+	union sockany addr, netmask, ifu;
+	unsigned int index;
 	char name[IFNAMSIZ+1];
-} stor;
-#define next ifa.ifa_next
+};
 
-static stor* list_add(stor** list, stor** head, char* ifname)
+struct ifaddrs_ctx {
+	struct ifaddrs_storage *first;
+	struct ifaddrs_storage *last;
+	struct ifaddrs_storage *hash[IFADDRS_HASH_SIZE];
+};
+
+void freeifaddrs(struct ifaddrs *ifp)
 {
-	stor* curr = calloc(1, sizeof(stor));
-	if(curr) {
-		strcpy(curr->name, ifname);
-		curr->ifa.ifa_name = curr->name;
-		if(*head) (*head)->next = (struct ifaddrs*) curr;
-		*head = curr;
-		if(!*list) *list = curr;
+	struct ifaddrs *n;
+	while (ifp) {
+		n = ifp->ifa_next;
+		free(ifp);
+		ifp = n;
 	}
-	return curr;
 }
 
-void freeifaddrs(struct ifaddrs *ifp)
+static void copy_addr(struct sockaddr **r, int af, union sockany *sa, void *addr, size_t addrlen, int ifindex)
 {
-	stor *head = (stor *) ifp;
-	while(head) {
-		void *p = head;
-		head = (stor *) head->next;
-		free(p);
+	uint8_t *dst;
+	int len;
+
+	switch (af) {
+	case AF_INET:
+		dst = (uint8_t*) &sa->v4.sin_addr;
+		len = 4;
+		break;
+	case AF_INET6:
+		dst = (uint8_t*) &sa->v6.sin6_addr;
+		len = 16;
+		if (IN6_IS_ADDR_LINKLOCAL(addr) || IN6_IS_ADDR_MC_LINKLOCAL(addr))
+			sa->v6.sin6_scope_id = ifindex;
+		break;
+	default:
+		return;
 	}
+	if (addrlen < len) return;
+	sa->sa.sa_family = af;
+	memcpy(dst, addr, len);
+	*r = &sa->sa;
 }
 
-static void ipv6netmask(unsigned prefix_length, struct sockaddr_in6 *sa)
+static void gen_netmask(struct sockaddr **r, int af, union sockany *sa, int prefixlen)
 {
-	unsigned char* hb = sa->sin6_addr.s6_addr;
-	unsigned onebytes = prefix_length / 8;
-	unsigned bits = prefix_length % 8;
-	unsigned nullbytes = 16 - onebytes;
-	memset(hb, -1, onebytes);
-	memset(hb+onebytes, 0, nullbytes);
-	if(bits) {
-		unsigned char x = -1;
-		x <<= 8 - bits;
-		hb[onebytes] = x;
-	}
+	uint8_t addr[16] = {0};
+	int i;
+
+	if (prefixlen > 8*sizeof(addr)) prefixlen = 8*sizeof(addr);
+	i = prefixlen / 8;
+	memset(addr, 0xff, i);
+	if (i < sizeof(addr)) addr[i++] = 0xff << (8 - (prefixlen % 8));
+	copy_addr(r, af, sa, addr, sizeof(addr), 0);
 }
 
-static void dealwithipv6(stor **list, stor** head)
+static void copy_lladdr(struct sockaddr **r, union sockany *sa, void *addr, size_t addrlen, int ifindex, unsigned short hatype)
 {
-	FILE* f = fopen("/proc/net/if_inet6", "rbe");
-	/* 00000000000000000000000000000001 01 80 10 80 lo
-	   A                                B  C  D  E  F
-	   all numbers in hex
-	   A = addr B=netlink device#, C=prefix length,
-	   D = scope value (ipv6.h) E = interface flags (rnetlink.h, addrconf.c)
-	   F = if name */
-	char v6conv[32 + 7 + 1], *v6;
-	char *line, linebuf[512];
-	if(!f) return;
-	while((line = fgets(linebuf, sizeof linebuf, f))) {
-		v6 = v6conv;
-		size_t i = 0;
-		for(; i < 8; i++) {
-			memcpy(v6, line, 4);
-			v6+=4;
-			*v6++=':';
-			line+=4;
-		}
-		--v6; *v6 = 0;
-		line++;
-		unsigned b, c, d, e;
-		char name[IFNAMSIZ+1];
-		if(5 == sscanf(line, "%x %x %x %x %s", &b, &c, &d, &e, name)) {
-			struct sockaddr_in6 sa = {0};
-			if(1 == inet_pton(AF_INET6, v6conv, &sa.sin6_addr)) {
-				sa.sin6_family = AF_INET6;
-				stor* curr = list_add(list, head, name);
-				if(!curr) goto out;
-				curr->addr.v6 = sa;
-				curr->ifa.ifa_addr = (struct sockaddr*) &curr->addr;
-				ipv6netmask(c, &sa);
-				curr->netmask.v6 = sa;
-				curr->ifa.ifa_netmask = (struct sockaddr*) &curr->netmask;
-				/* find ipv4 struct with the same interface name to copy flags */
-				stor* scan = *list;
-				for(;scan && strcmp(name, scan->name);scan=(stor*)scan->next);
-				if(scan) curr->ifa.ifa_flags = scan->ifa.ifa_flags;
-				else curr->ifa.ifa_flags = 0;
-			} else errno = 0;
-		}
-	}
-	out:
-	fclose(f);
+	if (addrlen > sizeof(sa->ll.sll_addr)) return;
+	sa->ll.sll_family = AF_PACKET;
+	sa->ll.sll_ifindex = ifindex;
+	sa->ll.sll_hatype = hatype;
+	sa->ll.sll_halen = addrlen;
+	memcpy(sa->ll.sll_addr, addr, addrlen);
+	*r = &sa->sa;
 }
 
-int getifaddrs(struct ifaddrs **ifap)
+static int netlink_msg_to_ifaddr(void *pctx, struct nlmsghdr *h)
 {
-	stor *list = 0, *head = 0;
-	struct if_nameindex* ii = if_nameindex();
-	if(!ii) return -1;
-	size_t i;
-	for(i = 0; ii[i].if_index || ii[i].if_name; i++) {
-		stor* curr = list_add(&list, &head, ii[i].if_name);
-		if(!curr) {
-			if_freenameindex(ii);
-			goto err2;
+	struct ifaddrs_ctx *ctx = pctx;
+	struct ifaddrs_storage *ifs, *ifs0;
+	struct ifinfomsg *ifi = NLMSG_DATA(h);
+	struct ifaddrmsg *ifa = NLMSG_DATA(h);
+	struct rtattr *rta;
+	int stats_len = 0;
+
+	if (h->nlmsg_type == RTM_NEWLINK) {
+		for (rta = NLMSG_RTA(h, sizeof(*ifi)); NLMSG_RTAOK(rta, h); rta = RTA_NEXT(rta)) {
+			if (rta->rta_type != IFLA_STATS) continue;
+			stats_len = RTA_DATALEN(rta);
+			break;
 		}
+	} else {
+		for (ifs0 = ctx->hash[ifa->ifa_index % IFADDRS_HASH_SIZE]; ifs0; ifs0 = ifs0->hash_next)
+			if (ifs0->index == ifa->ifa_index)
+				break;
+		if (!ifs0) return 0;
 	}
-	if_freenameindex(ii);
-
-	int sock = socket(PF_INET, SOCK_DGRAM|SOCK_CLOEXEC, IPPROTO_IP);
-	if(sock == -1) goto err2;
-	struct ifreq reqs[32]; /* arbitrary chosen boundary */
-	struct ifconf conf = {.ifc_len = sizeof reqs, .ifc_req = reqs};
-	if(-1 == ioctl(sock, SIOCGIFCONF, &conf)) goto err;
-	size_t reqitems = conf.ifc_len / sizeof(struct ifreq);
-	for(head = list; head; head = (stor*)head->next) {
-		for(i = 0; i < reqitems; i++) {
-			// get SIOCGIFADDR of active interfaces.
-			if(!strcmp(reqs[i].ifr_name, head->name)) {
-				head->addr.v4 = *(struct sockaddr_in*)&reqs[i].ifr_addr;
-				head->ifa.ifa_addr = (struct sockaddr*) &head->addr;
+
+	ifs = calloc(1, sizeof(struct ifaddrs_storage) + stats_len);
+	if (ifs == 0) return -1;
+
+	if (h->nlmsg_type == RTM_NEWLINK) {
+		ifs->index = ifi->ifi_index;
+		ifs->ifa.ifa_flags = ifi->ifi_flags;
+
+		for (rta = NLMSG_RTA(h, sizeof(*ifi)); NLMSG_RTAOK(rta, h); rta = RTA_NEXT(rta)) {
+			switch (rta->rta_type) {
+			case IFLA_IFNAME:
+				if (RTA_DATALEN(rta) < sizeof(ifs->name)) {
+					memcpy(ifs->name, RTA_DATA(rta), RTA_DATALEN(rta));
+					ifs->ifa.ifa_name = ifs->name;
+				}
+				break;
+			case IFLA_ADDRESS:
+				copy_lladdr(&ifs->ifa.ifa_addr, &ifs->addr, RTA_DATA(rta), RTA_DATALEN(rta), ifi->ifi_index, ifi->ifi_type);
+				break;
+			case IFLA_BROADCAST:
+				copy_lladdr(&ifs->ifa.ifa_broadaddr, &ifs->ifu, RTA_DATA(rta), RTA_DATALEN(rta), ifi->ifi_index, ifi->ifi_type);
+				break;
+			case IFLA_STATS:
+				ifs->ifa.ifa_data = (void*)(ifs+1);
+				memcpy(ifs->ifa.ifa_data, RTA_DATA(rta), RTA_DATALEN(rta));
 				break;
 			}
 		}
-		struct ifreq req;
-		snprintf(req.ifr_name, sizeof req.ifr_name, "%s", head->name);
-		if(-1 == ioctl(sock, SIOCGIFFLAGS, &req)) goto err;
-
-		head->ifa.ifa_flags = req.ifr_flags;
-		if(head->ifa.ifa_addr) {
-			/* or'ing flags with IFF_LOWER_UP on active interfaces to mimic glibc */
-			head->ifa.ifa_flags |= IFF_LOWER_UP; 
-			if(-1 == ioctl(sock, SIOCGIFNETMASK, &req)) goto err;
-			head->netmask.v4 = *(struct sockaddr_in*)&req.ifr_netmask;
-			head->ifa.ifa_netmask = (struct sockaddr*) &head->netmask;
-	
-			if(head->ifa.ifa_flags & IFF_POINTOPOINT) {
-				if(-1 == ioctl(sock, SIOCGIFDSTADDR, &req)) goto err;
-				head->dst.v4 = *(struct sockaddr_in*)&req.ifr_dstaddr;
-			} else {
-				if(-1 == ioctl(sock, SIOCGIFBRDADDR, &req)) goto err;
-				head->dst.v4 = *(struct sockaddr_in*)&req.ifr_broadaddr;
+		if (ifs->ifa.ifa_name) {
+			unsigned int bucket = ifs->index % IFADDRS_HASH_SIZE;
+			ifs->hash_next = ctx->hash[bucket];
+			ctx->hash[bucket] = ifs;
+		}
+	} else {
+		ifs->ifa.ifa_name = ifs0->ifa.ifa_name;
+		ifs->ifa.ifa_flags = ifs0->ifa.ifa_flags;
+		for (rta = NLMSG_RTA(h, sizeof(*ifa)); NLMSG_RTAOK(rta, h); rta = RTA_NEXT(rta)) {
+			switch (rta->rta_type) {
+			case IFA_ADDRESS:
+				copy_addr(&ifs->ifa.ifa_addr, ifa->ifa_family, &ifs->addr, RTA_DATA(rta), RTA_DATALEN(rta), ifa->ifa_index);
+				break;
+			case IFA_BROADCAST:
+				/* For point-to-point links this is peer, but ifa_broadaddr
+				 * and ifa_dstaddr are union, so this works for both.  */
+				copy_addr(&ifs->ifa.ifa_broadaddr, ifa->ifa_family, &ifs->ifu, RTA_DATA(rta), RTA_DATALEN(rta), ifa->ifa_index);
+				break;
+			case IFA_LABEL:
+				if (RTA_DATALEN(rta) < sizeof(ifs->name)) {
+					memcpy(ifs->name, RTA_DATA(rta), RTA_DATALEN(rta));
+					ifs->ifa.ifa_name = ifs->name;
+				}
+				break;
 			}
-			head->ifa.ifa_ifu.ifu_dstaddr = (struct sockaddr*) &head->dst;
 		}
+		if (ifs->ifa.ifa_addr)
+			gen_netmask(&ifs->ifa.ifa_netmask, ifa->ifa_family, &ifs->netmask, ifa->ifa_prefixlen);
+	}
+
+	if (ifs->ifa.ifa_name) {
+		if (!ctx->first) ctx->first = ifs;
+		if (ctx->last) ctx->last->ifa.ifa_next = &ifs->ifa;
+		ctx->last = ifs;
+	} else {
+		free(ifs);
 	}
-	close(sock);
-	void* last = 0;
-	for(head = list; head; head=(stor*)head->next) last=head;
-	head = last;
-	dealwithipv6(&list, &head);
-	*ifap = (struct ifaddrs*) list;
 	return 0;
-	err:
-	close(sock);
-	err2:
-	freeifaddrs((struct ifaddrs*) list);
-	return -1;
 }
 
+int getifaddrs(struct ifaddrs **ifap)
+{
+	struct ifaddrs_ctx _ctx, *ctx = &_ctx;
+	int r;
+	memset(ctx, 0, sizeof *ctx);
+	r = __rtnetlink_enumerate(AF_UNSPEC, AF_UNSPEC, netlink_msg_to_ifaddr, ctx);
+	if (r == 0) *ifap = &ctx->first->ifa;
+	else freeifaddrs(&ctx->first->ifa);
+	return r;
+}
diff --git a/src/network/if_nameindex.c b/src/network/if_nameindex.c
index 53b80b2..f8dda54 100644
--- a/src/network/if_nameindex.c
+++ b/src/network/if_nameindex.c
@@ -1,55 +1,109 @@
 #define _GNU_SOURCE
 #include <net/if.h>
-#include <stdlib.h>
-#include <sys/socket.h>
-#include <sys/ioctl.h>
 #include <errno.h>
-#include "syscall.h"
+#include <unistd.h>
+#include <stdlib.h>
+#include <string.h>
+#include "netlink.h"
 
-static void *do_nameindex(int s, size_t n)
-{
-	size_t i, len, k;
-	struct ifconf conf;
-	struct if_nameindex *idx;
+#define IFADDRS_HASH_SIZE 64
 
-	idx = malloc(n * (sizeof(struct if_nameindex)+sizeof(struct ifreq)));
-	if (!idx) return 0;
+struct ifnamemap {
+	unsigned int hash_next;
+	unsigned int index;
+	unsigned char namelen;
+	char name[IFNAMSIZ];
+};
 
-	conf.ifc_buf = (void *)&idx[n];
-	conf.ifc_len = len = n * sizeof(struct ifreq);
-	if (ioctl(s, SIOCGIFCONF, &conf) < 0) {
-		free(idx);
-		return 0;
-	}
-	if (conf.ifc_len == len) {
-		free(idx);
-		return (void *)-1;
+struct ifnameindexctx {
+	unsigned int num, allocated, str_bytes;
+	struct ifnamemap *list;
+	unsigned int hash[IFADDRS_HASH_SIZE];
+};
+
+static int netlink_msg_to_nameindex(void *pctx, struct nlmsghdr *h)
+{
+	struct ifnameindexctx *ctx = pctx;
+	struct ifnamemap *map;
+	struct rtattr *rta;
+	unsigned int i;
+	int index, type, namelen, bucket;
+
+	if (h->nlmsg_type == RTM_NEWLINK) {
+		struct ifinfomsg *ifi = NLMSG_DATA(h);
+		index = ifi->ifi_index;
+		type = IFLA_IFNAME;
+		rta = NLMSG_RTA(h, sizeof(*ifi));
+	} else {
+		struct ifaddrmsg *ifa = NLMSG_DATA(h);
+		index = ifa->ifa_index;
+		type = IFA_LABEL;
+		rta = NLMSG_RTA(h, sizeof(*ifa));
 	}
+	for (; NLMSG_RTAOK(rta, h); rta = RTA_NEXT(rta)) {
+		if (rta->rta_type != type) continue;
 
-	n = conf.ifc_len / sizeof(struct ifreq);
-	for (i=k=0; i<n; i++) {
-		if (ioctl(s, SIOCGIFINDEX, &conf.ifc_req[i]) < 0) {
-			k++;
-			continue;
+		namelen = RTA_DATALEN(rta) - 1;
+		if (namelen > IFNAMSIZ) return 0;
+
+		/* suppress duplicates */
+		bucket = index % IFADDRS_HASH_SIZE;
+		i = ctx->hash[bucket];
+		while (i) {
+			map = &ctx->list[i-1];
+			if (map->index == index &&
+			    map->namelen == namelen &&
+			    memcmp(map->name, RTA_DATA(rta), namelen) == 0)
+				return 0;
+			i = map->hash_next;
 		}
-		idx[i-k].if_index = conf.ifc_req[i].ifr_ifindex;
-		idx[i-k].if_name = conf.ifc_req[i].ifr_name;
-	}
-	idx[i-k].if_name = 0;
-	idx[i-k].if_index = 0;
 
-	return idx;
+		if (ctx->num >= ctx->allocated) {
+			size_t a = ctx->allocated ? ctx->allocated * 2 + 1 : 8;
+			if (a > SIZE_MAX/sizeof *map) return -1;
+			map = realloc(ctx->list, a * sizeof *map);
+			if (!map) return -1;
+			ctx->list = map;
+			ctx->allocated = a;
+		}
+		map = &ctx->list[ctx->num];
+		map->index = index;
+		map->namelen = namelen;
+		memcpy(map->name, RTA_DATA(rta), namelen);
+		ctx->str_bytes += namelen + 1;
+		ctx->num++;
+		map->hash_next = ctx->hash[bucket];
+		ctx->hash[bucket] = ctx->num;
+		return 0;
+	}
+	return 0;
 }
 
 struct if_nameindex *if_nameindex()
 {
-	size_t n;
-	void *p = 0;
-	int s = socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0);
-	if (s>=0) {
-		for (n=0; (p=do_nameindex(s, n)) == (void *)-1; n++);
-		__syscall(SYS_close, s);
+	struct ifnameindexctx _ctx, *ctx = &_ctx;
+	struct if_nameindex *ifs = 0, *d;
+	struct ifnamemap *s;
+	char *p;
+	int i;
+
+	memset(ctx, 0, sizeof(*ctx));
+	if (__rtnetlink_enumerate(AF_UNSPEC, AF_INET, netlink_msg_to_nameindex, ctx) < 0) goto err;
+
+	ifs = malloc(sizeof(struct if_nameindex[ctx->num+1]) + ctx->str_bytes);
+	if (!ifs) goto err;
+
+	p = (char*)(ifs + ctx->num + 1);
+	for (i = ctx->num, d = ifs, s = ctx->list; i; i--, s++, d++) {
+		d->if_index = s->index;
+		d->if_name = p;
+		memcpy(p, s->name, s->namelen);
+		p += s->namelen;
+		*p++ = 0;
 	}
-	errno = ENOBUFS;
-	return p;
+	d->if_index = 0;
+	d->if_name = 0;
+err:
+	free(ctx->list);
+	return ifs;
 }
diff --git a/src/network/netlink.c b/src/network/netlink.c
new file mode 100644
index 0000000..94dba7f
--- /dev/null
+++ b/src/network/netlink.c
@@ -0,0 +1,52 @@
+#include <errno.h>
+#include <string.h>
+#include <syscall.h>
+#include <sys/socket.h>
+#include "netlink.h"
+
+static int __netlink_enumerate(int fd, unsigned int seq, int type, int af,
+	int (*cb)(void *ctx, struct nlmsghdr *h), void *ctx)
+{
+	struct nlmsghdr *h;
+	union {
+		uint8_t buf[8192];
+		struct {
+			struct nlmsghdr nlh;
+			struct rtgenmsg g;
+		} req;
+		struct nlmsghdr reply;
+	} u;
+	int r, ret;
+
+	memset(&u.req, 0, sizeof(u.req));
+	u.req.nlh.nlmsg_len = sizeof(u.req);
+	u.req.nlh.nlmsg_type = type;
+	u.req.nlh.nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST;
+	u.req.nlh.nlmsg_seq = seq;
+	u.req.g.rtgen_family = af;
+	r = send(fd, &u.req, sizeof(u.req), 0);
+	if (r < 0) return r;
+
+	while (1) {
+		r = recv(fd, u.buf, sizeof(u.buf), MSG_DONTWAIT);
+		if (r <= 0) return -1;
+		for (h = &u.reply; NLMSG_OK(h, (void*)&u.buf[r]); h = NLMSG_NEXT(h)) {
+			if (h->nlmsg_type == NLMSG_DONE) return 0;
+			if (h->nlmsg_type == NLMSG_ERROR) return -1;
+			ret = cb(ctx, h);
+			if (ret) return ret;
+		}
+	}
+}
+
+int __rtnetlink_enumerate(int link_af, int addr_af, int (*cb)(void *ctx, struct nlmsghdr *h), void *ctx)
+{
+	int fd, r;
+
+	fd = socket(PF_NETLINK, SOCK_RAW|SOCK_CLOEXEC, NETLINK_ROUTE);
+	if (fd < 0) return -1;
+	r = __netlink_enumerate(fd, 1, RTM_GETLINK, link_af, cb, ctx);
+	if (!r) r = __netlink_enumerate(fd, 2, RTM_GETADDR, addr_af, cb, ctx);
+	__syscall(SYS_close,fd);
+	return r;
+}
diff --git a/src/network/netlink.h b/src/network/netlink.h
new file mode 100644
index 0000000..20700ac
--- /dev/null
+++ b/src/network/netlink.h
@@ -0,0 +1,94 @@
+#include <stdint.h>
+
+/* linux/netlink.h */
+
+#define NETLINK_ROUTE 0
+
+struct nlmsghdr {
+	uint32_t	nlmsg_len;
+	uint16_t	nlmsg_type;
+	uint16_t	nlmsg_flags;
+	uint32_t	nlmsg_seq;
+	uint32_t	nlmsg_pid;
+};
+
+#define NLM_F_REQUEST	1
+#define NLM_F_MULTI	2
+#define NLM_F_ACK	4
+
+#define NLM_F_ROOT	0x100
+#define NLM_F_MATCH	0x200
+#define NLM_F_ATOMIC	0x400
+#define NLM_F_DUMP	(NLM_F_ROOT|NLM_F_MATCH)
+
+#define NLMSG_NOOP	0x1
+#define NLMSG_ERROR	0x2
+#define NLMSG_DONE	0x3
+#define NLMSG_OVERRUN	0x4
+
+/* linux/rtnetlink.h */
+
+#define RTM_NEWLINK	16
+#define RTM_GETLINK	18
+#define RTM_NEWADDR	20
+#define RTM_GETADDR	22
+
+struct rtattr {
+	unsigned short	rta_len;
+	unsigned short	rta_type;
+};
+
+struct rtgenmsg {
+	unsigned char	rtgen_family;
+};
+
+struct ifinfomsg {
+	unsigned char	ifi_family;
+	unsigned char	__ifi_pad;
+	unsigned short	ifi_type;
+	int		ifi_index;
+	unsigned	ifi_flags;
+	unsigned	ifi_change;
+};
+
+/* linux/if_link.h */
+
+#define IFLA_ADDRESS	1
+#define IFLA_BROADCAST	2
+#define IFLA_IFNAME	3
+#define IFLA_STATS	7
+
+/* linux/if_addr.h */
+
+struct ifaddrmsg {
+	uint8_t		ifa_family;
+	uint8_t		ifa_prefixlen;
+	uint8_t		ifa_flags;
+	uint8_t		ifa_scope;
+	uint32_t	ifa_index;
+};
+
+#define IFA_ADDRESS	1
+#define IFA_LOCAL	2
+#define IFA_LABEL	3
+#define IFA_BROADCAST	4
+
+/* musl */
+
+#define NETLINK_ALIGN(len)	(((len)+3) & ~3)
+#define NLMSG_DATA(nlh)		((void*)((char*)(nlh)+sizeof(struct nlmsghdr)))
+#define NLMSG_DATALEN(nlh)	((nlh)->nlmsg_len-sizeof(struct nlmsghdr))
+#define NLMSG_DATAEND(nlh)	((char*)(nlh)+(nlh)->nlmsg_len)
+#define NLMSG_NEXT(nlh)		(struct nlmsghdr*)((char*)(nlh)+NETLINK_ALIGN((nlh)->nlmsg_len))
+#define NLMSG_OK(nlh,end)	((char*)(end)-(char*)(nlh) >= sizeof(struct nlmsghdr))
+
+#define RTA_DATA(rta)		((void*)((char*)(rta)+sizeof(struct rtattr)))
+#define RTA_DATALEN(rta)	((rta)->rta_len-sizeof(struct rtattr))
+#define RTA_DATAEND(rta)	((char*)(rta)+(rta)->rta_len)
+#define RTA_NEXT(rta)		(struct rtattr*)((char*)(rta)+NETLINK_ALIGN((rta)->rta_len))
+#define RTA_OK(rta,end)		((char*)(end)-(char*)(rta) >= sizeof(struct rtattr))
+
+#define NLMSG_RTA(nlh,len)	((void*)((char*)(nlh)+sizeof(struct nlmsghdr)+NETLINK_ALIGN(len)))
+#define NLMSG_RTAOK(rta,nlh)	RTA_OK(rta,NLMSG_DATAEND(nlh))
+
+int __rtnetlink_enumerate(int link_af, int addr_af, int (*cb)(void *ctx, struct nlmsghdr *h), void *ctx);
-- 
2.0.3




* Re: Reviewing if_nameindex and getifaddrs patch
  2014-07-28  8:13 ` Timo Teras
@ 2014-07-29 14:34   ` Rich Felker
  2014-07-29 14:49   ` Rich Felker
  1 sibling, 0 replies; 6+ messages in thread
From: Rich Felker @ 2014-07-29 14:34 UTC
  To: musl

On Mon, Jul 28, 2014 at 11:13:27AM +0300, Timo Teras wrote:
> > I really don't understand the 'hash' logic for getifaddrs yet, but the
> > function seems to work. Some general description of what data the
> > callback receives and what it's doing with it could be helpful for
> > reviewing this part of the patch.
> 
> The rtnetlink enumeration does two netlink 'dumps': one of 'link' type,
> which dumps the physical interfaces, and one of 'addr' type, which lists
> all network addresses.
> 
> The 'link' dump is used to get the PF_PACKET ifaddrs with MAC
> addresses, as well as the ifindex<->ifname mappings and the logical
> flags of each physical interface (SIOCGIFFLAGS equivalent). These are
> put into the hash, keyed by ifindex.
> 
> The 'addr' dump then provides the IPv4/IPv6 addresses (easy to extend
> if we need to support something new in the future). Each of these looks
> up the physical 'link' info using ifindex. This is needed to fill
> ifa_flags with valid data. It is also used to look up the interface
> name in the non-IPv4 case. (In the IPv4 case it seems that the kernel
> always sends IFA_LABEL containing the real or aliased interface name,
> though this might change if the alias interface system is removed at
> some point.)

I see. One random thought I had, since you're merging information from
two sources: is there any atomicity guarantee here, or is it possible
that interfaces have changed between the link and addr dumps? I don't
think issues like this should block getting it committed since it's at
least better than what we have now with respect to atomicity, but I
noticed the unused NLM_F_ATOMIC flag and wonder if that could be used
to perform both requests atomically (if they're not already atomic).

Aside from that I'm going to apply the patch again, look at the
resulting code, and make sure I don't see anything else that needs to
be fixed. If it looks ok, I'll go ahead and apply it.

Rich



* Re: Reviewing if_nameindex and getifaddrs patch
  2014-07-28  8:13 ` Timo Teras
  2014-07-29 14:34   ` Rich Felker
@ 2014-07-29 14:49   ` Rich Felker
  2014-07-30  0:55     ` Rich Felker
  1 sibling, 1 reply; 6+ messages in thread
From: Rich Felker @ 2014-07-29 14:49 UTC
  To: musl

On Mon, Jul 28, 2014 at 11:13:27AM +0300, Timo Teras wrote:
> On Sun, 27 Jul 2014 20:49:06 -0400
> Rich Felker <dalias@libc.org> wrote:
> 
> > In regards to:
> > 
> > http://git.alpinelinux.org/cgit/aports/plain/main/musl/1002-reimplement-if_nameindex-and-getifaddrs-using-netlin.patch?id=3227b4ad816f850f655b6f44dc497926cb2cdcd1
> 
> Updated patch at the end.

I can't get the patch to apply. Not sure whether email ate it or if
there's an underlying problem.

Rich



* Re: Reviewing if_nameindex and getifaddrs patch
  2014-07-29 14:49   ` Rich Felker
@ 2014-07-30  0:55     ` Rich Felker
  2014-07-30  0:58       ` Rich Felker
  0 siblings, 1 reply; 6+ messages in thread
From: Rich Felker @ 2014-07-30  0:55 UTC
  To: musl

On Tue, Jul 29, 2014 at 10:49:43AM -0400, Rich Felker wrote:
> On Mon, Jul 28, 2014 at 11:13:27AM +0300, Timo Teras wrote:
> > On Sun, 27 Jul 2014 20:49:06 -0400
> > Rich Felker <dalias@libc.org> wrote:
> > 
> > > In regards to:
> > > 
> > > http://git.alpinelinux.org/cgit/aports/plain/main/musl/1002-reimplement-if_nameindex-and-getifaddrs-using-netlin.patch?id=3227b4ad816f850f655b6f44dc497926cb2cdcd1
> > 
> > Updated patch at the end.
> 
> I can't get the patch to apply. Not sure whether email ate it or if
> there's an underlying problem.

Based on the version of this patch re-sent off-list, which I'm
attaching, I've reviewed it and fixed up a few remaining issues:

- Missing pthread_setcancelstate in if_nameindex.
- Possibly wrong errno from if_nameindex (needs to set ENOBUFS).

With these changes I'm committing it; we can make further improvements
later if needed. Perhaps getifaddrs should also avoid being
cancellable, but it's non-POSIX and thus this isn't really specified,
and the old one was already cancellable so there's no regression.
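
For readers unfamiliar with the pattern, here is a minimal sketch of
the two fixes listed above, assuming a generic worker function. It is
not the committed code: call_uncancellable() and enumerate_stub() are
hypothetical names, and the stub merely stands in for the netlink
enumeration.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the real work: returns its argument,
 * so passing 0 simulates a failure. */
static void *enumerate_stub(void *arg)
{
	return arg;
}

/* Run fn with thread cancellation disabled, restore the caller's
 * cancel state afterwards, and report failure as ENOBUFS. */
static void *call_uncancellable(void *(*fn)(void *), void *arg)
{
	int cs;
	void *r;

	pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &cs);
	r = fn(arg);	/* socket calls made here cannot act on cancellation */
	pthread_setcancelstate(cs, 0);

	if (!r) errno = ENOBUFS;
	return r;
}
```

Disabling cancellation around the socket, send and recv calls avoids
leaking the file descriptor or a partially built list if the thread is
cancelled mid-call.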

Thanks for working on this and being patient.

Rich



* Re: Reviewing if_nameindex and getifaddrs patch
  2014-07-30  0:55     ` Rich Felker
@ 2014-07-30  0:58       ` Rich Felker
  0 siblings, 0 replies; 6+ messages in thread
From: Rich Felker @ 2014-07-30  0:58 UTC
  To: musl

[-- Attachment #1: Type: text/plain, Size: 807 bytes --]

On Tue, Jul 29, 2014 at 08:55:44PM -0400, Rich Felker wrote:
> On Tue, Jul 29, 2014 at 10:49:43AM -0400, Rich Felker wrote:
> > On Mon, Jul 28, 2014 at 11:13:27AM +0300, Timo Teras wrote:
> > > On Sun, 27 Jul 2014 20:49:06 -0400
> > > Rich Felker <dalias@libc.org> wrote:
> > > 
> > > > In regards to:
> > > > 
> > > > http://git.alpinelinux.org/cgit/aports/plain/main/musl/1002-reimplement-if_nameindex-and-getifaddrs-using-netlin.patch?id=3227b4ad816f850f655b6f44dc497926cb2cdcd1
> > > 
> > > Updated patch in the end.
> > 
> > I can't get the patch to apply. Not sure whether email ate it or if
> > there's an underlying problem.
> 
> Based on the version of this patch re-sent off-list, which I'm
> attaching, I've reviewed it and fixed up a few remaining issues:

Forgot to attach -- here it is.

Rich

[-- Attachment #2: ZEYZ --]
[-- Type: text/plain, Size: 19542 bytes --]

From 137e4bc5ad24ba7e261483378cb8de549777fa44 Mon Sep 17 00:00:00 2001
From: Timo Teräs <timo.teras@iki.fi>
Date: Tue, 8 Apr 2014 14:03:16 +0000
Subject: [PATCH] reimplement if_nameindex and getifaddrs using netlink

---
 src/network/getifaddrs.c   | 314 ++++++++++++++++++++++++---------------------
 src/network/if_nameindex.c | 132 +++++++++++++------
 src/network/netlink.c      |  52 ++++++++
 src/network/netlink.h      |  94 ++++++++++++++
 4 files changed, 407 insertions(+), 185 deletions(-)
 create mode 100644 src/network/netlink.c
 create mode 100644 src/network/netlink.h

diff --git a/src/network/getifaddrs.c b/src/network/getifaddrs.c
index 5a94cc7..89a8f72 100644
--- a/src/network/getifaddrs.c
+++ b/src/network/getifaddrs.c
@@ -1,181 +1,203 @@
-/* (C) 2013 John Spencer. released under musl's standard MIT license. */
-#undef _GNU_SOURCE
 #define _GNU_SOURCE
-#include <ifaddrs.h>
-#include <stdlib.h>
-#include <net/if.h> /* IFNAMSIZ, ifreq, ifconf */
-#include <stdio.h>
-#include <ctype.h>
-#include <string.h>
 #include <errno.h>
-#include <arpa/inet.h> /* inet_pton */
+#include <string.h>
+#include <stdlib.h>
 #include <unistd.h>
-#include <sys/ioctl.h>
-#include <sys/socket.h>
+#include <ifaddrs.h>
+#include <syscall.h>
+#include <net/if.h>
+#include <netinet/in.h>
+#include "netlink.h"
 
-typedef union {
-	struct sockaddr_in6 v6;
+#define IFADDRS_HASH_SIZE 64
+
+/* getifaddrs() reports hardware addresses with PF_PACKET, which implies
+ * struct sockaddr_ll.  But e.g. the Infiniband hardware address is
+ * longer than sockaddr_ll.sll_addr[8] can hold. Use this hack struct
+ * to extend sll_addr - callers should still be able to use it. */
+struct sockaddr_ll_hack {
+	unsigned short sll_family, sll_protocol;
+	int sll_ifindex;
+	unsigned short sll_hatype;
+	unsigned char sll_pkttype, sll_halen;
+	unsigned char sll_addr[24];
+};
+
+union sockany {
+	struct sockaddr sa;
+	struct sockaddr_ll_hack ll;
 	struct sockaddr_in v4;
-} soa;
+	struct sockaddr_in6 v6;
+};
 
-typedef struct ifaddrs_storage {
+struct ifaddrs_storage {
 	struct ifaddrs ifa;
-	soa addr;
-	soa netmask;
-	soa dst;
+	struct ifaddrs_storage *hash_next;
+	union sockany addr, netmask, ifu;
+	unsigned int index;
 	char name[IFNAMSIZ+1];
-} stor;
-#define next ifa.ifa_next
+};
 
-static stor* list_add(stor** list, stor** head, char* ifname)
+struct ifaddrs_ctx {
+	struct ifaddrs_storage *first;
+	struct ifaddrs_storage *last;
+	struct ifaddrs_storage *hash[IFADDRS_HASH_SIZE];
+};
+
+void freeifaddrs(struct ifaddrs *ifp)
 {
-	stor* curr = calloc(1, sizeof(stor));
-	if(curr) {
-		strcpy(curr->name, ifname);
-		curr->ifa.ifa_name = curr->name;
-		if(*head) (*head)->next = (struct ifaddrs*) curr;
-		*head = curr;
-		if(!*list) *list = curr;
+	struct ifaddrs *n;
+	while (ifp) {
+		n = ifp->ifa_next;
+		free(ifp);
+		ifp = n;
 	}
-	return curr;
 }
 
-void freeifaddrs(struct ifaddrs *ifp)
+static void copy_addr(struct sockaddr **r, int af, union sockany *sa, void *addr, size_t addrlen, int ifindex)
 {
-	stor *head = (stor *) ifp;
-	while(head) {
-		void *p = head;
-		head = (stor *) head->next;
-		free(p);
+	uint8_t *dst;
+	int len;
+
+	switch (af) {
+	case AF_INET:
+		dst = (uint8_t*) &sa->v4.sin_addr;
+		len = 4;
+		break;
+	case AF_INET6:
+		dst = (uint8_t*) &sa->v6.sin6_addr;
+		len = 16;
+		if (IN6_IS_ADDR_LINKLOCAL(addr) || IN6_IS_ADDR_MC_LINKLOCAL(addr))
+			sa->v6.sin6_scope_id = ifindex;
+		break;
+	default:
+		return;
 	}
+	if (addrlen < len) return;
+	sa->sa.sa_family = af;
+	memcpy(dst, addr, len);
+	*r = &sa->sa;
 }
 
-static void ipv6netmask(unsigned prefix_length, struct sockaddr_in6 *sa)
+static void gen_netmask(struct sockaddr **r, int af, union sockany *sa, int prefixlen)
 {
-	unsigned char* hb = sa->sin6_addr.s6_addr;
-	unsigned onebytes = prefix_length / 8;
-	unsigned bits = prefix_length % 8;
-	unsigned nullbytes = 16 - onebytes;
-	memset(hb, -1, onebytes);
-	memset(hb+onebytes, 0, nullbytes);
-	if(bits) {
-		unsigned char x = -1;
-		x <<= 8 - bits;
-		hb[onebytes] = x;
-	}
+	uint8_t addr[16] = {0};
+	int i;
+
+	if (prefixlen > 8*sizeof(addr)) prefixlen = 8*sizeof(addr);
+	i = prefixlen / 8;
+	memset(addr, 0xff, i);
+	if (i < sizeof(addr)) addr[i++] = 0xff << (8 - (prefixlen % 8));
+	copy_addr(r, af, sa, addr, sizeof(addr), 0);
 }
 
-static void dealwithipv6(stor **list, stor** head)
+static void copy_lladdr(struct sockaddr **r, union sockany *sa, void *addr, size_t addrlen, int ifindex, unsigned short hatype)
 {
-	FILE* f = fopen("/proc/net/if_inet6", "rbe");
-	/* 00000000000000000000000000000001 01 80 10 80 lo
-	   A                                B  C  D  E  F
-	   all numbers in hex
-	   A = addr B=netlink device#, C=prefix length,
-	   D = scope value (ipv6.h) E = interface flags (rnetlink.h, addrconf.c)
-	   F = if name */
-	char v6conv[32 + 7 + 1], *v6;
-	char *line, linebuf[512];
-	if(!f) return;
-	while((line = fgets(linebuf, sizeof linebuf, f))) {
-		v6 = v6conv;
-		size_t i = 0;
-		for(; i < 8; i++) {
-			memcpy(v6, line, 4);
-			v6+=4;
-			*v6++=':';
-			line+=4;
-		}
-		--v6; *v6 = 0;
-		line++;
-		unsigned b, c, d, e;
-		char name[IFNAMSIZ+1];
-		if(5 == sscanf(line, "%x %x %x %x %s", &b, &c, &d, &e, name)) {
-			struct sockaddr_in6 sa = {0};
-			if(1 == inet_pton(AF_INET6, v6conv, &sa.sin6_addr)) {
-				sa.sin6_family = AF_INET6;
-				stor* curr = list_add(list, head, name);
-				if(!curr) goto out;
-				curr->addr.v6 = sa;
-				curr->ifa.ifa_addr = (struct sockaddr*) &curr->addr;
-				ipv6netmask(c, &sa);
-				curr->netmask.v6 = sa;
-				curr->ifa.ifa_netmask = (struct sockaddr*) &curr->netmask;
-				/* find ipv4 struct with the same interface name to copy flags */
-				stor* scan = *list;
-				for(;scan && strcmp(name, scan->name);scan=(stor*)scan->next);
-				if(scan) curr->ifa.ifa_flags = scan->ifa.ifa_flags;
-				else curr->ifa.ifa_flags = 0;
-			} else errno = 0;
-		}
-	}
-	out:
-	fclose(f);
+	if (addrlen > sizeof(sa->ll.sll_addr)) return;
+	sa->ll.sll_family = AF_PACKET;
+	sa->ll.sll_ifindex = ifindex;
+	sa->ll.sll_hatype = hatype;
+	sa->ll.sll_halen = addrlen;
+	memcpy(sa->ll.sll_addr, addr, addrlen);
+	*r = &sa->sa;
 }
 
-int getifaddrs(struct ifaddrs **ifap)
+static int netlink_msg_to_ifaddr(void *pctx, struct nlmsghdr *h)
 {
-	stor *list = 0, *head = 0;
-	struct if_nameindex* ii = if_nameindex();
-	if(!ii) return -1;
-	size_t i;
-	for(i = 0; ii[i].if_index || ii[i].if_name; i++) {
-		stor* curr = list_add(&list, &head, ii[i].if_name);
-		if(!curr) {
-			if_freenameindex(ii);
-			goto err2;
+	struct ifaddrs_ctx *ctx = pctx;
+	struct ifaddrs_storage *ifs, *ifs0;
+	struct ifinfomsg *ifi = NLMSG_DATA(h);
+	struct ifaddrmsg *ifa = NLMSG_DATA(h);
+	struct rtattr *rta;
+	int stats_len = 0;
+
+	if (h->nlmsg_type == RTM_NEWLINK) {
+		for (rta = NLMSG_RTA(h, sizeof(*ifi)); NLMSG_RTAOK(rta, h); rta = RTA_NEXT(rta)) {
+			if (rta->rta_type != IFLA_STATS) continue;
+			stats_len = RTA_DATALEN(rta);
+			break;
 		}
+	} else {
+		for (ifs0 = ctx->hash[ifa->ifa_index % IFADDRS_HASH_SIZE]; ifs0; ifs0 = ifs0->hash_next)
+			if (ifs0->index == ifa->ifa_index)
+				break;
+		if (!ifs0) return 0;
 	}
-	if_freenameindex(ii);
-
-	int sock = socket(PF_INET, SOCK_DGRAM|SOCK_CLOEXEC, IPPROTO_IP);
-	if(sock == -1) goto err2;
-	struct ifreq reqs[32]; /* arbitrary chosen boundary */
-	struct ifconf conf = {.ifc_len = sizeof reqs, .ifc_req = reqs};
-	if(-1 == ioctl(sock, SIOCGIFCONF, &conf)) goto err;
-	size_t reqitems = conf.ifc_len / sizeof(struct ifreq);
-	for(head = list; head; head = (stor*)head->next) {
-		for(i = 0; i < reqitems; i++) {
-			// get SIOCGIFADDR of active interfaces.
-			if(!strcmp(reqs[i].ifr_name, head->name)) {
-				head->addr.v4 = *(struct sockaddr_in*)&reqs[i].ifr_addr;
-				head->ifa.ifa_addr = (struct sockaddr*) &head->addr;
+
+	ifs = calloc(1, sizeof(struct ifaddrs_storage) + stats_len);
+	if (ifs == 0) return -1;
+
+	if (h->nlmsg_type == RTM_NEWLINK) {
+		ifs->index = ifi->ifi_index;
+		ifs->ifa.ifa_flags = ifi->ifi_flags;
+
+		for (rta = NLMSG_RTA(h, sizeof(*ifi)); NLMSG_RTAOK(rta, h); rta = RTA_NEXT(rta)) {
+			switch (rta->rta_type) {
+			case IFLA_IFNAME:
+				if (RTA_DATALEN(rta) < sizeof(ifs->name)) {
+					memcpy(ifs->name, RTA_DATA(rta), RTA_DATALEN(rta));
+					ifs->ifa.ifa_name = ifs->name;
+				}
+				break;
+			case IFLA_ADDRESS:
+				copy_lladdr(&ifs->ifa.ifa_addr, &ifs->addr, RTA_DATA(rta), RTA_DATALEN(rta), ifi->ifi_index, ifi->ifi_type);
+				break;
+			case IFLA_BROADCAST:
+				copy_lladdr(&ifs->ifa.ifa_broadaddr, &ifs->ifu, RTA_DATA(rta), RTA_DATALEN(rta), ifi->ifi_index, ifi->ifi_type);
+				break;
+			case IFLA_STATS:
+				ifs->ifa.ifa_data = (void*)(ifs+1);
+				memcpy(ifs->ifa.ifa_data, RTA_DATA(rta), RTA_DATALEN(rta));
 				break;
 			}
 		}
-		struct ifreq req;
-		snprintf(req.ifr_name, sizeof req.ifr_name, "%s", head->name);
-		if(-1 == ioctl(sock, SIOCGIFFLAGS, &req)) goto err;
-
-		head->ifa.ifa_flags = req.ifr_flags;
-		if(head->ifa.ifa_addr) {
-			/* or'ing flags with IFF_LOWER_UP on active interfaces to mimic glibc */
-			head->ifa.ifa_flags |= IFF_LOWER_UP; 
-			if(-1 == ioctl(sock, SIOCGIFNETMASK, &req)) goto err;
-			head->netmask.v4 = *(struct sockaddr_in*)&req.ifr_netmask;
-			head->ifa.ifa_netmask = (struct sockaddr*) &head->netmask;
-	
-			if(head->ifa.ifa_flags & IFF_POINTOPOINT) {
-				if(-1 == ioctl(sock, SIOCGIFDSTADDR, &req)) goto err;
-				head->dst.v4 = *(struct sockaddr_in*)&req.ifr_dstaddr;
-			} else {
-				if(-1 == ioctl(sock, SIOCGIFBRDADDR, &req)) goto err;
-				head->dst.v4 = *(struct sockaddr_in*)&req.ifr_broadaddr;
+		if (ifs->ifa.ifa_name) {
+			unsigned int bucket = ifs->index % IFADDRS_HASH_SIZE;
+			ifs->hash_next = ctx->hash[bucket];
+			ctx->hash[bucket] = ifs;
+		}
+	} else {
+		ifs->ifa.ifa_name = ifs0->ifa.ifa_name;
+		ifs->ifa.ifa_flags = ifs0->ifa.ifa_flags;
+		for (rta = NLMSG_RTA(h, sizeof(*ifa)); NLMSG_RTAOK(rta, h); rta = RTA_NEXT(rta)) {
+			switch (rta->rta_type) {
+			case IFA_ADDRESS:
+				copy_addr(&ifs->ifa.ifa_addr, ifa->ifa_family, &ifs->addr, RTA_DATA(rta), RTA_DATALEN(rta), ifa->ifa_index);
+				break;
+			case IFA_BROADCAST:
+				/* For point-to-point links this is the peer address, but
+				 * ifa_broadaddr and ifa_dstaddr are a union, so this works for both.  */
+				copy_addr(&ifs->ifa.ifa_broadaddr, ifa->ifa_family, &ifs->ifu, RTA_DATA(rta), RTA_DATALEN(rta), ifa->ifa_index);
+				break;
+			case IFA_LABEL:
+				if (RTA_DATALEN(rta) < sizeof(ifs->name)) {
+					memcpy(ifs->name, RTA_DATA(rta), RTA_DATALEN(rta));
+					ifs->ifa.ifa_name = ifs->name;
+				}
+				break;
 			}
-			head->ifa.ifa_ifu.ifu_dstaddr = (struct sockaddr*) &head->dst;
 		}
+		if (ifs->ifa.ifa_addr)
+			gen_netmask(&ifs->ifa.ifa_netmask, ifa->ifa_family, &ifs->netmask, ifa->ifa_prefixlen);
+	}
+
+	if (ifs->ifa.ifa_name) {
+		if (!ctx->first) ctx->first = ifs;
+		if (ctx->last) ctx->last->ifa.ifa_next = &ifs->ifa;
+		ctx->last = ifs;
+	} else {
+		free(ifs);
 	}
-	close(sock);
-	void* last = 0;
-	for(head = list; head; head=(stor*)head->next) last=head;
-	head = last;
-	dealwithipv6(&list, &head);
-	*ifap = (struct ifaddrs*) list;
 	return 0;
-	err:
-	close(sock);
-	err2:
-	freeifaddrs((struct ifaddrs*) list);
-	return -1;
 }
 
+int getifaddrs(struct ifaddrs **ifap)
+{
+	struct ifaddrs_ctx _ctx, *ctx = &_ctx;
+	int r;
+	memset(ctx, 0, sizeof *ctx);
+	r = __rtnetlink_enumerate(AF_UNSPEC, AF_UNSPEC, netlink_msg_to_ifaddr, ctx);
+	if (r == 0) *ifap = &ctx->first->ifa;
+	else freeifaddrs(&ctx->first->ifa);
+	return r;
+}
diff --git a/src/network/if_nameindex.c b/src/network/if_nameindex.c
index 53b80b2..f8dda54 100644
--- a/src/network/if_nameindex.c
+++ b/src/network/if_nameindex.c
@@ -1,55 +1,109 @@
 #define _GNU_SOURCE
 #include <net/if.h>
-#include <stdlib.h>
-#include <sys/socket.h>
-#include <sys/ioctl.h>
 #include <errno.h>
-#include "syscall.h"
+#include <unistd.h>
+#include <stdlib.h>
+#include <string.h>
+#include "netlink.h"
 
-static void *do_nameindex(int s, size_t n)
-{
-	size_t i, len, k;
-	struct ifconf conf;
-	struct if_nameindex *idx;
+#define IFADDRS_HASH_SIZE 64
 
-	idx = malloc(n * (sizeof(struct if_nameindex)+sizeof(struct ifreq)));
-	if (!idx) return 0;
+struct ifnamemap {
+	unsigned int hash_next;
+	unsigned int index;
+	unsigned char namelen;
+	char name[IFNAMSIZ];
+};
 
-	conf.ifc_buf = (void *)&idx[n];
-	conf.ifc_len = len = n * sizeof(struct ifreq);
-	if (ioctl(s, SIOCGIFCONF, &conf) < 0) {
-		free(idx);
-		return 0;
-	}
-	if (conf.ifc_len == len) {
-		free(idx);
-		return (void *)-1;
+struct ifnameindexctx {
+	unsigned int num, allocated, str_bytes;
+	struct ifnamemap *list;
+	unsigned int hash[IFADDRS_HASH_SIZE];
+};
+
+static int netlink_msg_to_nameindex(void *pctx, struct nlmsghdr *h)
+{
+	struct ifnameindexctx *ctx = pctx;
+	struct ifnamemap *map;
+	struct rtattr *rta;
+	unsigned int i;
+	int index, type, namelen, bucket;
+
+	if (h->nlmsg_type == RTM_NEWLINK) {
+		struct ifinfomsg *ifi = NLMSG_DATA(h);
+		index = ifi->ifi_index;
+		type = IFLA_IFNAME;
+		rta = NLMSG_RTA(h, sizeof(*ifi));
+	} else {
+		struct ifaddrmsg *ifa = NLMSG_DATA(h);
+		index = ifa->ifa_index;
+		type = IFA_LABEL;
+		rta = NLMSG_RTA(h, sizeof(*ifa));
 	}
+	for (; NLMSG_RTAOK(rta, h); rta = RTA_NEXT(rta)) {
+		if (rta->rta_type != type) continue;
 
-	n = conf.ifc_len / sizeof(struct ifreq);
-	for (i=k=0; i<n; i++) {
-		if (ioctl(s, SIOCGIFINDEX, &conf.ifc_req[i]) < 0) {
-			k++;
-			continue;
+		namelen = RTA_DATALEN(rta) - 1;
+		if (namelen > IFNAMSIZ) return 0;
+
+		/* suppress duplicates */
+		bucket = index % IFADDRS_HASH_SIZE;
+		i = ctx->hash[bucket];
+		while (i) {
+			map = &ctx->list[i-1];
+			if (map->index == index &&
+			    map->namelen == namelen &&
+			    memcmp(map->name, RTA_DATA(rta), namelen) == 0)
+				return 0;
+			i = map->hash_next;
 		}
-		idx[i-k].if_index = conf.ifc_req[i].ifr_ifindex;
-		idx[i-k].if_name = conf.ifc_req[i].ifr_name;
-	}
-	idx[i-k].if_name = 0;
-	idx[i-k].if_index = 0;
 
-	return idx;
+		if (ctx->num >= ctx->allocated) {
+			size_t a = ctx->allocated ? ctx->allocated * 2 + 1 : 8;
+			if (a > SIZE_MAX/sizeof *map) return -1;
+			map = realloc(ctx->list, a * sizeof *map);
+			if (!map) return -1;
+			ctx->list = map;
+			ctx->allocated = a;
+		}
+		map = &ctx->list[ctx->num];
+		map->index = index;
+		map->namelen = namelen;
+		memcpy(map->name, RTA_DATA(rta), namelen);
+		ctx->str_bytes += namelen + 1;
+		ctx->num++;
+		map->hash_next = ctx->hash[bucket];
+		ctx->hash[bucket] = ctx->num;
+		return 0;
+	}
+	return 0;
 }
 
 struct if_nameindex *if_nameindex()
 {
-	size_t n;
-	void *p = 0;
-	int s = socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0);
-	if (s>=0) {
-		for (n=0; (p=do_nameindex(s, n)) == (void *)-1; n++);
-		__syscall(SYS_close, s);
+	struct ifnameindexctx _ctx, *ctx = &_ctx;
+	struct if_nameindex *ifs = 0, *d;
+	struct ifnamemap *s;
+	char *p;
+	int i;
+
+	memset(ctx, 0, sizeof(*ctx));
+	if (__rtnetlink_enumerate(AF_UNSPEC, AF_INET, netlink_msg_to_nameindex, ctx) < 0) goto err;
+
+	ifs = malloc(sizeof(struct if_nameindex[ctx->num+1]) + ctx->str_bytes);
+	if (!ifs) goto err;
+
+	p = (char*)(ifs + ctx->num + 1);
+	for (i = ctx->num, d = ifs, s = ctx->list; i; i--, s++, d++) {
+		d->if_index = s->index;
+		d->if_name = p;
+		memcpy(p, s->name, s->namelen);
+		p += s->namelen;
+		*p++ = 0;
 	}
-	errno = ENOBUFS;
-	return p;
+	d->if_index = 0;
+	d->if_name = 0;
+err:
+	free(ctx->list);
+	return ifs;
 }
diff --git a/src/network/netlink.c b/src/network/netlink.c
new file mode 100644
index 0000000..94dba7f
--- /dev/null
+++ b/src/network/netlink.c
@@ -0,0 +1,52 @@
+#include <errno.h>
+#include <string.h>
+#include <syscall.h>
+#include <sys/socket.h>
+#include "netlink.h"
+
+static int __netlink_enumerate(int fd, unsigned int seq, int type, int af,
+	int (*cb)(void *ctx, struct nlmsghdr *h), void *ctx)
+{
+	struct nlmsghdr *h;
+	union {
+		uint8_t buf[8192];
+		struct {
+			struct nlmsghdr nlh;
+			struct rtgenmsg g;
+		} req;
+		struct nlmsghdr reply;
+	} u;
+	int r, ret;
+
+	memset(&u.req, 0, sizeof(u.req));
+	u.req.nlh.nlmsg_len = sizeof(u.req);
+	u.req.nlh.nlmsg_type = type;
+	u.req.nlh.nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST;
+	u.req.nlh.nlmsg_seq = seq;
+	u.req.g.rtgen_family = af;
+	r = send(fd, &u.req, sizeof(u.req), 0);
+	if (r < 0) return r;
+
+	while (1) {
+		r = recv(fd, u.buf, sizeof(u.buf), MSG_DONTWAIT);
+		if (r <= 0) return -1;
+		for (h = &u.reply; NLMSG_OK(h, (void*)&u.buf[r]); h = NLMSG_NEXT(h)) {
+			if (h->nlmsg_type == NLMSG_DONE) return 0;
+			if (h->nlmsg_type == NLMSG_ERROR) return -1;
+			ret = cb(ctx, h);
+			if (ret) return ret;
+		}
+	}
+}
+
+int __rtnetlink_enumerate(int link_af, int addr_af, int (*cb)(void *ctx, struct nlmsghdr *h), void *ctx)
+{
+	int fd, r;
+
+	fd = socket(PF_NETLINK, SOCK_RAW|SOCK_CLOEXEC, NETLINK_ROUTE);
+	if (fd < 0) return -1;
+	r = __netlink_enumerate(fd, 1, RTM_GETLINK, link_af, cb, ctx);
+	if (!r) r = __netlink_enumerate(fd, 2, RTM_GETADDR, addr_af, cb, ctx);
+	__syscall(SYS_close,fd);
+	return r;
+}
diff --git a/src/network/netlink.h b/src/network/netlink.h
new file mode 100644
index 0000000..20700ac
--- /dev/null
+++ b/src/network/netlink.h
@@ -0,0 +1,94 @@
+#include <stdint.h>
+
+/* linux/netlink.h */
+
+#define NETLINK_ROUTE 0
+
+struct nlmsghdr {
+	uint32_t	nlmsg_len;
+	uint16_t	nlmsg_type;
+	uint16_t	nlmsg_flags;
+	uint32_t	nlmsg_seq;
+	uint32_t	nlmsg_pid;
+};
+
+#define NLM_F_REQUEST	1
+#define NLM_F_MULTI	2
+#define NLM_F_ACK	4
+
+#define NLM_F_ROOT	0x100
+#define NLM_F_MATCH	0x200
+#define NLM_F_ATOMIC	0x400
+#define NLM_F_DUMP	(NLM_F_ROOT|NLM_F_MATCH)
+
+#define NLMSG_NOOP	0x1
+#define NLMSG_ERROR	0x2
+#define NLMSG_DONE	0x3
+#define NLMSG_OVERRUN	0x4
+
+/* linux/rtnetlink.h */
+
+#define RTM_NEWLINK	16
+#define RTM_GETLINK	18
+#define RTM_NEWADDR	20
+#define RTM_GETADDR	22
+
+struct rtattr {
+	unsigned short	rta_len;
+	unsigned short	rta_type;
+};
+
+struct rtgenmsg {
+	unsigned char	rtgen_family;
+};
+
+struct ifinfomsg {
+	unsigned char	ifi_family;
+	unsigned char	__ifi_pad;
+	unsigned short	ifi_type;
+	int		ifi_index;
+	unsigned	ifi_flags;
+	unsigned	ifi_change;
+};
+
+/* linux/if_link.h */
+
+#define IFLA_ADDRESS	1
+#define IFLA_BROADCAST	2
+#define IFLA_IFNAME	3
+#define IFLA_STATS	7
+
+/* linux/if_addr.h */
+
+struct ifaddrmsg {
+	uint8_t		ifa_family;
+	uint8_t		ifa_prefixlen;
+	uint8_t		ifa_flags;
+	uint8_t		ifa_scope;
+	uint32_t	ifa_index;
+};
+
+#define IFA_ADDRESS	1
+#define IFA_LOCAL	2
+#define IFA_LABEL	3
+#define IFA_BROADCAST	4
+
+/* musl */
+
+#define NETLINK_ALIGN(len)	(((len)+3) & ~3)
+#define NLMSG_DATA(nlh)		((void*)((char*)(nlh)+sizeof(struct nlmsghdr)))
+#define NLMSG_DATALEN(nlh)	((nlh)->nlmsg_len-sizeof(struct nlmsghdr))
+#define NLMSG_DATAEND(nlh)	((char*)(nlh)+(nlh)->nlmsg_len)
+#define NLMSG_NEXT(nlh)		(struct nlmsghdr*)((char*)(nlh)+NETLINK_ALIGN((nlh)->nlmsg_len))
+#define NLMSG_OK(nlh,end)	((char*)(end)-(char*)(nlh) >= sizeof(struct nlmsghdr))
+
+#define RTA_DATA(rta)		((void*)((char*)(rta)+sizeof(struct rtattr)))
+#define RTA_DATALEN(rta)	((rta)->rta_len-sizeof(struct rtattr))
+#define RTA_DATAEND(rta)	((char*)(rta)+(rta)->rta_len)
+#define RTA_NEXT(rta)		(struct rtattr*)((char*)(rta)+NETLINK_ALIGN((rta)->rta_len))
+#define RTA_OK(rta,end)		((char*)(end)-(char*)(rta) >= sizeof(struct rtattr))
+
+#define NLMSG_RTA(nlh,len)	((void*)((char*)(nlh)+sizeof(struct nlmsghdr)+NETLINK_ALIGN(len)))
+#define NLMSG_RTAOK(rta,nlh)	RTA_OK(rta,NLMSG_DATAEND(nlh))
+
+int __rtnetlink_enumerate(int link_af, int addr_af, int (*cb)(void *ctx, struct nlmsghdr *h), void *ctx);
-- 
2.0.3
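
The prefix-length-to-netmask step in gen_netmask above can be tried in
isolation. The following is a minimal sketch: prefix_to_mask is a
hypothetical helper, not part of the patch, and it writes into a
caller-supplied 16-byte buffer instead of calling copy_addr.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill a 16-byte mask with 'prefixlen' leading
 * one bits, clamping the prefix length to 128 bits. */
static void prefix_to_mask(uint8_t mask[16], unsigned prefixlen)
{
	unsigned full;

	memset(mask, 0, 16);
	if (prefixlen > 128) prefixlen = 128;

	/* Whole bytes covered by the prefix are all ones. */
	full = prefixlen / 8;
	memset(mask, 0xff, full);

	/* The next byte gets the remaining high bits. When the prefix
	 * is a multiple of 8 the shift count is 8 and the truncated
	 * result is 0, so that byte stays clear. */
	if (full < 16)
		mask[full] = (uint8_t)(0xffu << (8 - prefixlen % 8));
}
```

For an IPv4 address only the first four bytes of the buffer are
meaningful, which matches how copy_addr takes len = 4 for AF_INET.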


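The callbacks in the patch walk attributes with NLMSG_RTA, NLMSG_RTAOK
and RTA_NEXT. As a rough, self-contained illustration of that kind of
walk over length-prefixed, 4-byte-aligned records, here is a sketch
with an explicit length check per record. attr_hdr, ATTR_ALIGN and
find_attr are hypothetical names, not musl or kernel API, and the
extra bounds checks are an assumption about what a defensive walker
might want rather than a description of what the patch does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical header with the same layout as struct rtattr. */
struct attr_hdr {
	unsigned short len;  /* header plus payload, excluding padding */
	unsigned short type;
};

#define ATTR_ALIGN(n) (((n) + 3u) & ~3u)

/* Return a pointer to the payload of the first attribute of the given
 * type and store its payload length in *plen. Returns a null pointer
 * if the attribute is absent or a record is malformed. */
static const void *find_attr(const unsigned char *buf, size_t buflen,
	unsigned short type, size_t *plen)
{
	size_t off = 0;

	while (buflen - off >= sizeof(struct attr_hdr)) {
		struct attr_hdr h;
		size_t step;

		/* Copy the header out to avoid unaligned access. */
		memcpy(&h, buf + off, sizeof h);

		/* Reject records shorter than a header or running
		 * past the end of the buffer. */
		if (h.len < sizeof h || h.len > buflen - off) return 0;

		if (h.type == type) {
			*plen = h.len - sizeof h;
			return buf + off + sizeof h;
		}

		/* Advance by the padded record length. */
		step = ATTR_ALIGN(h.len);
		if (step > buflen - off) return 0;
		off += step;
	}
	return 0;
}
```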

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
2014-07-28  0:49 Reviewing if_nameindex and getifaddrs patch Rich Felker
2014-07-28  8:13 ` Timo Teras
2014-07-29 14:34   ` Rich Felker
2014-07-29 14:49   ` Rich Felker
2014-07-30  0:55     ` Rich Felker
2014-07-30  0:58       ` Rich Felker
