Oh, one more thing: we might be able to use sendmsg and IP_PKTINFO to select the outgoing interface for each send call instead of binding and requiring multiple sockets. David On Thu, Mar 7, 2024 at 5:30 PM David Schinazi wrote: > > > On Thu, Mar 7, 2024 at 4:08 PM Rich Felker wrote: > >> On Thu, Mar 07, 2024 at 02:50:53PM -0800, David Schinazi wrote: >> > On Wed, Mar 6, 2024 at 6:42 PM Rich Felker wrote: >> > >> > > On Wed, Mar 06, 2024 at 04:17:44PM -0800, David Schinazi wrote: >> > > > As Jeffrey points out, when the IETF decided to standardize mDNS, >> they >> > > > published it (RFC 6762) at the same time as the Special-Use Domain >> > > Registry >> > > > (RFC 6761) which created a process for reserving domain names for >> custom >> > > > purposes, and ".local" was one of the initial entries into that >> registry. >> > > > The UTF-8 vs punycode issue when it comes to mDNS and DNS is >> somewhat of >> > > a >> > > > mess. It was discussed in Section 16 of RFC 6762 but at the end of >> the >> > > day >> > > > punycode won. Even Apple's implementation of getaddrinfo will >> perform >> > > > punycode conversion for .local instead of sending the UTF-8. So in >> > > practice >> > > > you wouldn't need to special-case anything here. >> > > >> > > OK, these are both really good news! >> > > >> > > > There's also very much a policy matter of what "locally over >> > > > > multicast" means (what the user wants it to mean). Which >> interfaces >> > > > > should be queried? Wired and wireless ethernet? VPN links or other >> > > > > sorts of tunnels? Just one local interface (which one to >> prioritize) >> > > > > or all of them? Only if the network is "trusted"? Etc. >> > > > > >> > > > >> > > > You're absolutely right. Most mDNS systems try all non-loopback >> non-p2p >> > > > multicast-supporting interfaces, but sending to the default route >> > > interface >> > > > would be a good start, more on that below. 
>> > > >> > > This is really one thing that suggests a need for configurability >> > > outside of what libc might be able to offer. With normal DNS lookups, >> > > they're something you can block off and prevent from going to the >> > > network at all by policy (and in fact they don't go past the loopback >> > > by default, in the absence of a resolv.conf file). Adding mDNS that's >> > > on-by-default and not configurable would make a vector for network >> > > traffic being generated that's probably not expected and that could be >> > > a privacy leak. >> > > >> > >> > Totally agree. I was thinking through this both in terms of RFCs and in >> > terms of minimal code changes, and had a potential idea. Conceptually, >> > sending DNS to localhost is musl's IPC mechanism to a more feature-rich >> > resolver running in user-space. So when that's happening, we don't want >> to >> > mess with it because that could cause a privacy leak. Conversely, when >> > there's a non-loopback IP configured in resolv.conf, then musl acts as a >> > DNS stub resolver and the server in resolv.conf acts as a DNS recursive >> > resolver. In that scenario, sending the .local query over DNS to that >> other >> > host violates the RFCs. This allows us to treat the configured resolver >> > address as an implicit configuration mechanism that allows us to >> > selectively enable this without impacting anyone doing their own DNS >> > locally. >> >> This sounds like an odd overloading of one thing to have a very >> different meaning, and would break builtin mDNS for anyone doing >> DNSSEC right (which requires validating nameserver on localhost). >> Inventing a knob that's an overload of an existing knob is still >> inventing a knob, just worse. >> > > Sorry, I was suggesting the other way around: to only enable the mDNS mode > if resolver != 127.0.0.1. But on the topic of DNSSEC, that doesn't really > make sense in the context of mDNS because the names aren't globally unique > and signed. 
In theory you could exchange DNSSEC keys out of band and use > DNSSEC with mDNS, but I've never heard of anyone doing that. At that point > people exchange TLS certificates out of band and use mTLS. But overall I > can't argue that overloading configs to mean multiple things is janky :-) > > > > > When you do that, how do you control which interface(s) it goes over? >> > > > > I think that's an important missing ingredient. >> > > > >> > > > You're absolutely right. In IPv4, sending to a link-local multicast >> > > address >> > > > like this will send it over the IPv4 default route interface. In >> IPv6, >> > > the >> > > > interface needs to be specified in the scope_id. So we'd need to >> pull >> > > that >> > > > out of the kernel with rtnetlink. >> > > >> > > There's already code to enumerate interfaces, but it's a decent bit of >> > > additional machinery to pull in as a dep for the stub resolver, >> > >> > >> > Yeah we'd need lookup_name.c to include netlink.h - it's not huge >> though, >> > netlink.c is 50 lines long and statically linked anyway right? >> >> I was thinking in terms of using if_nameindex or something, but indeed >> that's not desirable because it's allocating. So it looks like it >> wouldn't share code but use netlink.c directly if it were done this >> way. >> >> BTW if there's a legacy ioctl that tells you the number of interfaces >> (scope_ids), it seems like you could just iterate over the whole >> numeric range without actually doing netlink enumeration. >> > > That would also work. The main limitation I was working around was that > you can only pass MAXNS (3) name servers around without making more > changes. > > > > and >> > > it's not clear how to do it properly for IPv4 (do scope ids work with >> > > v4-mapped addresses by any chance?) >> > > >> > >> > Scope IDs unfortunately don't work for IPv4. There's the SO_BINDTODEVICE >> > socket option, but that requires elevated privileges. 
For IPv4 I'd just >> use >> > the default route interface. >> >> But the default route interface is almost surely *not* the LAN where >> you expect .local things to live except in the case where there is >> only one interface. If you have a network that's segmented into >> separate LAN and outgoing interfaces, the LAN, not the route to the >> public internet, is where you would want mDNS going. >> > > In the case of a router, definitely. In the case of most end hosts or VMs > though, they often have only one or two routable interfaces, and the > default route is also the LAN. > > With that said, SO_BINDTODEVICE is not the standard way to do this, >> and the correct/standard way doesn't need root. What it does need is >> binding to the local address on each device, which is still rather >> undesirable because it means you need N sockets for N interfaces, >> rather than one socket that can send/receive all addresses. >> > > Oh you're absolutely right, I knew there was a non-privileged way to do > this but couldn't remember it earlier. > > This is giving me an idea though: we could use the "connect UDP socket to > get a route lookup" trick. Let's say we're configured with a nameserver > that's not 127.0.0.1 (which is the case where I'd like to enable this) > let's say the nameserver is set to 192.0.2.33, then today foobar.local > would be sent to 192.0.2.33 over whichever interface has a route to it (in > most cases the default interface, but not always). We could open an > AF_INET/SOCK_DGRAM socket, connect it to 192.0.2.33:53, and then > use getsockname to get the local address - we then close that socket. We > can then create a new socket, bind it to that local address. That would > ensure that we send the mDNS traffic on the same interface where we would > have sent the unicast query. Downside is that since all queries share the > same socket, we'd bind everything to the interface of the first resolver, > or need multiple sockets. 
> > You answered for v4, but are you sure scope ids don't work for >> *v4-mapped*? That is, IPv6 addresses of the form >> ::ffff:aaa.bbb.ccc.ddd. I guess I could check this. I'm not very >> hopeful, but it would be excellent if this worked to send v4 multicast >> to a particular interface. >> > > Huh I hadn't thought of that, worth a try? RFC 4007 doesn't really allow > using scope IDs for globally routable addresses but I'm not sure if Linux > does. > > > Another issue you haven't mentioned: how does TCP fallback work with >> > > mDNS? Or are answers too large for standard UDP replies just illegal? >> > > >> > >> > Good point, I hadn't thought of that. That handling for mDNS is defined >> in >> > [1]. In the ephemeral query mode that we'd use here, it works the same >> as >> > for regular DNS: when you receive a response with the TC bit, retry the >> > query with TCP. The slight difference is that you send the TCP to the >> > address you got the response from (not to the multicast address that you >> > sent the original query to). From looking at the musl code, we'd need a >> > small tweak to __res_msend_rc() to use that address. Luckily that code >> > already looks at the sender address so we don't need any additional >> calls >> > to get it. >> >> Yes, that's what I figured you might do. I guess that works reasonably >> well. >> >> > > > Reason for that is that that is the most generic way to support any >> > > > > other name service besides DNS. It avoids the dependency on >> dynamic >> > > > > loading that something like glibc's nsswitch would create, and >> would >> > > > > avoid having multiple backends in libc. I really don't think >> anyone >> > > > > wants to open that particular door. Once mDNS is in there, >> someone will >> > > > > add NetBIOS, just you wait. >> > > > >> > > > >> > > > I'm definitely supportive of the slippery slope argument, but I >> think >> > > > there's still a real line between mDNS and NetBIOS. 
mDNS uses a >> different >> > > > transport but lives inside the DNS namespace, whereas NetBIOS is >> really >> > > its >> > > > own thing - NetBIOS names aren't valid DNS hostnames. >> > > > >> > > > Let me know what you think of the above. If you think of mDNS as >> its own >> > > > beast then I can see how including it wouldn't really make sense. >> But if >> > > > you see it as an actual part of the DNS, then it might be worth a >> small >> > > > code change :-) >> > > >> > > I'm not worried about slippery slopes to NetBIOS. :-P I am concerned >> > > about unwanted network traffic that can't be suppressed, privacy >> > > leaks, inventing new configuration knobs, potentially pulling in more >> > > code & more fragility, getting stuck supporting something that turns >> > > out to have hidden problems we haven't thought about, etc. >> > > >> > >> > Those are great reasons, and I totally agree with those goals. If we >> scope >> > the problem down with the details higher up in this email, we have a >> way to >> > turn this off (set the resolver to localhost), we avoid privacy leaks in >> > cases where the traffic wasn't going out in the first place, we don't >> have >> > to add more configuration knobs because we're reusing an existing one, >> and >> >> As mentioned above, I don't think "reusing an existing one" is an >> improvement. >> > > Fair, my goal was minimizing change size, but that's not the only goal. > > > the amount of added code would be quite small. Limiting things to the >> > default interface isn't a full multi-network solution, but for those I >> > think it makes more sense to recommend running your own resolver on >> > loopback (you'd need elevated privileges to make this work fully >> anyway). >> > Coding wise, I think this would be pretty robust. The only breakage I >> > foresee is cases where someone built a custom resolver that runs on a >> > different machine and somehow handles .local differently than what the >> RFCs >> > say. 
That config sounds like a bad idea, and a violation of the RFCs, >> but >> > that doesn't mean there isn't someone somewhere who's doing it. So >> there's >> > a non-zero risk there. But to me that's manageable risk. >> > >> > What do you think? >> >> I think a more reasonable approach might be requiring an explicit knob >> to enable mDNS, in the form of an options field like ndots, timeout, >> retries, etc. in resolv.conf. This ensures that it doesn't become >> attack surface/change-of-behavior in network environments where peers >> are not supposed to be able to define network names. >> > > That would work. I'm not sure who maintains the list of options though. > From a quick search it looks like they came out of 4.3BSD like many > networking features, but it's unclear if POSIX owns it or just no one does > (which would be the same, POSIX is not around as a standard body any more). > > One further advantage of such an approach is that it could also solve >> the "which interface(s)" problem by letting the answer just be >> "whichever one(s) the user configured" (with the default list being >> empty). That way we wouldn't even need netlink, just if_nametoindex to >> convert interface name strings to scope ids, or alternatively (does >> this work for v6 in the absence of an explicit scope_id?) one or more >> local addresses to bind and send from. >> > > I definitely would avoid putting local addresses in the config, because it > would break for any non-static addresses like DHCP or v6 RAs. The interface > name would require walking the getifaddrs list to map it to a corresponding > source address but it would work if the interface name is stable. 
> > I guess we're looking at two ways to go about this: > > (1) the simpler but less clean option - where we key off of "resolver != > 127.0.0.1" - very limited code size change, but only handles a small subset > of scenarios > > (2) the cleaner option that involves more work - new config option, need > multiple sockets - would be cleaner design-wise, but would change quite a > bit more code > > Another aspect to consider is the fact that in a lot of cases resolv.conf > is overwritten by various components like NetworkManager, so we'd need to > modify them to also understand the option. > > I'm always in favor of doing the right thing, unless the right thing ends > up being so much effort that it doesn't happen. Then I'm a fan of doing the > easy thing ;-) > > David >