Oh, one more thing: we might be able to use sendmsg and IP_PKTINFO to select the outgoing interface for each send call instead of binding and requiring multiple sockets. David On Thu, Mar 7, 2024 at 5:30 PM David Schinazi wrote: > > > On Thu, Mar 7, 2024 at 4:08 PM Rich Felker wrote: > >> On Thu, Mar 07, 2024 at 02:50:53PM -0800, David Schinazi wrote: >> > On Wed, Mar 6, 2024 at 6:42 PM Rich Felker wrote: >> > >> > > On Wed, Mar 06, 2024 at 04:17:44PM -0800, David Schinazi wrote: >> > > > As Jeffrey points out, when the IETF decided to standardize mDNS, >> they >> > > > published it (RFC 6762) at the same time as the Special-Use Domain >> > > Registry >> > > > (RFC 6761) which created a process for reserving domain names for >> custom >> > > > purposes, and ".local" was one of the initial entries into that >> registry. >> > > > The UTF-8 vs punycode issue when it comes to mDNS and DNS is >> somewhat of >> > > a >> > > > mess. It was discussed in Section 16 of RFC 6762 but at the end of >> the >> > > day >> > > > punycode won. Even Apple's implementation of getaddrinfo will >> perform >> > > > punycode conversion for .local instead of sending the UTF-8. So in >> > > practice >> > > > you wouldn't need to special-case anything here. >> > > >> > > OK, these are both really good news! >> > > >> > > > There's also very much a policy matter of what "locally over >> > > > > multicast" means (what the user wants it to mean). Which >> interfaces >> > > > > should be queried? Wired and wireless ethernet? VPN links or other >> > > > > sorts of tunnels? Just one local interface (which one to >> prioritize) >> > > > > or all of them? Only if the network is "trusted"? Etc. >> > > > > >> > > > >> > > > You're absolutely right. Most mDNS systems try all non-loopback >> non-p2p >> > > > multicast-supporting interfaces, but sending to the default route >> > > interface >> > > > would be a good start, more on that below. 
>> > > >> > > This is really one thing that suggests a need for configurability >> > > outside of what libc might be able to offer. With normal DNS lookups, >> > > they're something you can block off and prevent from going to the >> > > network at all by policy (and in fact they don't go past the loopback >> > > by default, in the absence of a resolv.conf file). Adding mDNS that's >> > > on-by-default and not configurable would make a vector for network >> > > traffic being generated that's probably not expected and that could be >> > > a privacy leak. >> > > >> > >> > Totally agree. I was thinking through this both in terms of RFCs and in >> > terms of minimal code changes, and had a potential idea. Conceptually, >> > sending DNS to localhost is musl's IPC mechanism to a more feature-rich >> > resolver running in user-space. So when that's happening, we don't want >> to >> > mess with it because that could cause a privacy leak. Conversely, when >> > there's a non-loopback IP configured in resolv.conf, then musl acts as a >> > DNS stub resolver and the server in resolv.conf acts as a DNS recursive >> > resolver. In that scenario, sending the .local query over DNS to that >> other >> > host violates the RFCs. This allows us to treat the configured resolver >> > address as an implicit configuration mechanism that allows us to >> > selectively enable this without impacting anyone doing their own DNS >> > locally. >> >> This sounds like an odd overloading of one thing to have a very >> different meaning, and would break builtin mDNS for anyone doing >> DNSSEC right (which requires validating nameserver on localhost). >> Inventing a knob that's an overload of an existing knob is still >> inventing a knob, just worse. >> > > Sorry, I was suggesting the other way around: to only enable the mDNS mode > if resolver != 127.0.0.1. But on the topic of DNSSEC, that doesn't really > make sense in the context of mDNS because the names aren't globally unique > and signed. 
In theory you could exchange DNSSEC keys out of band and use > DNSSEC with mDNS, but I've never heard of anyone doing that. At that point > people exchange TLS certificates out of band and use mTLS. But overall I > can't argue that overloading configs to mean multiple things is janky :-) > > > > > When you do that, how do you control which interface(s) it goes over? >> > > > > I think that's an important missing ingredient. >> > > > >> > > > You're absolutely right. In IPv4, sending to a link-local multicast >> > > address >> > > > like this will send it over the IPv4 default route interface. In >> IPv6, >> > > the >> > > > interface needs to be specified in the scope_id. So we'd need to >> pull >> > > that >> > > > out of the kernel with rtnetlink. >> > > >> > > There's already code to enumerate interfaces, but it's a decent bit of >> > > additional machinery to pull in as a dep for the stub resolver, >> > >> > >> > Yeah we'd need lookup_name.c to include netlink.h - it's not huge >> though, >> > netlink.c is 50 lines long and statically linked anyway right? >> >> I was thinking in terms of using if_nameindex or something, but indeed >> that's not desirable because it's allocating. So it looks like it >> wouldn't share code but use netlink.c directly if it were done this >> way. >> >> BTW if there's a legacy ioctl that tells you the number of interfaces >> (scope_ids), it seems like you could just iterate over the whole >> numeric range without actually doing netlink enumeration. >> > > That would also work. The main limitation I was working around was that > you can only pass MAXNS (3) name servers around without making more > changes. > > > > and >> > > it's not clear how to do it properly for IPv4 (do scope ids work with >> > > v4-mapped addresses by any chance?) >> > > >> > >> > Scope IDs unfortunately don't work for IPv4. There's the SO_BINDTODEVICE >> > socket option, but that requires elevated privileges. 
For IPv4 I'd just >> use >> > the default route interface. >> >> But the default route interface is almost surely *not* the LAN where >> you expect .local things to live except in the case where there is >> only one interface. If you have a network that's segmented into >> separate LAN and outgoing interfaces, the LAN, not the route to the >> public internet, is where you would want mDNS going. >> > > In the case of a router, definitely. In the case of most end hosts or VMs > though, they often have only one or two routable interfaces, and the > default route is also the LAN. > > With that said, SO_BINDTODEVICE is not the standard way to do this, >> and the correct/standard way doesn't need root. What it does need is >> binding to the local address on each device, which is still rather >> undesirable because it means you need N sockets for N interfaces, >> rather than one socket that can send/receive all addresses. >> > > Oh you're absolutely right, I knew there was a non-privileged way to do > this but couldn't remember it earlier. > > This is giving me an idea though: we could use the "connect UDP socket to > get a route lookup" trick. Let's say we're configured with a nameserver > that's not 127.0.0.1 (which is the case where I'd like to enable this) > let's say the nameserver is set to 192.0.2.33, then today foobar.local > would be sent to 192.0.2.33 over whichever interface has a route to it (in > most cases the default interface, but not always). We could open an > AF_INET/SOCK_DGRAM socket, connect it to 192.0.2.33:53, and then > use getsockname to get the local address - we then close that socket. We > can then create a new socket, bind it to that local address. That would > ensure that we send the mDNS traffic on the same interface where we would > have sent the unicast query. Downside is that since all queries share the > same socket, we'd bind everything to the interface of the first resolver, > or need multiple sockets. 
> > You answered for v4, but are you sure scope ids don't work for >> *v4-mapped*? That is, IPv6 addresses of the form >> ::ffff:aaa.bbb.ccc.ddd. I guess I could check this. I'm not very >> hopeful, but it would be excellent if this worked to send v4 multicast >> to a particular interface. >> > > Huh I hadn't thought of that, worth a try? RFC 4007 doesn't really allow > using scope IDs for globally routable addresses but I'm not sure if Linux > does. > > > Another issue you haven't mentioned: how does TCP fallback work with >> > > mDNS? Or are answers too large for standard UDP replies just illegal? >> > > >> > >> > Good point, I hadn't thought of that. That handling for mDNS is defined >> in >> > [1]. In the ephemeral query mode that we'd use here, it works the same >> as >> > for regular DNS: when you receive a response with the TC bit, retry the >> > query with TCP. The slight difference is that you send the TCP to the >> > address you got the response from (not to the multicast address that you >> > sent the original query to). From looking at the musl code, we'd need a >> > small tweak to __res_msend_rc() to use that address. Luckily that code >> > already looks at the sender address so we don't need any additional >> calls >> > to get it. >> >> Yes, that's what I figured you might do. I guess that works reasonably >> well. >> >> > > > Reason for that is that that is the most generic way to support any >> > > > > other name service besides DNS. It avoids the dependency on >> dynamic >> > > > > loading that something like glibc's nsswitch would create, and >> would >> > > > > avoid having multiple backends in libc. I really don't think >> anyone >> > > > > wants to open that particular door. Once mDNS is in there, >> someone will >> > > > > add NetBIOS, just you wait. >> > > > >> > > > >> > > > I'm definitely supportive of the slippery slope argument, but I >> think >> > > > there's still a real line between mDNS and NetBIOS. 
mDNS uses a >> different >> > > > transport but lives inside the DNS namespace, whereas NetBIOS is >> really >> > > its >> > > > own thing - NetBIOS names aren't valid DNS hostnames. >> > > > >> > > > Let me know what you think of the above. If you think of mDNS as >> its own >> > > > beast then I can see how including it wouldn't really make sense. >> But if >> > > > you see it as an actual part of the DNS, then it might be worth a >> small >> > > > code change :-) >> > > >> > > I'm not worried about slippery slopes to NetBIOS. :-P I am concerned >> > > about unwanted network traffic that can't be suppressed, privacy >> > > leaks, inventing new configuration knobs, potentially pulling in more >> > > code & more fragility, getting stuck supporting something that turns >> > > out to have hidden problems we haven't thought about, etc. >> > > >> > >> > Those are great reasons, and I totally agree with those goals. If we >> scope >> > the problem down with the details higher up in this email, we have a >> way to >> > turn this off (set the resolver to localhost), we avoid privacy leaks in >> > cases where the traffic wasn't going out in the first place, we don't >> have >> > to add more configuration knobs because we're reusing an existing one, >> and >> >> As mentioned above, I don't think "reusing an existing one" is an >> improvement. >> > > Fair, my goal was minimizing change size, but that's not the only goal. > > > the amount of added code would be quite small. Limiting things to the >> > default interface isn't a full multi-network solution, but for those I >> > think it makes more sense to recommend running your own resolver on >> > loopback (you'd need elevated privileges to make this work fully >> anyway). >> > Coding wise, I think this would be pretty robust. The only breakage I >> > foresee is cases where someone built a custom resolver that runs on a >> > different machine and somehow handles .local differently than what the >> RFCs >> > say. 
That config sounds like a bad idea, and a violation of the RFCs, >> but >> > that doesn't mean there isn't someone somewhere who's doing it. So >> there's >> > a non-zero risk there. But to me that's manageable risk. >> > >> > What do you think? >> >> I think a more reasonable approach might be requiring an explicit knob >> to enable mDNS, in the form of an options field like ndots, timeout, >> retries, etc. in resolv.conf. This ensures that it doesn't become >> attack surface/change-of-behavior in network environments where peers >> are not supposed to be able to define network names. >> > > That would work. I'm not sure who maintains the list of options though. > From a quick search it looks like they came out of 4.3BSD like many > networking features, but it's unclear if POSIX owns it or just no one does > (which would be the same, POSIX is not around as a standard body any more). > > One further advantage of such an approach is that it could also solve >> the "which interface(s)" problem by letting the answer just be >> "whichever one(s) the user configured" (with the default list being >> empty). That way we wouldn't even need netlink, just if_nametoindex to >> convert interface name strings to scope ids, or alternatively (does >> this work for v6 in the absence of an explicit scope_id?) one or more >> local addresses to bind and send from. >> > > I definitely would avoid putting local addresses in the config, because it > would break for any non-static addresses like DHCP or v6 RAs. The interface > name would require walking the getifaddrs list to map it to a corresponding > source address but it would work if the interface name is stable. 
> > I guess we're looking at two ways to go about this: > > (1) the simpler but less clean option - where we key off of "resolver != > 127.0.0.1" - very limited code size change, but only handles a small subset > of scenarios > > (2) the cleaner option that involves more work - new config option, need > multiple sockets - would be cleaner design-wise, but would change quite a > bit more code > > Another aspect to consider is the fact that in a lot of cases resolv.conf > is overwritten by various components like NetworkManager, so we'd need to > modify them to also understand the option. > > I'm always in favor of doing the right thing, unless the right thing ends > up being so much effort that it doesn't happen. Then I'm a fan of doing the > easy thing ;-) > > David >