mailing list of musl libc
* mDNS and alternate hostname database backends
@ 2014-12-15 10:39 Brad Conroy
  2014-12-17  7:00 ` Rich Felker
From: Brad Conroy @ 2014-12-15 10:39 UTC
  To: musl

I've been looking into a simplified DNS caching mechanism that uses the
filesystem as the "database" and came across this from the wiki:

> The inability to use mDNS (a multicast-DNS-based zero config system) with musl
> has been raised as an issue by users in the past. On glibc, using mDNS is
> accomplished with NSS; obviously musl does not have (or want) NSS.
>
> In principle, however, musl is fully extensible to use alternate hostname
> database backends in place of normal DNS. All that's needed is a daemon that
> runs on localhost, speaks DNS, and translates the requests to whatever backend
> is needed. However it's unclear whether there are any existing tools of this
> form. Developing one, adapting an existing DNS proxy program, or documenting
> how to setup an existing program that's already capable could be a nice future
> project.

My idea is much simpler: store the data in files named by the hostname (in /tmp ?):
/tmp/hosts/a for ipv4  (limit to 15/host so they can be stored in the inode*)
/tmp/hosts/aaaa for ipv6 (limit to 3/host *)
* of the filesystems capable of inlining data, ext4 has the lowest at 60 bytes.
This means we can just read/write an array of uint32_t for ipv4 and of 128-bit
values (e.g. struct in6_addr, since C has no standard uint128_t) for ipv6
with something like:

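	#include <fcntl.h>
	#include <unistd.h>

	/* Read up to len bytes from path; returns bytes read or -1 on error. */
	static ssize_t get_value(const char *path, void *buf,size_t len){
		int fd = open(path, O_RDONLY);
		if (fd<0) return -1;
		ssize_t n=read(fd,buf,len);
		close(fd);
		return n;
	}
	/* Replace the contents of path; returns bytes written or -1 on error. */
	static ssize_t set_value(const char *path, const void *buf,size_t len){
		/* O_CREAT requires the mode argument */
		int fd = open(path, O_CREAT|O_WRONLY|O_TRUNC, 0644);
		if (fd<0) return -1;
		ssize_t n=write(fd,buf,len);
		close(fd);
		return n;
	}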

The existing /etc/hosts* system doesn't account for TTL, but with the filesystem
we can hack this feature in pretty simply by adding the TTL to the modification
time.

	struct stat st;
	stat(path,&st);	/* fetch the current times first */
	struct utimbuf ut={.actime=st.st_atime, .modtime=ttl+st.st_mtime};
	utime(path,&ut);

Note: I chose mod time for TTL since a file system may be mounted noatime
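
To make the TTL idea a bit more concrete, here is a rough sketch that writes
an ipv4 record and then checks its freshness when reading it back. The names
store_a, load_a and MAX_A are made up for this example, and treating a record
as expired once its modification time is no longer in the future is my
assumption about how a reader would interpret the timestamp.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>
#include <utime.h>

#define MAX_A 15 /* 15 * 4 = 60 bytes, the ext4 inline limit mentioned above */

/* Hypothetical helper: write n ipv4 addresses to path, then move the
 * modification time ttl seconds past the time of the write. */
static int store_a(const char *path, const uint32_t *addrs, size_t n, time_t ttl)
{
	struct stat st;
	assert(n <= MAX_A);
	int fd = open(path, O_CREAT|O_WRONLY|O_TRUNC, 0644);
	if (fd < 0) return -1;
	ssize_t w = write(fd, addrs, n * sizeof *addrs);
	int r = fstat(fd, &st);
	close(fd);
	if (w != (ssize_t)(n * sizeof *addrs) || r < 0) return -1;
	struct utimbuf ut = { .actime = st.st_atime, .modtime = st.st_mtime + ttl };
	return utime(path, &ut);
}

/* Hypothetical helper: read up to MAX_A addresses into addrs. Returns the
 * number of addresses, 0 if the record has expired, or -1 on error. */
static int load_a(const char *path, uint32_t *addrs)
{
	struct stat st;
	int fd = open(path, O_RDONLY);
	if (fd < 0) return -1;
	ssize_t r = fstat(fd, &st) < 0 ? -1 : read(fd, addrs, MAX_A * sizeof *addrs);
	close(fd);
	if (r < 0) return -1;
	if (st.st_mtime <= time(NULL)) return 0; /* TTL has elapsed */
	return (int)((size_t)r / sizeof *addrs);
}
```

A caller that gets 0 or -1 back from load_a would fall through to a real DNS
query and then call store_a with the TTL from the answer.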

Initialization:
if /tmp/hosts/a (or aaaa for ipv6) does not exist
   1. mkdir
   2. read in /etc/hosts to our format
       a.) for 0.0.0.0 and 127.0.0.1 and their matching ipv6 counterparts :: and ::1,
            create hard links to NULL and localhost respectively
       b.) similarly create hard links for aliases. for example:

/etc/hosts|  74.125.225.134 www.google.com google.com www.bing.com bing.com

/tmp/hosts/a/www.google.com will contain a uint32_t representing 74.125.225.134
with a modification time set to INT_MAX
  /tmp/hosts/a/google.com  --hardlink--> /tmp/hosts/a/www.google.com
  /tmp/hosts/a/www.bing.com --hardlink--> /tmp/hosts/a/www.google.com
  /tmp/hosts/a/bing.com --hardlink--> /tmp/hosts/a/www.google.com
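
The import step for one such line could look roughly like the sketch below.
import_hosts_line is a made-up name, storing the address in network byte
order (as inet_pton produces it) is an assumption, and the special handling of
NULL/localhost and the INT_MAX modification time are left out to keep it short.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: import one ipv4 line of /etc/hosts into dir.
 * The first name gets a file holding the 4-byte address; every further
 * name becomes a hard link to that file.
 * Returns the number of names imported, or -1 on error. */
static int import_hosts_line(const char *dir, const char *line)
{
	char buf[1024], canon[PATH_MAX], path[PATH_MAX];
	char *save, *tok;
	struct in_addr addr;
	int count = 0;

	assert(sizeof addr == 4); /* the on-disk format relies on this */
	if (strlen(line) >= sizeof buf) return -1;
	strcpy(buf, line);
	tok = strtok_r(buf, " \t\n", &save);
	if (!tok || inet_pton(AF_INET, tok, &addr) != 1) return -1;
	if (mkdir(dir, 0755) < 0 && errno != EEXIST) return -1;

	while ((tok = strtok_r(NULL, " \t\n", &save))) {
		if (*tok == '#') break;           /* rest of the line is a comment */
		if (strchr(tok, '/')) return -1;  /* a name must stay inside dir */
		int n = snprintf(path, sizeof path, "%s/%s", dir, tok);
		if (n < 0 || (size_t)n >= sizeof path) return -1;
		if (count == 0) {
			int fd = open(path, O_CREAT|O_WRONLY|O_TRUNC, 0644);
			if (fd < 0) return -1;
			ssize_t w = write(fd, &addr, sizeof addr);
			close(fd);
			if (w != (ssize_t)sizeof addr) return -1;
			strcpy(canon, path);
		} else if (link(canon, path) < 0 && errno != EEXIST) {
			return -1;
		}
		count++;
	}
	return count;
}
```

Since all four names share one inode, updating the address through any of
them updates it for the others as well.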

This seems pretty over-simplified, but it opens up some possibilities:

 1. the network functions could be much smaller and rely on a single binary to do
     all of the hard work in a unix style.  Before anyone argues that starting an
     external program takes too long, I must point out that this is typically
     insignificant compared to DNS query/response time and that keeping this
     functionality internal to the libc requires making certain tradeoffs to keep
     the overall code size and complexity down.  Other functions already call
     /bin/sh IIRC, so this isn't a huge leap. ... though all code _could_ stay in
     the libc if there is a good argument for it.
2.  sharing caches between clients now becomes as easy as using rsync or
     even tar or cpio.
3.  A cron task can replace a running daemon to periodically clean up the cache.
     If disk space is low, it can purge, or it could even systematically recheck the
     DNS, update the TTL and even ping all the entries to get a response time so
     it can sort them from fastest to slowest.
4.  Ad-blocking can be as simple as:
       cd /tmp/hosts/a;
       ln NULL pagead2.googlesyndication.com
5.  Filtering can also be accomplished using standard users/groups:
     blacklist filtering by making entries hardlinks to NULL,
     whitelist filtering by making /tmp/hosts read only.
6.  Because the cache is so simple, integrating it to work with other caching
     methods like nss/nscd, libresolvconf, dnsmasq, djbdns and bind _should_
     be fairly straightforward.

I've been working on my own rudimentary implementation to include with my
own libc.h headers (only for small, single file static apps), but I primarily use
musl, so I'd be interested in hearing any feedback, especially if there is a
possibility that it could become a standard practice.  My guess is that it has
probably already been done by Bell Labs for Plan 9.

R,
Brad Conroy



* Re: mDNS and alternate hostname database backends
  2014-12-15 10:39 mDNS and alternate hostname database backends Brad Conroy
@ 2014-12-17  7:00 ` Rich Felker
From: Rich Felker @ 2014-12-17  7:00 UTC
  To: musl

On Mon, Dec 15, 2014 at 02:39:46AM -0800, Brad Conroy wrote:
> I've been looking into using a simplified DNS caching mechanism using the file-
> system as the "database" and came across this from the wiki:
> 
> > The inability to use mDNS (a multicast-DNS-based zero config system) with musl
> > has been raised as an issue by users in the past. On glibc, using mDNS is
> > accomplished with NSS; obviously musl does not have (or want) NSS.
> >
> > In principle, however, musl is fully extensible to use alternate hostname
> > database backends in place of normal DNS. All that's needed is a daemon that
> > runs on localhost, speaks DNS, and translates the requests to whatever backend
> > is needed. However it's unclear whether there are any existing tools of this
> > form. Developing one, adapting an existing DNS proxy program, or documenting
> > how to setup an existing program that's already capable could be a nice future
> > project.
> 
> My idea is much simpler: store the data as file name by the hostname (in /tmp ?):

/tmp is most certainly the wrong place for anything like this. The
only thing that's valid to create in /tmp is random filenames, with
proper mechanisms (e.g. O_EXCL or mkdir) to avoid collisions. This is
because it's a shared namespace and anyone can create things there. A
malicious user could drop a file named /tmp/hosts before you mkdir it,
or mkdir their own directory with their own malicious entries in it.

Presumably you want something under /var/ in a directory owned by
whoever manages it, possibly with a symlink from /etc/.
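
As a rough sketch of what "owned by whoever manages it" could mean in code,
a reader or writer might refuse to use the directory unless it passes a check
like the one below. The function names are made up for illustration, and the
exact policy (real directory, expected owner, no group/other write access) is
an assumption rather than anything musl does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical check: trust dir only if it is a real directory (not a
 * symlink), owned by owner, and not writable by group or others. */
static int cache_dir_is_trusted(const char *dir, uid_t owner)
{
	struct stat st;
	assert(dir != NULL);
	if (lstat(dir, &st) < 0) return 0;
	if (!S_ISDIR(st.st_mode)) return 0;
	if (st.st_uid != owner) return 0;
	if (st.st_mode & (S_IWGRP | S_IWOTH)) return 0;
	return 1;
}

/* Hypothetical setup step for the managing user: create dir if needed,
 * then refuse to use it unless it passes the check above. */
static int prepare_cache_dir(const char *dir)
{
	if (mkdir(dir, 0755) < 0 && errno != EEXIST) return -1;
	return cache_dir_is_trusted(dir, geteuid()) ? 0 : -1;
}
```

The check uses lstat, so it is meant for the real path under /var/ and not
for a symlink pointing at it. A directory that someone else managed to create
first fails the ownership test instead of being silently reused.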


While this may be a good system, musl isn't really in the business of
imposing policy unless there's strong existing precedent. On the other
hand, the beauty of the "just run a daemon speaking DNS protocol"
approach is that you can write an utterly trivial daemon that serves
records stored in the above form over DNS protocol, and have musl's
resolver (or any other resolver) return these records with no code
changes whatsoever.
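
To give an idea of how little such a daemon needs, here is a sketch of the
core step: turning an A query into a response from addresses read out of a
record file. build_a_response is a made-up name, the limit of 15 addresses
follows the proposal above, and the code assumes a single uncompressed A/IN
question; anything else is rejected rather than answered.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: answer the single-question A/IN query in q with the
 * n addresses in addrs (4 bytes each, network byte order). Returns the
 * response length, or -1 if the query is not a plain A/IN question or the
 * response does not fit in out. */
static int build_a_response(const unsigned char *q, size_t qlen,
	const unsigned char *addrs, int n, uint32_t ttl,
	unsigned char *out, size_t outlen)
{
	size_t i = 12;                          /* fixed header size */
	assert(q && out && (addrs || n == 0));
	if (qlen < 12 || n < 0 || n > 15) return -1;
	/* require QR=0, opcode 0 and QDCOUNT=1 */
	if ((q[2] & 0xf8) || q[4] != 0 || q[5] != 1) return -1;
	while (i < qlen && q[i]) {              /* skip the QNAME labels */
		if (q[i] > 63) return -1;
		i += q[i] + 1u;
	}
	if (i + 5 > qlen) return -1;            /* root label, QTYPE, QCLASS */
	if (q[i+1] || q[i+2] != 1 || q[i+3] || q[i+4] != 1) return -1;
	i += 5;
	if (i + 16u * (size_t)n > outlen) return -1;

	memcpy(out, q, i);                      /* header and question */
	out[2] = 0x80 | (q[2] & 0x01);          /* QR=1, copy RD */
	out[3] = 0x80;                          /* RA=1, RCODE=0 */
	out[6] = 0; out[7] = (unsigned char)n;  /* ANCOUNT */
	memset(out + 8, 0, 4);                  /* NSCOUNT, ARCOUNT */
	for (int k = 0; k < n; k++, i += 16) {
		unsigned char *rr = out + i;
		rr[0] = 0xc0; rr[1] = 0x0c;         /* pointer to QNAME at offset 12 */
		rr[2] = 0; rr[3] = 1;               /* TYPE A */
		rr[4] = 0; rr[5] = 1;               /* CLASS IN */
		rr[6] = (unsigned char)(ttl >> 24);
		rr[7] = (unsigned char)(ttl >> 16);
		rr[8] = (unsigned char)(ttl >> 8);
		rr[9] = (unsigned char)ttl;
		rr[10] = 0; rr[11] = 4;             /* RDLENGTH */
		memcpy(rr + 12, addrs + 4 * k, 4);
	}
	return (int)i;
}
```

The rest of the daemon would be a UDP socket bound to localhost port 53, a
lookup of the file named by the QNAME, and a sendto of the buffer filled in
here, with resolv.conf pointing at 127.0.0.1.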

Rich

