From: gtaylor at tnetconsulting.net (Grant Taylor)
Date: Mon, 21 Sep 2020 18:19:18 -0600
Subject: [COFF] A little networking tool to reduce having to run emulators with privilege
In-Reply-To: <20200921213834.KaMCt%steffen@sdaoden.eu>
References: <23BB3E13-7306-4BB6-9566-DF4C61DE9799@gmail.com>
 <20200921213834.KaMCt%steffen@sdaoden.eu>
Message-ID: <1dbc110c-8844-040d-a08d-07914094b47f@spamtrap.tnetconsulting.net>

On 9/21/20 3:38 PM, Steffen Nurpmeso wrote:
> Bridges usually do not work with wireless interfaces, it need some
> v?eth.

Is it the bridge that's the problem or is it the wireless interface
that's the problem?

My understanding is that some (many?) wireless interfaces are funky
regarding multiple MAC addresses. Mostly in that they don't work with
extra MAC addresses.

What would you do with veth interfaces in this context?

> And those br* tools are not everywhere, too (grr).

I've been able to use ip to do much of what I used to do with brctl.

   ip link add bri0 type bridge
   ip link set eth1 master bri0

(from memory)

> Have you ever considered network namespaces?

What would network namespaces provide in this context?

Would you run the $EMULATOR in the network namespace (~container)?

What does that get you that running the $EMULATOR in the main / root /
unnamed network namespace does not get you? -- If anything, I'd think
that running the $EMULATOR in the network namespace would add
additional networking complexity.

Don't get me wrong, I'm all for network namespaces and doing fun(ky)
things with them. I've emulated entire corporate networks with network
namespaces. I've currently got nine of them on the system I'm replying
from, with dynamic routing.

> After over a year of using proxy_arp based pseudo bridging (cool!)

I'll argue that Proxy ARP /is/ a form of routing. ;-)

Your system replies to ARP requests for the IP(s) behind it and the
packets are sent to it as if it's a router. }:-)

> i finally wrapped my head around veth,

I find veth to be quite helpful.

{MAC,IP}{VLAN,VTAP} are also fun.

Aside: {MAC,IP}{VLAN,VTAP} are slightly more difficult to get working.
In my experience, {MAC,IP}{VLAN,VTAP} don't support talking to the host
directly; instead you need to create an additional {MAC,IP}{VLAN,VTAP}
and have the host use it as if it is its own guest.

> with it and Linux network namespaces i loose 40 percent ping response
> speed, but have a drastically reduced need for configuration.

I've never noticed any sort of increased latency worth mentioning.

> What i have is this, maybe you find it useful.  It does not need
> any firewall rules.  (Except allowing 10.0.0.0/8.)
>
> In my net-qos.sh (which is my shared-everywhere firewall and tc
> script)
>
>   vm_ns_start() {
>      #net.ipv4.conf.all.arp_ignore=0
>      sysctl -w \
>         net.ipv4.ip_forward=1
>
>      ${ip} link add v_n type veth peer name v_i
>      ${ip} netns add v_ns
>      ${ip} link set v_i netns v_ns

If you create the netns first, then you can have the veth interface
created and moved into the network namespace in one command.

   ${ip} link add v_n type veth peer name v_i netns v_ns

Note: This does create the v_i veth interface in the network namespace
that you're running the command in and then automatically move it for
you.

>      ${ip} a add 10.0.0.1/8 dev v_n
>      ${ip} link set v_n up
>      ${ip} route add 10.0.0.1 dev v_n

Why are you adding a (host) route to 10.0.0.1 when it's part of
10.0.0.0/8 which is going out the same interface?
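What I mean, as a from-memory sketch reusing your v_n / v_i names from
above, is that the kernel installs the covering connected route on its
own as soon as the /8 address is on the interface:

   ip link add v_n type veth peer name v_i
   ip addr add 10.0.0.1/8 dev v_n
   ip link set v_n up
   # The connected route for 10.0.0.0/8 via v_n is created automatically
   # once the address is assigned and the link is up, so the explicit
   # host route to 10.0.0.1 shouldn't buy you anything.
   ip route show dev v_n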
>      ${ip} netns exec v_ns ${ip} link set lo up
>      #if [ -z "$BR" ]; then
>      #  ${ip} netns exec v_ns ip addr add 10.1.0.1/8 dev v_i broadcast +
>      #  ${ip} netns exec v_ns ip link set v_i up
>      #  ${ip} netns exec v_ns ip route add default via 10.0.0.1
>      #else
>         ${ip} netns exec v_ns ${ip} link set v_i up
>         ${ip} netns exec v_ns ${ip} link add v_br type bridge

Why are you adding a bridge inside the v_ns network namespace?

>         ${ip} netns exec v_ns ${ip} addr add 10.1.0.1/8 dev v_br broadcast +
>         ${ip} netns exec v_ns ${ip} link set v_br up
>         ${ip} netns exec v_ns ${ip} link set v_i master v_br

Why are you adding the v_i interface to the bridge /inside/ the network
namespace?

>         ${ip} netns exec v_ns ${ip} route add default via 10.0.0.1
>      #fi
>   }

What does creating a bridge with a single interface /inside/ of the
network namespace get you?

I would have assumed that you were creating the bridge outside the
network namespace and adding the network namespace's outside veth to
said bridge.

>   vm_ns_stop() {
>      ${ip} netns del v_ns
>
> ^ That easy it is!

Yep. I've done a LOT of things like that. Though I have the bridge
outside.

>      #net.ipv4.conf.all.arp_ignore=1

What was (historically, since it's commented out) the purpose of
setting arp_ignore to 1?

>      sysctl -w \
>         net.ipv4.ip_forward=0
>   }
>
> And then, in my /x/vm directory the qemu .ifup.sh script
>
>   #!/bin/sh -
>
>   if [ "$VMNETMODE" = bridge ]; then
>      ip link set dev $1 master v_br

This is more what I would expect.

>      ip link set $1 up
>   elif [ "$VMNETMODE" = proxy_arp ]; then
>      echo 1 > /proc/sys/net/ipv4/conf/$1/proxy_arp
>      ip link set $1 up
>      ip route add $VMADDR dev $1

I guess the route is because you're using Proxy ARP.

That makes me ask, is the 10.0.0.0/8 network also used on the outside
home LAN?

>   else
>      echo >&2 Unknown VMNETMODE=$VMNETMODE
>   fi

;-)

> Of course qemu creates the actual device for me here.
> The .ifdown.sh script i omit, it is not used in this "vbridge"
> mode.  It would do nothing really, and it cannot be called because
> i now can chroot into /x/vm (needs dev/u?random due to libcrypt
> needing it though it would not need them, but i cannot help it).

You can bind mount /dev into the chroot. That way you could chroot in.
Much like Gentoo does during installation.

> This then gets driven by a .run.sh script (which is called by the
> real per-VM scripts, like
>
>   #!/bin/sh -
>   # root.alp-2020, steffen: Sway
>
>   debug=
>   vmsys=x86_64
>   vmname=alp-2020
>   vmimg=.alp-2020-amd64.vmdk
>   vmpower=half
>   vmmac=52:54:45:01:00:12
>   vmcustom= #'-boot menu=on -cdrom /x/iso/alpine-virt-3.12.0-x86_64.iso'
>
>   . /x/vm/.run.sh
>   # s-sh-mode
>
> so, and finally invokes qemu like so
>
>   echo 'Monitor at '$0' monitor'
>   eval exec $sudo /bin/ip netns exec v_ns /usr/bin/qemu-system-$vmsys \
>      -name $VMNAME $runas $chroot \
>      $host $accel $vmdisp $net $usb $vmrng $vmcustom \
>      -monitor telnet:127.0.0.1:$monport,server,nowait \
>      -drive file=$vmimg,index=0,if=ide$drivecache \
>      $redir
>
> Users in the vm group may use that sudo, qemu is executed in the
> v_ns network namespace under runas='-runas vm' and jailed via
> chroot='-chroot .'.  It surely could be more sophisticated, more
> cgroups, whatever.  Good enough for me.  :-)

Have you spent any time looking at unshare and / or nsenter?

   # Spawn the lab# NetNSs and set its hostname.
   unshare --mount=/run/mountns/${1} --net=/run/netns/${1} --uts=/run/utsns/${1} /bin/hostname ${1}

   # Bring up the loopback interface.
   nsenter --mount=/run/mountns/${1} --net=/run/netns/${1} --uts=/run/utsns/${1} /bin/ip link set dev lo up
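The only other housekeeping those two lines need is managing the bind
mount target files. Roughly, again from memory, so treat it as a sketch
rather than a recipe (s/lab1/whatever name you pass in as ${1}/):

   # Prep: the persistence files have to exist before unshare can
   # bind mount the new namespaces onto them.
   mkdir -p /run/mountns /run/netns /run/utsns
   touch /run/mountns/lab1 /run/netns/lab1 /run/utsns/lab1

   # Teardown: detach the bind mounts, then remove the files.
   umount /run/mountns/lab1 /run/netns/lab1 /run/utsns/lab1
   rm /run/mountns/lab1 /run/netns/lab1 /run/utsns/lab1

IIRC persisting the /mount/ namespace also wants the bind mount target
to live on a private (non-shared) mount, or the bind mount fails.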
I use the mount, net, and uts namespaces. RTFM for more details on
different combinations of namespaces.

> That .run.sh does enter
>
>   if [ "$1" = monitor ]; then
>      echo 'Entering monitor of '$VMNAME' ('$VMADDR') at '$monport
>      eval exec $sudo /bin/ip netns exec v_ns telnet localhost $monport
>      exit 5
>
> and enters via ssh
>
>   elif [ "$1" = ssh ]; then
>      echo 'SSH into '$VMNAME' ('$VMADDR')'
>      doex=exec
>      if command -v tmux >/dev/null 2>&1 && [ -n "$TMUX_PANE" ]; then
>         tmux set window-active-style bg=colour231,fg=colour0
>         doex=
>      fi
>      ( eval $doex ssh $VMADDR )
>      exec tmux set window-active-style bg=default,fg=default
>      exit 5
>
> for me.  (I use VMs in Donald Knuth emacs colour scheme it seems,
> at least more or less.  VMs here, VM there.  Hm.)

... VM everywhere.

But ... are they /really/ VMs? You're running an /emulator/* in a
(home grown) /container/. }:-)

*Okay. QEMU can be more of a VM than an emulator, depending on command
line options.

> Overall this network namespace thing is pretty cool.  Especially
> since, compared to FreeBSD jails, for example, you simply can run
> a single command.  Unfair comparison though.  WHat i'd really
> wish would be a system which is totally embedded in that
> namespace/jail idea.  I.e., _one_ /, and then only moving targets
> mounted via overlayfs into "per-jail" directories.  Never found
> time nor motivation to truly try this out.

I'm not that familiar with jails. But I'm convinced that there are
some possibilities with namespaces (containers) that may come close.

Network namespaces (ip netns ...) don't alter the mount namespace.
That's one of the advantages of unshare / nsenter. You can create new
namespace (container) specific mount configurations.

-- 
Grant. . . .
unix || die