Development discussion of WireGuard
* Wireguard uses incorrect interface - routing issue
@ 2024-06-21 11:13 Nico Schottelius
  2024-06-21 11:24 ` Nico Schottelius
  2024-06-21 14:42 ` Diyaa Alkanakre
  0 siblings, 2 replies; 7+ messages in thread
From: Nico Schottelius @ 2024-06-21 11:13 UTC
  To: WireGuard mailing list

Hello again,

I'm sorry to flood the mailing list with wireguard bugs, but it seems
there is yet another routing bug in wireguard - happy to be wrong, but
here are my findings:

a) the system has source-based routing enabled via ip rule:

[11:07] server141.place10:~# ip rule ls
0:      from all lookup local
32765:  from 192.168.1.0/24 lookup 42
32766:  from all lookup main
32767:  from all lookup default
[11:07] server141.place10:~# ip route sh table 42
194.5.220.0/24 via 192.168.1.254 dev eth1 proto bird metric 32 
194.187.90.23 via 192.168.1.254 dev eth1 proto bird metric 32 
212.103.65.231 via 192.168.1.254 dev eth1 proto bird metric 32 
[11:08] server141.place10:~# 

This should ensure that packets towards 194.187.90.23 travel via eth1.

b) tcpdump for verification

Using "tcpdump -ni any port 4000" I observe:

11:10:22.445638 eth0  Out IP 192.168.1.149.58591 > 194.187.90.23.4000: UDP, length 148
11:10:27.447026 eth0  Out IP 192.168.1.149.58591 > 194.187.90.23.4000: UDP, length 148
11:10:32.448329 eth0  Out IP 192.168.1.149.58591 > 194.187.90.23.4000: UDP, length 148
11:10:37.449719 eth0  Out IP 192.168.1.149.58591 > 194.187.90.23.4000: UDP, length 148

c) Route in main table

There is indeed a route in the main routing table that matches, too:

[11:08] server141.place10:~# ip r get 194.187.90.23
194.187.90.23 via 10.5.2.123 dev eth0 src 192.168.1.149 uid 0 
    cache 

d) ip rule not working (?)

So from what I can observe, ip rule does not work together with
wireguard: wireguard routing takes the route from the main FIB instead
of from the separate table.

I am not sure if this is related at all to the IP address binding bug,
but it appears in a similar context from our tests.

BR,

Nico

-- 
Sustainable and modern Infrastructures by ungleich.ch



* Re: Wireguard uses incorrect interface - routing issue
  2024-06-21 11:13 Wireguard uses incorrect interface - routing issue Nico Schottelius
@ 2024-06-21 11:24 ` Nico Schottelius
  2024-06-21 12:29   ` Daniel Gröber
  2024-06-21 14:42 ` Diyaa Alkanakre
  1 sibling, 1 reply; 7+ messages in thread
From: Nico Schottelius @ 2024-06-21 11:24 UTC
  To: WireGuard mailing list


p.s.: the route lookup looks correct on the machine, when selecting the
source IP:

[11:15] server141.place10:~# ip r get 194.187.90.23
194.187.90.23 via inet6 fe80::3eec:efff:fecb:d81a dev eth0 src 192.168.1.149 uid 0 
    cache 
[11:16] server141.place10:~# ip r get 194.187.90.23 from 192.168.1.149
194.187.90.23 from 192.168.1.149 via 192.168.1.254 dev eth1 table 42 uid 0 
    cache 

wireguard still uses the wrong interface:

11:20:13.115154 eth0  Out IP 192.168.1.149.60031 > 194.187.90.23.4000: UDP, length 148
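
The two lookups above can be compared mechanically. The following is a
minimal sketch, not part of the original report: the helper names
egress_dev and compare_lookup are made up for illustration, and it
assumes the iproute2 "ip route get" output format shown in this thread.

```shell
#!/bin/sh
# Print the egress device named in "ip route get" output read from stdin.
egress_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Compare the lookup without and with an explicit source address.
# Usage: compare_lookup DEST SRC
compare_lookup() {
    plain=$(ip route get "$1" | egress_dev)
    sourced=$(ip route get "$1" from "$2" | egress_dev)
    echo "no source: $plain, from $2: $sourced"
}
```

For the addresses in this report it would be called as
"compare_lookup 194.187.90.23 192.168.1.149". If the two devices differ,
the result of the lookup depends on whether a source address is already
known when the route is resolved.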


-- 
Sustainable and modern Infrastructures by ungleich.ch


* Re: Wireguard uses incorrect interface - routing issue
  2024-06-21 11:24 ` Nico Schottelius
@ 2024-06-21 12:29   ` Daniel Gröber
  2024-06-22  9:22     ` Nico Schottelius
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Gröber @ 2024-06-21 12:29 UTC
  To: Nico Schottelius; +Cc: WireGuard mailing list

On Fri, Jun 21, 2024 at 01:24:47PM +0200, Nico Schottelius wrote:
> 
> p.s.: the route lookup looks correct on the machine, when selecting the
> source IP:
> 
> [11:15] server141.place10:~# ip r get 194.187.90.23
> 194.187.90.23 via inet6 fe80::3eec:efff:fecb:d81a dev eth0 src 192.168.1.149 uid 0 
>     cache 
> [11:16] server141.place10:~# ip r get 194.187.90.23 from 192.168.1.149
> 194.187.90.23 from 192.168.1.149 via 192.168.1.254 dev eth1 table 42 uid 0 
>     cache 
> 
> wireguard still uses the wrong interface:
> 
> 11:20:13.115154 eth0  Out IP 192.168.1.149.60031 > 194.187.90.23.4000: UDP, length 148

I haven't looked at the details yet but this smells like the same route
caching issue I found a while ago:
https://lists.zx2c4.com/pipermail/wireguard/2023-July/008111.html

Does up/down'ing the interface make the problem go away? IIRC that will
re-initialize the UDP socket and thus clear the route cache.

FYI Nico: It may be time to escalate these bugs to the network subsystem
maintainers on netdev@vger.kernel.org since Jason is not reading this list
anymore AFAICT.

get_maintainer.pl spits out this list of emails to send To:

    "Jason A. Donenfeld" <Jason@zx2c4.com>,
    "David S. Miller" <davem@davemloft.net>,
    Eric Dumazet <edumazet@google.com>, 
    Jakub Kicinski <kuba@kernel.org>,
    Paolo Abeni <pabeni@redhat.com>,
    wireguard@lists.zx2c4.com, 
    netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org

Do add me to CC as well. Before sending I'd recommend working out an
ip-netns based reproducer script -- makes it harder to ignore the report as
"ugh, too much work" ;)

Let me know if you need help with that,
--Daniel
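
A reproducer along the lines suggested above might start from the
following sketch. It is an outline under stated assumptions, not a
verified reproducer: the namespace name wgrepro, the veth names, the
10.5.2.1/24 address, the wg0 interface and its 10.99.0.0/24 allowed-ips
are all hypothetical. Only the ip rule, the table 42 route, the main
table route and the endpoint are taken from the report. It needs root
and the wg tool, and whether it actually triggers the behaviour is
untested.

```shell
#!/bin/sh
# Sketch of an ip-netns based reproducer. Names and addresses marked
# "hypothetical" are for illustration only and do not come from the report.
NS=wgrepro                      # hypothetical namespace name

# Echo commands instead of running them when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

setup_routing() {
    run ip netns add "$NS"
    # Two veth pairs stand in for eth0 and eth1 of the affected host.
    run ip link add veth0 type veth peer name eth0 netns "$NS"
    run ip link add veth1 type veth peer name eth1 netns "$NS"
    run ip link set veth0 up
    run ip link set veth1 up
    run ip -n "$NS" link set lo up
    run ip -n "$NS" link set eth0 up
    run ip -n "$NS" link set eth1 up
    run ip -n "$NS" addr add 10.5.2.1/24 dev eth0        # hypothetical
    run ip -n "$NS" addr add 192.168.1.149/24 dev eth1
    # Main-table route, mirroring the "ip r get" output in the report.
    run ip -n "$NS" route add 194.187.90.23 via 10.5.2.123 dev eth0 src 192.168.1.149
    # Source-based rule and table 42 route from the report.
    run ip -n "$NS" rule add from 192.168.1.0/24 lookup 42
    run ip -n "$NS" route add 194.187.90.23 via 192.168.1.254 dev eth1 table 42
}

setup_wg() {
    keydir=$(mktemp -d)
    (umask 077; wg genkey > "$keydir/priv")
    peer_pub=$(wg genkey | wg pubkey)    # throwaway peer key
    run ip -n "$NS" link add wg0 type wireguard
    # The keepalive should make wg0 send handshake initiations on its own.
    run ip netns exec "$NS" wg set wg0 private-key "$keydir/priv" \
        peer "$peer_pub" endpoint 194.187.90.23:4000 \
        allowed-ips 10.99.0.0/24 persistent-keepalive 5
    run ip -n "$NS" link set wg0 up
}

cleanup() {
    run ip netns del "$NS"
}
```

A run would be "setup_routing; setup_wg", followed by
"ip netns exec wgrepro tcpdump -ni any udp port 4000" to see which
interface the handshake packets leave through, and "cleanup" afterwards.
Setting DRY_RUN=1 only prints the commands, which makes the script easy
to review before running it as root.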


* Re: Wireguard uses incorrect interface - routing issue
  2024-06-21 11:13 Wireguard uses incorrect interface - routing issue Nico Schottelius
  2024-06-21 11:24 ` Nico Schottelius
@ 2024-06-21 14:42 ` Diyaa Alkanakre
  2024-06-21 15:18   ` Daniel Gröber
  2024-06-21 15:38   ` Nico Schottelius
  1 sibling, 2 replies; 7+ messages in thread
From: Diyaa Alkanakre @ 2024-06-21 14:42 UTC
  To: Nico Schottelius; +Cc: WireGuard mailing list

The better approach would be to exclude the IPs from your WireGuard AllowedIPs. I always exclude IPs if I can before doing policy based routing.

https://www.procustodibus.com/blog/2021/03/wireguard-allowedips-calculator/





* Re: Wireguard uses incorrect interface - routing issue
  2024-06-21 14:42 ` Diyaa Alkanakre
@ 2024-06-21 15:18   ` Daniel Gröber
  2024-06-21 15:38   ` Nico Schottelius
  1 sibling, 0 replies; 7+ messages in thread
From: Daniel Gröber @ 2024-06-21 15:18 UTC
  To: Stephan von Krawczynski, Diyaa Alkanakre
  Cc: Nico Schottelius, WireGuard mailing list

Hi,

On Fri, Jun 21, 2024 at 03:54:39PM +0200, Stephan von Krawczynski wrote:
> ... and in case you do find someone interested at all there is still the
> problem of no signaling to anyone when a client connects.
> I hardly can remember the decade when all this was implemented in cipe.

Yeah. It can be hard to get attention on netdev, but I've been advised that
when the maintenance of a (sub)subsystem is in question, that is an issue
they'll take notice of. So be sure to lament the fact that Jason hasn't
been responding on this ML in at least a year ;)

IIRC we have a patch for netlink notifications on handshakes flying
around somewhere tho. Just needs some more work.

On Fri, Jun 21, 2024 at 04:42:02PM +0200, Diyaa Alkanakre wrote:
> The better approach would be to exclude the IPs from your WireGuard
> AllowedIPs. I always exclude IPs if I can before doing policy based
> routing.
> 
> https://www.procustodibus.com/blog/2021/03/wireguard-allowedips-calculator/

Interesting approach, thanks for the pointer :)

--Daniel


* Re: Wireguard uses incorrect interface - routing issue
  2024-06-21 14:42 ` Diyaa Alkanakre
  2024-06-21 15:18   ` Daniel Gröber
@ 2024-06-21 15:38   ` Nico Schottelius
  1 sibling, 0 replies; 7+ messages in thread
From: Nico Schottelius @ 2024-06-21 15:38 UTC
  To: Diyaa Alkanakre; +Cc: WireGuard mailing list


Diyaa,

this is about the *outside* tunnel IP address that wireguard uses to
establish the connection, not about routing inside the tunnel.

BR,

Nico


Diyaa Alkanakre <diyaa@diyaa.ca> writes:

> The better approach would be to exclude the IPs from your WireGuard AllowedIPs. I always exclude IPs if I can before doing policy based routing.
>
> https://www.procustodibus.com/blog/2021/03/wireguard-allowedips-calculator/
>
>

-- 
Sustainable and modern Infrastructures by ungleich.ch



* Re: Wireguard uses incorrect interface - routing issue
  2024-06-21 12:29   ` Daniel Gröber
@ 2024-06-22  9:22     ` Nico Schottelius
  0 siblings, 0 replies; 7+ messages in thread
From: Nico Schottelius @ 2024-06-22  9:22 UTC
  To: Daniel Gröber; +Cc: WireGuard mailing list


Good morning Daniel,

Daniel Gröber <dxld@darkboxed.org> writes:
>> wireguard still uses the wrong interface:
>> 
>> 11:20:13.115154 eth0  Out IP 192.168.1.149.60031 > 194.187.90.23.4000: UDP, length 148
>
> I haven't looked at the details yet but this smells like the same route
> caching issue I found a while ago:
> https://lists.zx2c4.com/pipermail/wireguard/2023-July/008111.html
>
> Does up/down'ing the interface make the problem go away? IIRC that will
> re-initialize the UDP socket and thus clear the route cache.

Bringing the interface down and up does *not* fix it; however, a *reboot*
did. I have the feeling that this is a race condition involving bird
running on the machine. I suspect the following is happening:

- machine starts
- ip rule is used to move traffic into table 42 (part of the container startup)
- table 42 is populated by bird with static routes (part of bird
  startup)

-- at this stage wireguard works

- bird establishes iBGP sessions and receives alternate routes for the
  target in the main routing table
- wireguard restart is triggered and from that moment on wireguard uses
  the route from the main table

-- at this stage wireguard is broken/takes the route from the main table

So far this is only a theory that I still need to verify; a simple test
script, as you suggested, probably makes sense.
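
One way to narrow down when the flip happens is to watch the capture and
flag every outgoing packet that leaves via an unexpected interface. A
minimal sketch, assuming the "tcpdump -ni any" line format shown earlier
in this thread (timestamp, interface, direction); the function name
check_capture is made up for illustration:

```shell
#!/bin/sh
# Read "tcpdump -ni any" lines on stdin and report outgoing packets that
# leave via an interface other than the expected one.
# Usage: ... | check_capture EXPECTED_DEV
# Exit status is 1 if at least one mismatch was seen, 0 otherwise.
check_capture() {
    awk -v want="$1" '
        $3 == "Out" && $2 != want {
            bad++
            print "mismatch at " $1 ": left via " $2
        }
        END { exit (bad > 0) }
    '
}
```

It could be fed from a live capture, for example
"tcpdump -lni any udp port 4000 | check_capture eth1", and left running
while bird establishes its sessions and wireguard is restarted. The
timestamp of the first mismatch then shows which step of the sequence
above it follows.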

> FYI Nico: It may be time to escalate these bugs to the network subsystem
> maintainers on netdev@vger.kernel.org since Jason is not reading this list
> anymore AFAICT.

That is a very good point and I shall do so next week!

> get_maintainer.pl spits out this list of emails to send To:
>
>     "Jason A. Donenfeld" <Jason@zx2c4.com>,
>     "David S. Miller" <davem@davemloft.net>,
>     Eric Dumazet <edumazet@google.com>, 
>     Jakub Kicinski <kuba@kernel.org>,
>     Paolo Abeni <pabeni@redhat.com>,
>     wireguard@lists.zx2c4.com, 
>     netdev@vger.kernel.org,
>     linux-kernel@vger.kernel.org

Thanks for looking that up!

> Do add me to CC as well. Before sending I'd recommend working out an
> ip-netns based reproducer script -- makes it harder to ignore the report as
> "ugh, too much work" ;)

Understood and ...


> Let me know if you need help with that,

... would certainly appreciate that.

You are on matrix, too, aren't you?
I'm @nico:ungleich.ch, might be easier for coordination.

Best regards from sunny Glarus,

Nico



-- 
Sustainable and modern Infrastructures by ungleich.ch


