* Should I expect faster recovery after one side goes down
@ 2017-11-27 9:49 Bruno Wolff III
2017-11-27 11:04 ` Jason A. Donenfeld
0 siblings, 1 reply; 11+ messages in thread
From: Bruno Wolff III @ 2017-11-27 9:49 UTC (permalink / raw)
To: WireGuard mailing list
I'm not sure what is really going on but I have seen some very long delays
after one side of the link goes down, while the other keeps sending
packets. The workaround is to restart the local side once the remote side
is back up.
When I do some testing and, say, reboot the router that a wg tunnel terminates
at, while continuing to use the laptop at the other end, after the router
is back up very little traffic seems to get through or there is a very
large latency. Restarting the iptables service with systemd will also
hang. I don't know if that is forever or just a very long time. If I restart
wireguard on the laptop (which deletes and recreates the device) things
will start working normally again.
Is there some information I can collect that will illuminate what is going
on here?
* Re: Should I expect faster recovery after one side goes down
2017-11-27 9:49 Should I expect faster recovery after one side goes down Bruno Wolff III
@ 2017-11-27 11:04 ` Jason A. Donenfeld
2017-11-27 13:49 ` Bruno Wolff III
0 siblings, 1 reply; 11+ messages in thread
From: Jason A. Donenfeld @ 2017-11-27 11:04 UTC (permalink / raw)
To: Bruno Wolff III; +Cc: WireGuard mailing list
Hi Bruno,
The first question is - how long?
Jason
* Re: Should I expect faster recovery after one side goes down
2017-11-27 11:04 ` Jason A. Donenfeld
@ 2017-11-27 13:49 ` Bruno Wolff III
2017-11-27 17:33 ` Bruno Wolff III
0 siblings, 1 reply; 11+ messages in thread
From: Bruno Wolff III @ 2017-11-27 13:49 UTC (permalink / raw)
To: Jason A. Donenfeld; +Cc: WireGuard mailing list
On Mon, Nov 27, 2017 at 12:04:06 +0100,
"Jason A. Donenfeld" <Jason@zx2c4.com> wrote:
>Hi Bruno,
>
>The first question is - how long?
For "systemctl stop iptables" I have waited around a minute before
using control C. After running "systemctl stop wireguard" or
"systemctl restart wireguard" (which will delete wg0) "systemctl stop
iptables" will run with no noticeable delay.
For network traffic, I waited around 10 minutes and things were still not
working. Web page loads would still time out after a minute or two. But
I did have a few DNS lookups succeed. I'm not sure if I did something
that allowed a value to get cached (there is a local caching resolver
on the affected machines) or if a response eventually made it through.
After "systemctl restart wireguard" things start working normally right
away. So I don't know the delay for specific traffic, but it looks to
be at least a minute for most traffic. The problem does not seem to
resolve for at least 10 minutes, though I don't think I have ever seen it
resolve on its own.
* Re: Should I expect faster recovery after one side goes down
2017-11-27 13:49 ` Bruno Wolff III
@ 2017-11-27 17:33 ` Bruno Wolff III
2017-11-27 17:36 ` Jason A. Donenfeld
0 siblings, 1 reply; 11+ messages in thread
From: Bruno Wolff III @ 2017-11-27 17:33 UTC (permalink / raw)
To: Jason A. Donenfeld; +Cc: WireGuard mailing list
On Mon, Nov 27, 2017 at 07:49:14 -0600,
Bruno Wolff III <bruno@wolff.to> wrote:
>On Mon, Nov 27, 2017 at 12:04:06 +0100,
> "Jason A. Donenfeld" <Jason@zx2c4.com> wrote:
>>Hi Bruno,
>>
>>The first question is - how long?
This might be related to the amount or type of traffic backed up. The two
machines where this was very noticeable in testing had all of their traffic
routed through the tunnel other than the encapsulating packets. (DNS traffic
gets tunnelled.) Playing with this on my work machine where only traffic
destined for a few specific hosts was tunnelled, I am finding it hard to
duplicate the problem.
* Re: Should I expect faster recovery after one side goes down
2017-11-27 17:33 ` Bruno Wolff III
@ 2017-11-27 17:36 ` Jason A. Donenfeld
2017-11-27 18:25 ` Bruno Wolff III
0 siblings, 1 reply; 11+ messages in thread
From: Jason A. Donenfeld @ 2017-11-27 17:36 UTC (permalink / raw)
To: Bruno Wolff III; +Cc: WireGuard mailing list
Hello Bruno,
That's some pretty weird behavior, and it sounds like whatever the
cause is is being obscured under layers of systemd. Perhaps come on
into #wireguard on Freenode and we can debug this in real time? I've
got a few ideas.
Jason
* Re: Should I expect faster recovery after one side goes down
2017-11-27 17:36 ` Jason A. Donenfeld
@ 2017-11-27 18:25 ` Bruno Wolff III
2017-11-28 6:13 ` Bruno Wolff III
0 siblings, 1 reply; 11+ messages in thread
From: Bruno Wolff III @ 2017-11-27 18:25 UTC (permalink / raw)
To: Jason A. Donenfeld; +Cc: WireGuard mailing list
On Mon, Nov 27, 2017 at 18:36:23 +0100,
"Jason A. Donenfeld" <Jason@zx2c4.com> wrote:
>Hello Bruno,
>
>That's some pretty weird behavior, and it sounds like whatever the
>cause is is being obscured under layers of systemd. Perhaps come on
>into #wireguard on Freenode and we can debug this in real time? I've
>got a few ideas.
I don't have my laptop with me at work and breaking the wireguard tunnel
to it will break my access to it from here. I could check configuration
from work. I can bring it with me tomorrow, otherwise I'll probably get
home too late for you tonight and I probably won't be up late enough
to catch you early tomorrow.
Probably I'll be able to reproduce the issue from work.
* Re: Should I expect faster recovery after one side goes down
2017-11-27 18:25 ` Bruno Wolff III
@ 2017-11-28 6:13 ` Bruno Wolff III
2017-11-28 6:44 ` Bruno Wolff III
0 siblings, 1 reply; 11+ messages in thread
From: Bruno Wolff III @ 2017-11-28 6:13 UTC (permalink / raw)
To: Jason A. Donenfeld; +Cc: WireGuard mailing list
I'm pretty sure I'm being bitten by firewall rules on the router. It seems to
be rejecting all of the tunnel packets, and since it has no reason to try to
connect to the laptop, the handshake never occurs again. I suspect that
normally a connection established related rule lets things through. I just
need to figure out how the start up packet is different so that it gets
through. The systemd iptables service eventually seems to stop. Probably
there is a DNS request that needs to time out.
I do some source address rewriting and it may be that the initial addresses
used for the encapsulating packets are different than the ones later.
So most likely this is all on my end and not wireguard related.
Thanks for the tcpdump suggestion. I should have tried that sooner.
* Re: Should I expect faster recovery after one side goes down
2017-11-28 6:13 ` Bruno Wolff III
@ 2017-11-28 6:44 ` Bruno Wolff III
2017-11-28 8:42 ` Bruno Wolff III
0 siblings, 1 reply; 11+ messages in thread
From: Bruno Wolff III @ 2017-11-28 6:44 UTC (permalink / raw)
To: Jason A. Donenfeld; +Cc: WireGuard mailing list
On Tue, Nov 28, 2017 at 00:13:06 -0600,
Bruno Wolff III <bruno@wolff.to> wrote:
>I do some source address rewriting and it may be that the initial
>addresses used for the encapsulating packets are different than the
>ones later.
When I'm on the local network, 192.168.6.1 gets used for the initial
source address and gets rewritten to 98.103.208.26 in order to make
the source consistent for the laptop whether or not it is on the
local network. (That way I don't need to allow connections from
192.168.6.1 somewhere else where it wouldn't be my router.) When this
happens the source port seems to normally get changed. Wireguard on the
laptop remembers the new source port and tries to keep using it after
the router is rebooted. But during the reboot the router forgets about
the port mapping so it ends up dropping the packets. It has no reason
to send packets on its own to the laptop (and wouldn't know where to
send them) so the port doesn't get corrected.
I think the correct fix is to know that if I reboot the router for testing
something, I also need to restart wireguard to make sure it is sending
data to the expected port. This isn't going to be an issue in normal
operation.
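
The failure described above can be modelled in a few lines. This is a toy Python sketch, not netfilter or WireGuard code. The class names, the listen port 51820 and the remapped port 1024 are assumptions made for illustration; only the public address comes from the message.

```python
# Toy model of the failure mode; not real netfilter or WireGuard code.
# LISTEN_PORT and REMAPPED_PORT are assumed values for illustration.
LISTEN_PORT = 51820      # port wg listens on at the router (assumed)
REMAPPED_PORT = 1024     # port that source NAT happened to pick (assumed)
PUBLIC_IP = "98.103.208.26"


class Router:
    def __init__(self):
        # external port -> local port, kept only in memory
        self.nat_table = {}

    def send_reply(self):
        """Reply to the laptop; source NAT rewrites the source port."""
        self.nat_table[REMAPPED_PORT] = LISTEN_PORT
        return (PUBLIC_IP, REMAPPED_PORT)

    def reboot(self):
        """A reboot loses all NAT state."""
        self.nat_table.clear()

    def receive(self, dst_port):
        """Return True if a packet sent to dst_port reaches wg."""
        if dst_port == LISTEN_PORT:
            return True
        return self.nat_table.get(dst_port) == LISTEN_PORT


class Laptop:
    def __init__(self):
        self.endpoint = (PUBLIC_IP, LISTEN_PORT)  # configured endpoint

    def learn_endpoint(self, source):
        # The peer remembers where it last got a packet from.
        self.endpoint = source

    def send(self, router):
        return router.receive(self.endpoint[1])


router, laptop = Router(), Laptop()
laptop.learn_endpoint(router.send_reply())
before_reboot = laptop.send(router)   # mapping exists, packet delivered
router.reboot()
after_reboot = laptop.send(router)    # stale port, packet dropped
laptop = Laptop()                     # stands in for restarting wireguard
after_restart = laptop.send(router)   # configured port again, delivered
print(before_reboot, after_reboot, after_restart)
```

In this model the laptop only recovers once its remembered endpoint is reset, which matches the observation that restarting wireguard on the laptop fixes things right away.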
* Re: Should I expect faster recovery after one side goes down
2017-11-28 6:44 ` Bruno Wolff III
@ 2017-11-28 8:42 ` Bruno Wolff III
2017-12-01 8:43 ` Baptiste Jonglez
0 siblings, 1 reply; 11+ messages in thread
From: Bruno Wolff III @ 2017-11-28 8:42 UTC (permalink / raw)
To: Jason A. Donenfeld; +Cc: WireGuard mailing list
On Tue, Nov 28, 2017 at 00:44:13 -0600,
Bruno Wolff III <bruno@wolff.to> wrote:
>
>I think the correct fix is to know if I reboot the router for testing
>something, I need to also restart wireguard to make sure it is sending
>data to the expected port. This isn't going to be an issue in normal
>operation.
I found a way to make it work more automatically. The reason the port
was getting reassigned was because the original connection packet was
being tracked and was conflicting with the source nat mapping even though
in reality the connection was the same. By putting in CT --notrack rules
I was able to block that tracking, and without the conflict the port doesn't
get remapped. I don't need tracking of the original connection for my
firewall rules so this should be OK. On testing it seems to work as
expected. Now when I reboot my router, my laptop reconnects and the wireguard
tunnel works without having to restart it.
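
One plausible reading of why the port was remapped, sketched as a toy Python model rather than real netfilter code: source NAT keeps the original port unless the rewritten tuple collides with a connection that is already tracked. The laptop address, all port numbers and the "+1" port choice are assumptions for illustration; only 192.168.6.1 and 98.103.208.26 come from the thread.

```python
# Toy model of source-NAT port selection; not real netfilter code.
# The laptop address and every port number here are assumed.
LAPTOP = ("192.168.6.50", 40000)     # hypothetical laptop address and port
PUBLIC = ("98.103.208.26", 51820)    # router's public address, assumed port
LOCAL = ("192.168.6.1", 51820)       # router's local address, assumed port


def snat(src, dst, new_ip, tracked):
    """Rewrite src to new_ip, keeping the port unless the result is in use.

    `tracked` is a set of (src, dst) tuples that connection tracking
    already knows about.
    """
    port = src[1]
    while ((new_ip, port), dst) in tracked:
        port += 1      # stand-in for "pick some other free port"
    return (new_ip, port)


# With tracking: the laptop's inbound packet created an entry whose reply
# direction is PUBLIC -> LAPTOP, so the rewritten tuple is already taken.
tracked = {(PUBLIC, LAPTOP)}
with_tracking = snat(LOCAL, LAPTOP, PUBLIC[0], tracked)

# With CT --notrack applied to those packets there is no such entry.
without_tracking = snat(LOCAL, LAPTOP, PUBLIC[0], set())

print(with_tracking)      # port differs from the listen port
print(without_tracking)   # port is preserved
```

When the port is preserved, the laptop's remembered endpoint stays equal to the router's listen port, so nothing is lost when the router's NAT state disappears on reboot.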
* Re: Should I expect faster recovery after one side goes down
2017-11-28 8:42 ` Bruno Wolff III
@ 2017-12-01 8:43 ` Baptiste Jonglez
2017-12-01 17:02 ` Bruno Wolff III
0 siblings, 1 reply; 11+ messages in thread
From: Baptiste Jonglez @ 2017-12-01 8:43 UTC (permalink / raw)
To: wireguard
Hi,
On 28-11-17, Bruno Wolff III wrote:
> On Tue, Nov 28, 2017 at 00:44:13 -0600,
> Bruno Wolff III <bruno@wolff.to> wrote:
> >
> >I think the correct fix is to know if I reboot the router for testing
> >something, I need to also restart wireguard to make sure it is sending
> >data to the expected port. This isn't going to be an issue in normal
> >operation.
It sounds like one of these situations where persistent keepalives would
be useful, doesn't it?
This way the laptop would create a new binding in your firewall.
> I found a way to make it work more automatically. The reason the port was
> getting reassigned was because the original connection packet was being
> tracked and was conflicting with the source nat mapping even though in
> reallity the connection was the same. By putting in CT --notrack rules I was
> able to block that traking and without the conflict the port doesn't get
> remapped. I don't need tracking or the original connection for my firewall
> rules so this should be OK. On testing it seems to work as expected. Now
> when I reboot my router, my laptop reconnects and the wireguard tunnel works
> without having to restart it.
> _______________________________________________
> WireGuard mailing list
> WireGuard@lists.zx2c4.com
> https://lists.zx2c4.com/mailman/listinfo/wireguard
* Re: Should I expect faster recovery after one side goes down
2017-12-01 8:43 ` Baptiste Jonglez
@ 2017-12-01 17:02 ` Bruno Wolff III
0 siblings, 0 replies; 11+ messages in thread
From: Bruno Wolff III @ 2017-12-01 17:02 UTC (permalink / raw)
To: Baptiste Jonglez; +Cc: wireguard
On Fri, Dec 01, 2017 at 09:43:19 +0100,
Baptiste Jonglez <baptiste@bitsofnetworks.org> wrote:
>
>It sounds like one of these situations where persistent keepalives would
>be useful, doesn't it?
It is definitely useful as the laptop is expected to be behind NAT, but it
doesn't help when rebooting the router breaks the existing source NAT
mapping (on my local network) while the other end remembers where it last
got a packet from. The solution below dealt with that.
>
>> I found a way to make it work more automatically. The reason the port was
>> getting reassigned was because the original connection packet was being
>> tracked and was conflicting with the source nat mapping even though in
>> reallity the connection was the same. By putting in CT --notrack rules I was
>> able to block that traking and without the conflict the port doesn't get
>> remapped. I don't need tracking or the original connection for my firewall
>> rules so this should be OK. On testing it seems to work as expected. Now
>> when I reboot my router, my laptop reconnects and the wireguard tunnel works
>> without having to restart it.
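
A toy Python sketch of why keepalives alone do not recover from this particular failure: they keep going to the stale port, and nothing arrives from the router to correct the remembered endpoint. This is not WireGuard code; the port numbers and the 25 second interval are assumed for illustration.

```python
# Toy model: keepalives sent to a stale port; not real WireGuard code.
# Port numbers and the 25 second interval are assumed for illustration.
LISTEN_PORT = 51820   # port wg listens on at the router (assumed)
STALE_PORT = 1024     # remapped port the laptop last saw (assumed)


def router_accepts(dst_port, nat_ports):
    """A packet reaches wg only on the listen port or a live NAT mapping."""
    return dst_port == LISTEN_PORT or dst_port in nat_ports


def run_keepalives(endpoint_port, nat_ports, seconds, interval=25):
    """Send one keepalive per interval; return how many were delivered."""
    delivered = 0
    for _ in range(0, seconds, interval):
        if router_accepts(endpoint_port, nat_ports):
            delivered += 1
        # The remembered endpoint would only change if a packet arrived
        # from a new source, and the router never sends one here.
    return delivered


# After the reboot the router has no NAT mappings at all.
stale = run_keepalives(STALE_PORT, nat_ports=set(), seconds=600)
fresh = run_keepalives(LISTEN_PORT, nat_ports=set(), seconds=600)
print(stale, fresh)
```

Ten minutes of keepalives to the stale port deliver nothing in this model, while the same keepalives to the listen port all arrive, which is the case the notrack change produces.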