Development discussion of WireGuard
* Connection hangs over CGNAT (Starlink)
@ 2022-12-16  2:12 Nikolay Martynov
  2022-12-17 22:15 ` Szymon Nowak
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Nikolay Martynov @ 2022-12-16  2:12 UTC (permalink / raw)
  To: wireguard

Hi!

I'm experiencing strange behaviour with WireGuard: from time to time
the connection 'freezes'.
Most often I observe this on an Android phone when connected from
my home over Starlink.
Server: latest OpenWrt. Client: latest Android app.
The connection is established and works fine for some time. After a
while the client still shows the connection as established, but no
incoming data arrives.
On the server side, 'latest handshake' grows to hours or days.
The freeze happens randomly, for no apparent reason, and I think only
over Starlink. I do not think I have ever observed this problem on
cell networks.

Reconnection solves the problem immediately.
I did some tcpdumping when the problem was present and found the following:
* Server side sees incoming traffic from the client and sends responses.
* On my own router connected to Starlink (i.e. interface between my
router and Starlink router) I see data going from the client to the
server - but no packets coming back.
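
A capture of the kind described in the two points above could be taken
roughly as follows. This is only a sketch: the UDP port 51820, the
`--run` switch, and the helper name wg_capture_filter are assumptions
for illustration and do not come from the report.

```shell
#!/bin/sh
# Sketch: capture WireGuard UDP traffic so that what the server sends can
# be compared with what arrives on the client-side router.
# Assumptions: the server listens on UDP 51820; the interface name and
# server address are supplied by the caller.

# Build a tcpdump filter matching traffic to or from the server's
# WireGuard port.
wg_capture_filter() {
    echo "udp and host $1 and port ${2:-51820}"
}

# Usage: script --run <interface> <server-ip>
# tcpdump needs root, so nothing is captured unless --run is given.
if [ "${1:-}" = "--run" ]; then
    # -n: do not resolve names, -i: capture interface
    tcpdump -n -i "$2" "$(wg_capture_filter "$3")"
fi
```

Running the same filter on the server and on the router in front of the
Starlink router shows which direction loses packets.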

So my hypothesis is that Starlink's CGNAT somehow 'forgets' one side
of the connection, so data continues to flow in one direction but
never comes back. WireGuard does not appear to change its outgoing
port when it attempts another handshake, which means it keeps using
the same 'half-broken' connection forever.

I think the same has happened to me at least once on a Linux client.
The difference is that the phone is always on, so its connection
lasts much longer.

I tried experimenting with keepalive messages, but they seem to make
no difference. Once the connection freezes, I see keepalives arriving
at the server and the server sending replies, but those replies never
reach the client.
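
For reference, on a Linux client with wireguard-tools the keepalive
interval can be set per peer with `wg set ... persistent-keepalive`.
The sketch below assumes an interface, peer key and interval passed on
the command line; the helper valid_keepalive and the `--run` switch are
hypothetical. As reported above, this setting did not prevent the
freeze.

```shell
#!/bin/sh
# Sketch: enable persistent keepalives on a Linux client (wireguard-tools).
# Assumptions: interface, peer public key and interval come from the caller.

# Succeed if the argument is an integer in 0..65535 (0 disables keepalives).
valid_keepalive() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "${#1}" -le 5 ] && [ "$1" -le 65535 ]
}

# Usage: script --run <interface> <peer-public-key> <seconds>
if [ "${1:-}" = "--run" ]; then
    valid_keepalive "$4" || { echo "bad interval: $4" >&2; exit 1; }
    wg set "$2" peer "$3" persistent-keepalive "$4"
fi
```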

It looks like the solution to this problem would be for the client to
use a different outgoing port when sending a handshake but I was not
able to find an option for that.

Is this something that is possible to do?
Thanks!
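
One possible workaround on a Linux client (not an existing WireGuard
option) is a small watchdog that rebinds the local UDP port when the
last handshake gets too old, so that the NAT has to create a fresh
mapping. The sketch below makes several assumptions: interface wg0, a
180 second threshold, the port range 40000-59999, and the helper names
are all illustrative. It only makes sense when keepalives or steady
traffic keep handshakes recurring.

```shell
#!/bin/sh
# Sketch of a client-side watchdog (Linux, wireguard-tools).
# Assumptions: interface wg0, 180 s staleness threshold, ports 40000-59999.

# handshake_is_stale <now> <last-handshake> <max-age>
# Succeeds if the last handshake is older than max-age seconds.
# A timestamp of 0 means no handshake has happened yet and counts as stale.
handshake_is_stale() {
    [ "$2" -eq 0 ] || [ $(( $1 - $2 )) -gt "$3" ]
}

# Print a pseudo-random port in 40000..59999.
pick_port() {
    awk 'BEGIN { srand(); print 40000 + int(rand() * 20000) }'
}

# check_once <interface> <max-age>
# Rebind the interface to a new local port if any peer's handshake is stale.
check_once() {
    now=$(date +%s)
    # `wg show <if> latest-handshakes` prints "<public-key> <unix-time>" per peer.
    wg show "$1" latest-handshakes | while read -r _peer last; do
        if handshake_is_stale "$now" "$last" "$2"; then
            wg set "$1" listen-port "$(pick_port)"
            break
        fi
    done
}

# Usage: script --run [interface] [max-age-seconds]
if [ "${1:-}" = "--run" ]; then
    check_once "${2:-wg0}" "${3:-180}"
fi
```

Run from cron or a systemd timer, this changes the source port only
when the tunnel already looks dead, so a healthy session is left alone.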


-- 
Martynov Nikolay.
Email: mar.kolya@gmail.com


* Re: Connection hangs over CGNAT (Starlink)
  2022-12-16  2:12 Connection hangs over CGNAT (Starlink) Nikolay Martynov
@ 2022-12-17 22:15 ` Szymon Nowak
  2023-01-10  9:44   ` reox
  2022-12-18 13:24 ` Lonnie Abelbeck
  2022-12-19  3:41 ` Dean Davis
  2 siblings, 1 reply; 7+ messages in thread
From: Szymon Nowak @ 2022-12-17 22:15 UTC (permalink / raw)
  To: Nikolay Martynov; +Cc: wireguard

I've noticed the same thing on a Windows client. It happens when the
machine has internet from two sources, e.g. Wi-Fi and a mobile network,
or Wi-Fi and LAN on a computer, and one of these sources has a problem
and loses internet access. WireGuard then stops working even though
the tunnel is not torn down. Completely disconnecting the faulty
connection and reconnecting the tunnel solves the problem. I don't
know why traffic stops even though the tunnel is still connected.


* Re: Connection hangs over CGNAT (Starlink)
  2022-12-16  2:12 Connection hangs over CGNAT (Starlink) Nikolay Martynov
  2022-12-17 22:15 ` Szymon Nowak
@ 2022-12-18 13:24 ` Lonnie Abelbeck
  2022-12-19  3:41 ` Dean Davis
  2 siblings, 0 replies; 7+ messages in thread
From: Lonnie Abelbeck @ 2022-12-18 13:24 UTC (permalink / raw)
  To: Nikolay Martynov; +Cc: wireguard

Have you tried reducing the MTU of the WG tunnel?

I have a similar use case with a WG tunnel over a T-Mobile Home Internet (TMHI) CGNAT network.

After some testing to determine the reduced MTU of the TMHI network, I set the MTU of the WG endpoints to 1340.

The WG tunnel has been rock solid.
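
As a rough guide to choosing such a value: WireGuard adds 60 bytes per
packet when the outer transport is IPv4 (20 IP + 8 UDP + 32 WireGuard)
and 80 bytes when it is IPv6, so the tunnel MTU is the path MTU minus
that overhead. The helper name wg_tunnel_mtu and the `--run` switch in
this sketch are hypothetical, and the path MTU has to be measured
separately.

```shell
#!/bin/sh
# Sketch: derive a WireGuard tunnel MTU from a measured path MTU.
# Assumption: the path MTU is already known (for example from probing with
# "ping -M do -s <size>" on Linux).

# wg_tunnel_mtu <path-mtu> <outer-ip-family: 4 or 6>
wg_tunnel_mtu() {
    case "$2" in
        4) echo $(( $1 - 60 )) ;;   # 20 IPv4 + 8 UDP + 32 WireGuard
        6) echo $(( $1 - 80 )) ;;   # 40 IPv6 + 8 UDP + 32 WireGuard
        *) echo "outer family must be 4 or 6" >&2; return 1 ;;
    esac
}

# Usage: script --run <interface> <path-mtu> <4|6>
if [ "${1:-}" = "--run" ]; then
    mtu=$(wg_tunnel_mtu "$3" "$4") || exit 1
    ip link set dev "$2" mtu "$mtu"
fi
```

With a standard 1500-byte path and IPv6 outer transport this gives
1420, the usual WireGuard default; a lower path MTU gives
correspondingly lower values.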

Lonnie



* Re: Connection hangs over CGNAT (Starlink)
  2022-12-16  2:12 Connection hangs over CGNAT (Starlink) Nikolay Martynov
  2022-12-17 22:15 ` Szymon Nowak
  2022-12-18 13:24 ` Lonnie Abelbeck
@ 2022-12-19  3:41 ` Dean Davis
  2022-12-31  1:08   ` Nikolay Martynov
  2 siblings, 1 reply; 7+ messages in thread
From: Dean Davis @ 2022-12-19  3:41 UTC (permalink / raw)
  To: Nikolay Martynov, wireguard

hi


same issue


I run a kind of automated ping test on the server side, and if it
fails too many times a bash script brings the interface down and
back up with

nmcli connection down wg0 && nmcli connection up wg0


To me it looks like a connection state issue, i.e. a difference
between new and established/related (cannot confirm).


ugly but works for me 
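
A watchdog of this kind might look like the sketch below. It is not the
actual script: the peer address 10.0.0.2, the threshold of 3 failures,
the 30 second interval, and the function names are assumptions; only
the nmcli command is taken from the message above.

```shell
#!/bin/sh
# Sketch of a ping watchdog that bounces the WireGuard connection.
# Assumptions: peer tunnel address, failure threshold and interval are
# placeholders; the connection is managed by NetworkManager.

# next_failures <current-count> <ping-exit-status>
# Print 0 after a successful ping, otherwise the count plus one.
next_failures() {
    if [ "$2" -eq 0 ]; then
        echo 0
    else
        echo $(( $1 + 1 ))
    fi
}

# watchdog <peer-address> <connection> <max-failures> <interval-seconds>
watchdog() {
    failures=0
    while :; do
        ping -c 1 -W 5 "$1" >/dev/null 2>&1
        rc=$?
        failures=$(next_failures "$failures" "$rc")
        if [ "$failures" -ge "$3" ]; then
            nmcli connection down "$2" && nmcli connection up "$2"
            failures=0
        fi
        sleep "$4"
    done
}

# Usage: script --run [peer-address] [connection] [max-failures] [interval]
if [ "${1:-}" = "--run" ]; then
    watchdog "${2:-10.0.0.2}" "${3:-wg0}" "${4:-3}" "${5:-30}"
fi
```

Counting consecutive failures rather than reacting to a single lost
ping avoids bouncing the tunnel on ordinary packet loss.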



regards
dean



* Re: Connection hangs over CGNAT (Starlink)
  2022-12-19  3:41 ` Dean Davis
@ 2022-12-31  1:08   ` Nikolay Martynov
  2023-01-10 10:55     ` Nikolay Kichukov
  0 siblings, 1 reply; 7+ messages in thread
From: Nikolay Martynov @ 2022-12-31  1:08 UTC (permalink / raw)
  To: Dean Davis; +Cc: wireguard

Hi!

FWIW, reducing the MTU doesn't seem to help. Also, I almost never
experience this on a cell network, but on Starlink it sometimes
happens multiple times in an hour.
At any rate, the problem can easily be simulated with iptables. If
for whatever reason packets are dropped on the return path of an
established connection, a new connection (with a new source port) is
never established, so the connection effectively hangs forever. I do
not want to sound too presumptuous, but this seems like a clear bug
to me.
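
A simulation along these lines could be set up on a Linux client as
sketched below. The exact rules used are not given in this message, so
the server address, the port 51820, the `--block`/`--unblock` switches,
and the helper names are assumptions.

```shell
#!/bin/sh
# Sketch: simulate a broken return path by dropping the server's UDP
# replies on the client (needs root).
# Assumptions: the server's WireGuard port is 51820 unless given.

# return_path_rule <server-ip> [server-port]
# Print the iptables match that selects packets coming back from the server.
return_path_rule() {
    echo "-p udp -s $1 --sport ${2:-51820} -j DROP"
}

# Usage: script --block <server-ip> [port]    start dropping replies
#        script --unblock <server-ip> [port]  remove the rule again
case "${1:-}" in
    --block)
        # Word splitting of the rule string is intentional here.
        iptables -I INPUT $(return_path_rule "$2" "${3:-51820}")
        ;;
    --unblock)
        iptables -D INPUT $(return_path_rule "$2" "${3:-51820}")
        ;;
esac
```

While the rule is in place, `wg show` on the client can be used to
watch whether the handshake ever recovers.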



-- 
Martynov Nikolay.
Email: mar.kolya@gmail.com


* Re: Connection hangs over CGNAT (Starlink)
  2022-12-17 22:15 ` Szymon Nowak
@ 2023-01-10  9:44   ` reox
  0 siblings, 0 replies; 7+ messages in thread
From: reox @ 2023-01-10  9:44 UTC (permalink / raw)
  To: wireguard

Interesting, because I was going to post a similar question here, but
I would not have thought about multiple networks.
For me, this happens on my Android phone, where I use WG to route DNS
traffic to my own server.
In some Wi-Fi networks, after some time (sometimes only minutes,
sometimes hours), K9mail for example is unable to fetch mail,
presumably because it runs into a DNS timeout (it cannot be a
connection timeout to the mail server, because that traffic is not
routed via WG).
Toggling Wi-Fi, switching to the mobile network only, or toggling
WireGuard solves the issue.
However, I have also had this problem, rarely, in other Wi-Fi networks.
At first I thought the Wi-Fi network itself was the culprit, but since
it also happens elsewhere, I now think it might be a combination of a
specific Wi-Fi setup and my server setup.
However, I have no idea how to debug this, especially as DNS
requests using Termux and dig work flawlessly even while K9mail
hangs.
Using KeepAlive has not helped so far.
I wanted to debug this further by running tcpdump on the server, but
unfortunately I currently have no Wi-Fi network where I can trigger
this bug reliably. In my own Wi-Fi it happens every now and then,
typically only once a month or less...

Best,
Sebastian


* Re: Connection hangs over CGNAT (Starlink)
  2022-12-31  1:08   ` Nikolay Martynov
@ 2023-01-10 10:55     ` Nikolay Kichukov
  0 siblings, 0 replies; 7+ messages in thread
From: Nikolay Kichukov @ 2023-01-10 10:55 UTC (permalink / raw)
  To: Nikolay Martynov, Dean Davis; +Cc: wireguard

Hi folks,

I am seeing something similar (though it looks like a different
underlying problem) on an iOS 12 device running the latest WireGuard
application, using a cellular (LTE) network for the VPN tunnel.

Very randomly, after anywhere from a couple of days to at most 4 days
of the VPN being active, the server peer stops receiving keepalive
packets because the tunnel has become inactive in the WireGuard
client. It then requires manual interaction to re-enable it, after
which it connects instantly.

tcpdump on the server peer does not show any incoming traffic when
it happens, so it is most likely an iOS client component crash.

Cheers,
-N

