From: erik quanstrom
Date: Wed, 6 May 2015 20:35:28 -0700
To: 9fans@9fans.net
Subject: Re: [9fans] fossil+venti performance question

On Wed May 6 15:30:24 PDT 2015, charles.forsyth@gmail.com wrote:
> On 6 May 2015 at 22:28, David du Colombier <0intro@gmail.com> wrote:
>
> > Since the problem only happen when Fossil or vacfs are running
> > on the same machine as Venti, I suppose this is somewhat related
> > to how TCP behaves with the loopback.
> >
>
> Interesting. That would explain the clock-like delays.
> Possibly it's nearly zero RTT in initial exchanges and then when venti has
> to do some work, things time out. You'd think it would only lead to
> needless retransmissions not increased latency, but perhaps some
> calculation doesn't work properly with tiny values, causing one side to
> back off incorrectly.

i don't think that's possible.  NOW is defined as MACHP(0)->ticks, so this
is a pretty coarse timer that can't go backwards on intel processors.  this
limits the timer's resolution to HZ, which on 9atom is 1000, and 100 on
pretty much anything else.  further limiting the resolution are the tcp
retransmit timers, which according to presotto are

	/* bounded twixt 0.3 and 64 seconds */

so i really doubt the retransmit timers are resending anything.  if someone
has a system that isn't working right, please post

	/net/tcp//^(local remote status)

i'd like to have a look.

quoting steve stallion ...

> > Definitely interesting, and explains why I've never seen the regression
> > (I switched to a dedicated venti server a couple of years ago). Were
> > these the changes that erik submitted? ISTR him working on reno bits
> > somewhere around there...
>
> I don't think so. Someone else submitted a different set of tcp changes
> independently much earlier.

just for the record, the earlier changes were an incorrect partial
implementation of reno.  i implemented newreno from the specs, added
corrected window scaling, and removed the problem of window slamming.  we
spent a month going over cases from 50µs to 100ms rtt latency and showed
that we got near the theoretical max for all those cases.  (big thanks to
bruce wong for putting up with early, buggy versions.)

during the investigation of this i found that loopback *is* slow for
reasons i don't completely understand.  part of this was the terrible
scheduler.  as part of the gsoc work, we were able to make the nix
scheduler not howlingly terrible for 1-8 cpus.  this improvement depends
on the goodness of mcs locks.  i developed a version of this, but ended up
using charles' much cleaner version.  there remain big problems with the
tcp and ip stack.  it's really slow.  i can't get >400MB/s on ethernet.
it seems that the 3-way interaction between tcp:tx, tcp:rx and the
user-space queues is the issue.  queue locking is very wasteful as well.
i have some student code that addresses part of the latter problem, but it
smells to me like ip/tcp.c's direct calls between tx and rx are the real
issue.

- erik
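
p.s.  to make the timer argument concrete, here's a rough standalone
sketch.  it is my own model, not ip/tcp.c: the names and the srtt + 4*mdev
estimate are assumptions; only the HZ values and the 0.3-64 second clamp
come from the text above.  it shows why a coarse ticks clock can't push
the retransmit timer below its floor:

	/* rough model, not the kernel code: a coarse tick clock plus the rto clamp */
	#include <stdio.h>

	enum {
		HZ	= 100,		/* ticks per second on most kernels; 9atom uses 1000 */
		MinRto	= 300,		/* ms: the "0.3 seconds" floor */
		MaxRto	= 64000,	/* ms: the 64 second ceiling */
	};

	/* clamp an rto estimate the way the quoted comment describes */
	int
	settimer(int srtt, int mdev)
	{
		int rto;

		rto = srtt + 4*mdev;	/* conventional rto estimate; an assumption here */
		if(rto < MinRto)
			rto = MinRto;
		if(rto > MaxRto)
			rto = MaxRto;
		return rto;
	}

	int
	main(void)
	{
		/* a loopback rtt rounds down to 0 ticks (1000/HZ ms per tick), */
		/* but the clamp still leaves the retransmit timer at 300ms, */
		/* far too long to be fired by sub-millisecond jitter */
		printf("tick resolution: %dms\n", 1000/HZ);
		printf("rto for ~0ms rtt: %dms\n", settimer(0, 0));
		printf("rto for 100ms rtt: %dms\n", settimer(100, 25));
		return 0;
	}

even if the measured rtt collapses to zero on loopback, the timer never
drops below 300ms, so spurious retransmits would need 300ms stalls, not
microsecond ones.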
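
p.p.s.  for anyone who hasn't seen one, this is roughly the shape of an
mcs lock.  it's a from-scratch sketch in standard c11 atomics, not
charles' version or mine: each waiter spins on its own queue node instead
of hammering one shared word, which is the goodness the scheduler work
leaned on.

	/* a from-scratch mcs lock sketch in c11 atomics; not the plan 9 code */
	#include <stdatomic.h>
	#include <stddef.h>

	typedef struct QNode QNode;
	struct QNode {
		_Atomic(QNode*)	next;
		atomic_int	locked;
	};

	typedef struct {
		_Atomic(QNode*)	tail;	/* last waiter in line, or NULL if free */
	} MCS;

	void
	mcslock(MCS *l, QNode *n)
	{
		QNode *prev;

		atomic_store(&n->next, NULL);
		atomic_store(&n->locked, 1);
		prev = atomic_exchange(&l->tail, n);	/* join the queue */
		if(prev != NULL){
			atomic_store(&prev->next, n);	/* link behind the current holder */
			while(atomic_load(&n->locked))
				;			/* spin on our own node, not a shared word */
		}
	}

	void
	mcsunlock(MCS *l, QNode *n)
	{
		QNode *succ, *expect;

		succ = atomic_load(&n->next);
		if(succ == NULL){
			expect = n;
			/* nobody visible behind us: try to swing tail back to empty */
			if(atomic_compare_exchange_strong(&l->tail, &expect, NULL))
				return;
			/* a successor is mid-enqueue; wait for it to link itself in */
			while((succ = atomic_load(&n->next)) == NULL)
				;
		}
		atomic_store(&succ->locked, 0);		/* hand the lock directly to the successor */
	}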