From mboxrd@z Thu Jan 1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org
X-Spam-Level:
X-Spam-Status: No, score=-0.0 required=5.0 tests=T_SCC_BODY_TEXT_LINE
	autolearn=ham autolearn_force=no version=3.4.4
Received: (qmail 17942 invoked from network); 13 Feb 2022 03:58:26 -0000
Received: from 4ess.inri.net (216.126.196.42)
	by inbox.vuxu.org with ESMTPUTF8; 13 Feb 2022 03:58:26 -0000
Received: from mimir.eigenstate.org ([206.124.132.107])
	by 4ess; Sat Feb 12 22:50:26 -0500 2022
Received: from abbatoir.myfiosgateway.com (pool-74-108-56-225.nycmny.fios.verizon.net [74.108.56.225])
	by mimir.eigenstate.org (OpenSMTPD) with ESMTPSA id dee2f5a1 (TLSv1.2:ECDHE-RSA-AES256-SHA:256:NO)
	for <9front@9front.org>; Sat, 12 Feb 2022 19:49:58 -0800 (PST)
Message-ID: <64111FCCFEB91FC5434307463B7BAFDE@eigenstate.org>
To: 9front@9front.org
Date: Sat, 12 Feb 2022 22:49:56 -0500
From: ori@eigenstate.org
In-Reply-To: <5628E6D857F1F7842D6A50FE216B3316@eigenstate.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
List-ID: <9front.9front.org>
List-Help:
X-Glyph: ➈
X-Bullshit: API ORM over ACPI content-driven proxy factory table database
Subject: Re: [9front] git: tune deltification
Reply-To: 9front@9front.org
Precedence: bulk

Quoth ori@eigenstate.org:
> Dropping the chunk size reduces pack sizes
> by about 15%, from 120 megs to 100.
>
> Replacing sha1 with murmurhash3 when hashing
> deltas drops the time to repack the 9front
> repo by about 20 seconds.

Updated, changing u64 -> u32 where appropriate.

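For readers wondering why the chunk parameters matter: below is a minimal sketch of gear-hash chunking, under the assumption that delta.c splits a chunk once at least Minchunk bytes have been consumed and the rolling hash has all Splitmask bits clear, and that it forces a split at Maxchunk. The splitting loop itself is not part of this patch, so that is a guess at its shape. The names nextchunk, gearinit and gear are hypothetical, and the table is filled from a xorshift generator rather than the real geartab values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	Minchunk	= 32,	/* values taken from the patch */
	Maxchunk	= 8192,
	Splitmask	= 0x7f,
};

/* Hypothetical stand-in for geartab: 256 pseudo-random words. */
static uint32_t gear[256];

static void
gearinit(void)
{
	uint32_t x = 2463534242u;
	int i;

	for(i = 0; i < 256; i++){
		/* xorshift32 step */
		x ^= x << 13;
		x ^= x >> 17;
		x ^= x << 5;
		gear[i] = x;
	}
}

/*
 * Return the length of the chunk that starts at s.
 * e points one past the end of the data.
 */
static size_t
nextchunk(const unsigned char *s, const unsigned char *e)
{
	const unsigned char *p;
	uint32_t gh;

	assert(s <= e);
	if((size_t)(e - s) <= Minchunk)
		return e - s;		/* short tail: one final chunk */
	if((size_t)(e - s) > Maxchunk)
		e = s + Maxchunk;	/* never emit more than Maxchunk bytes */
	gh = 0;
	for(p = s + Minchunk; p < e; ){
		gh = (gh << 1) + gear[*p++];
		if((gh & Splitmask) == 0)
			break;		/* content-defined split point */
	}
	return p - s;
}
```

If the hash bits are close to uniform, each byte past Minchunk ends the chunk with probability about 1/(Splitmask+1). Under that assumption the expected chunk is near Minchunk + Splitmask + 1 bytes: roughly 160 with the new values, against roughly 384 with Minchunk = 128 and Splitmask = (1<<8)-1. Smaller chunks give the delta code more candidate matches, which fits the smaller packs reported above.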
diff 2367a2aeaec8432e6b059135e49c2fa86e415ae5 uncommitted
--- a/sys/src/cmd/git/delta.c
+++ b/sys/src/cmd/git/delta.c
@@ -4,10 +4,9 @@
 #include "git.h"
 
 enum {
-	Minchunk = 128,
+	Minchunk = 32,
+	Splitmask = 0x7f,
 	Maxchunk = 8192,
-	Splitmask = (1<<8)-1,
-
 };
 
 static u32int geartab[] = {
@@ -45,16 +44,47 @@
 	0x9984a4f4, 0xd5de43cc, 0xd294daed, 0xbecba2d2, 0xf1f6e72c, 0x5551128a, 0x83af87e2, 0x6f0342ba,
 };
 
-static u64int
-hash(void *p, int n)
+/* murmurhash3 */
+u32int
+hash(void *ptr, int len)
 {
-	uchar buf[SHA1dlen];
-	sha1((uchar*)p, n, buf, nil);
-	return GETBE64(buf);
+	u32int h, k, s;
+	uchar *p;
+	int i;
+
+	/* Read in groups of 4. */
+	h = 2928213749ul;
+	p = ptr;
+	for (i = len >> 2; i; i--) {
+		k = *(u32int*)p;
+		s = k * 0xcc9e2d51;
+		s = (s << 15) | (s >> 17);
+		h ^= s*0x1b873593;
+		h = (h << 13) | (h >> 19);
+		h = h * 5 + 0xe6546b64;
+		p += 4;
+	}
+	/* Read the rest. */
+	k = 0;
+	for (i = len & 3; i; i--) {
+		k <<= 8;
+		k |= p[i - 1];
+	}
+	s = k * 0xcc9e2d51;
+	s = (s << 15) | (s >> 17);
+	h ^= s*0x1b873593;
+	/* Finalize. */
+	h ^= len;
+	h ^= h >> 16;
+	h *= 0x85ebca6b;
+	h ^= h >> 13;
+	h *= 0xc2b2ae35;
+	h ^= h >> 16;
+	return h;
 }
 
 static void
-addblk(Dtab *dt, void *buf, int len, int off, u64int h)
+addblk(Dtab *dt, void *buf, int len, int off, u32int h)
 {
 	int i, sz, probe;
 	Dblock *db;
@@ -88,7 +118,7 @@
 lookup(Dtab *dt, uchar *p, int n)
 {
 	int probe;
-	u64int h;
+	u32int h;
 
 	h = hash(p, n);
 	for(probe = h % dt->sz; dt->b[probe].buf != nil; probe = (probe + 1) % dt->sz){
@@ -127,7 +157,7 @@
 dtinit(Dtab *dt, Object *obj)
 {
 	uchar *s, *e;
-	u64int h;
+	u32int h;
 	vlong n, o;
 
 	o = 0;
--- a/sys/src/cmd/git/git.h
+++ b/sys/src/cmd/git/git.h
@@ -183,7 +183,7 @@
 	uchar *buf;
 	int len;
 	int off;
-	u64int hash;
+	u32int hash;
 };
 
 struct Delta {
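
For anyone who wants to try the hash outside the tree, here is a standalone sketch of 32-bit murmurhash3 in portable C. It is not the code from the patch. The name murmur3_32 and the explicit seed parameter are additions for illustration (the patch hardcodes 2928213749 as the starting value), and the sketch assembles each 4-byte group byte by byte in little-endian order rather than loading it through a u32int pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-word mixing step shared by the main loop and the tail. */
static uint32_t
scramble(uint32_t k)
{
	k *= 0xcc9e2d51u;
	k = (k << 15) | (k >> 17);
	k *= 0x1b873593u;
	return k;
}

/* 32-bit murmurhash3 of len bytes at data, starting from seed. */
uint32_t
murmur3_32(const void *data, size_t len, uint32_t seed)
{
	const unsigned char *p = data;
	uint32_t h = seed, k;
	size_t i;

	assert(data != NULL || len == 0);

	/* Read in groups of 4, assembling each word little-endian. */
	for(i = len >> 2; i != 0; i--){
		k = (uint32_t)p[0] | (uint32_t)p[1] << 8
			| (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
		h ^= scramble(k);
		h = (h << 13) | (h >> 19);
		h = h*5 + 0xe6546b64u;
		p += 4;
	}

	/* Read the remaining 0 to 3 bytes, lowest address in the low bits. */
	k = 0;
	for(i = len & 3; i != 0; i--)
		k = (k << 8) | p[i - 1];
	h ^= scramble(k);

	/* Finalize. */
	h ^= (uint32_t)len;
	h ^= h >> 16;
	h *= 0x85ebca6bu;
	h ^= h >> 13;
	h *= 0xc2b2ae35u;
	h ^= h >> 16;
	return h;
}
```

The byte-wise load gives the same value on any byte order and never performs an unaligned word access. The pointer cast in the patch is shorter, but its result depends on the machine's byte order and it assumes unaligned loads are tolerated. Since these hashes only index an in-memory table during a repack, the byte-order difference should not affect the packs that come out.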