From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <9front-bounces@9front.inri.net>
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org
X-Spam-Level: 
X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIM_SIGNED,DKIM_VALID,
	DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI
	autolearn=ham autolearn_force=no version=3.4.4
Received: from 9front.inri.net (9front.inri.net [168.235.81.73])
	by inbox.vuxu.org (Postfix) with ESMTP id 301D12163E
	for <ml@inbox.vuxu.org>; Fri, 29 Mar 2024 17:03:40 +0100 (CET)
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Received: from layka.disroot.org ([178.21.23.139]) by 9front; Fri Mar 29 12:02:21 -0400 2024
Received: from localhost (localhost [127.0.0.1])
	by disroot.org (Postfix) with ESMTP id 92CF140DB0
	for <9front@9front.org>; Fri, 29 Mar 2024 17:02:18 +0100 (CET)
X-Virus-Scanned: SPAM Filter at disroot.org
Received: from layka.disroot.org ([127.0.0.1])
	by localhost (disroot.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id qUcQ3XoWqaOl for <9front@9front.org>;
	Fri, 29 Mar 2024 17:02:17 +0100 (CET)
Date: Fri, 29 Mar 2024 17:02:15 +0100
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=disroot.org; s=mail;
	t=1711728137; bh=whRJQESnqvNTcYAcBhHnAORMQHwDnGTVqHmRbb7Vjp0=;
	h=Date:To:Subject:From:References:In-Reply-To;
	b=dT4/ffOO8DX3jwceu7ho8aOqkORID5RwLFfuUDXA0kvy8tkxDzhWKfBELg58IsY9e
	 YuTlxiCjG0vkHPoxtPDBCy9xObN3+oOXnT/0Jp6R9GJ08fWkzBQzQIJiwBpep5SEGn
	 09AiAgejB3/o5mVLcQ9yJXslFI6KeSffoS2IL4e+OkEnxt3mwtbJ1t2axcGesXa2Z3
	 iXJnAvuax5ixYiMNZZYvGFa0D1HTK2uB64izbdhUZxXIR7bDUi7NniUCbdsv/i2BVh
	 GyIrc3mVNJbPjkAyfd7/8msOc6wJt5pPjj9RnA3M3CaKK13+cGdI4XFfyTImyE7m9H
	 kMQ9ZkQl766KA==
To: 9front@9front.org
From: Eolien55 <eolien55@disroot.org>
References: <505C905934B3CDB74EC21F69ADF03150@wopr.sciops.net>
In-Reply-To: <505C905934B3CDB74EC21F69ADF03150@wopr.sciops.net>
Message-Id: <32D57IH2WC1B8.2D6VU4LLQOX0C@e55-lap.my.domain>
List-ID: <9front.9front.org>
List-Help: <http://lists.9front.org>
X-Glyph: ➈
X-Bullshit: optimized scale-out firewall content-driven-aware generator 
Subject: Re: [9front] [PATCH] libc: replace lrand's algorithm
Reply-To: 9front@9front.org
Precedence: bulk

qwx@sciops.net wrote:
> On Fri Mar 29 09:19:39 +0100 2024, eolien55@disroot.org wrote:
> Here `state' is 64-bit, whereas in your code, it's a ulong,
> ie. 32-bit, which seems to break everything.

I've appeared to make a rookie mistake. It should obviously be
a u64int.

> Furthermore, your code seems to have flipped part of the last
> statement and return the equivalent of:
> 
> 	return (xorshifted >> ((-rot) & 31)) | (xorshifted << rot);
> 
> Is that intentional?

Well it is a right rotate instead of a left rotate. I don't believe
it changes anything relevant. Maybe it does, but I'm not sure.
Anyway it should probably indeed be changed to avoid such "puzzled
moments" (code is more often read than written).

> Moreover, for the benchmark itself, while seeding is significantly
> faster, actually calling lrand barely is; the new code is only
> 1.07x faster than the old.  The scale is also minuscule, around
> 42 seconds for circa 4.3 billion calls (you're looping on a signed
> long from 0 to -1).  Am I misreading?

You are not misreading. The main issues of the current algorithm are
not so much ones of speed (von Neumann's middle square algorithm is
incredibly fast but has extremely poor quality), but ones of state
size and statistical quality. Additive Lagged Fibonacci Generators
(ALFG) such as this one, have bad statistical properties, uncovered by
both mathematical analysis and empirical testing.

The scale is indeed miniscule, and the performance benefit arguably
small, but I do not think that performance alone should decide.

> lrand is used by rand(2) itself and is used throughout the system
> and elsewhere, and who knows what impact changing it might have.

I looked through every usage of lrand-related functions in the codebase,
and it seems that all programs use it simply as "a uniform pseudo-random
number generator", without other expectations on the output (things
that would vary from algorithm to algorithm). However the manpage should
be changed to reflect the algorithm change.

However, through looking at the full source tree, it appears there
are at least 3 lrand duplicates in the source tree, each with its
own algorithm: one for the kernel, in /sys/src/9/port/random.c, which
uses xoroshiro128+, which is quite good; one for normal userspace in
/sys/src/libc/lrand.c, which uses the aforementioned ALFG, which is
quite suboptimal; and one for ape in /sys/src/lib/v/rand.c, which
seemingly points to this[1] article. The algorithm is not in common
use, so maybe it should be changed to one whose properties are
better-known?

Given that kernel's lrand is xoroshiro128+, it would make sense to
either change kernel's lrand to PCG too, or change the proposed
algorithm to xoroshiro128+.

> Applications which need guarantees about distribution or
> performance are not supposed to use rand(2) to begin with.

Well in that case, some code in /sys/src/cmd needs to be changed
to not use rand, like spin and venti.

Cheers,
Elie Le Vaillant

[1] https://dl.acm.org/doi/pdf/10.1145/63039.63042