From: Karl Dahlke
To: edbrowse-dev@edbrowse.org
Subject: Architecture
Date: Wed, 27 Jul 2022 13:47:03 -0400

Just a note / thought, in case I'm gone and you want to keep edbrowse going, and maybe modify it for the modern age.

Over 20 years ago I chose a design where the lines of a file are stored in an array. Not the text itself of course, since a line can be of arbitrary length, but the structure that describes each line. An alternate design is a linked list of lines. If the lines are short, this adds considerable overhead: a pointer to next and a pointer to previous for every line. Computers didn't have a lot of memory, and we could hold more lines, and larger files, with my simple array - still within the limits of memory, which was about half a gig. I still think that was a reasonable design circa 2000.

Now apply this to large files. First, some can't even be represented, given the limits of that array. For others, delete a line and you are moving millions of lines up in the array. That's fine for one line, but g/stuff/d is horribly inefficient, quadratic in the size of the file, and prohibitive in some situations. So I wrote mass delete, mass join, mass read, and mass substitute to address all this, and that's fine, but at some point we should step back and ask if we should redesign the representation using linked lists. We don't care about the pointer overhead now; we have RAM to burn. And deleting a line - just snap it out. We could throw away all that specialized mass-operation code. And large files are easier to manage.
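To make the cost argument concrete, here is a minimal sketch in Python, not edbrowse's actual C code. It counts the work done by g/stuff/d under three models: an array with one-at-a-time deletes, an array with a single compacting pass (roughly the idea behind a mass delete), and a doubly linked list. All function and class names are hypothetical and exist only for this illustration.

```python
"""Illustrative cost model only; names are hypothetical, not from edbrowse."""


def delete_one_at_a_time(lines, pattern):
    """Array model: every delete shifts the whole tail up by one slot."""
    lines = list(lines)
    moves = 0
    i = 0
    while i < len(lines):
        if pattern in lines[i]:
            moves += len(lines) - i - 1  # slots shifted by this delete
            del lines[i]
        else:
            i += 1
    return lines, moves


def mass_delete(lines, pattern):
    """Array model: one compacting pass copies each survivor at most once."""
    kept = []
    moves = 0
    for line in lines:
        if pattern not in line:
            kept.append(line)
            moves += 1
    return kept, moves


class Node:
    """One line in a doubly linked list: the text plus two pointers."""
    __slots__ = ("text", "prev", "next")

    def __init__(self, text):
        self.text = text
        self.prev = None
        self.next = None


def build_linked(lines):
    """Build a circular list around a sentinel; head.next is line 1."""
    head = Node(None)
    head.prev = head.next = head
    for text in lines:
        node = Node(text)
        tail = head.prev
        tail.next = node
        node.prev = tail
        node.next = head
        head.prev = node
    return head


def linked_delete(head, pattern):
    """Snap out every matching node; each unlink is two pointer writes."""
    writes = 0
    node = head.next
    while node is not head:
        nxt = node.next
        if pattern in node.text:
            node.prev.next = nxt
            nxt.prev = node.prev
            writes += 2
        node = nxt
    return writes


def linked_to_list(head):
    """Collect the remaining line texts in order."""
    out = []
    node = head.next
    while node is not head:
        out.append(node.text)
        node = node.next
    return out


if __name__ == "__main__":
    sample = ["stuff"] * 1000
    print("one at a time:", delete_one_at_a_time(sample, "stuff")[1], "moves")
    print("mass delete:  ", mass_delete(sample, "stuff")[1], "moves")
    print("linked list:  ", linked_delete(build_linked(sample), "stuff"), "pointer writes")
```

In this model, deleting all n lines one at a time costs n(n-1)/2 slot moves, while the compacting pass and the linked list both stay linear in n. The linked list gets there without a dedicated mass-operation routine.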
Now some things are slower, like accessing line 937241. You have to step through 937241 links to get there. But a computer can do that pretty fast, and it's linear in the size of the file. There would no longer be any operations that become quadratic in the size of the file. At the end of the day, it's usually the exponent that makes all the difference.

So after 3.8.3, if somebody has nothing better to do, ask whether a linked list redesign is appropriate, and how it might be accomplished. You know, rebuilding the ship while it is still sailing.

Karl Dahlke
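The access cost described above can be sketched as follows, in Python rather than edbrowse's C and with hypothetical names throughout. Reaching line n means following n links from a sentinel that plays the role of line 0, so the step count equals the line number.

```python
"""Illustrative sketch only; Node, build and seek are hypothetical names."""


class Node:
    """One line in a singly linked list (enough to show forward seeking)."""
    __slots__ = ("text", "next")

    def __init__(self, text):
        self.text = text
        self.next = None


def build(lines):
    """Return a sentinel node standing in for line 0; line 1 is head.next."""
    head = Node(None)
    tail = head
    for text in lines:
        tail.next = Node(text)
        tail = tail.next
    return head


def seek(head, lineno):
    """Follow lineno links from the sentinel; return (node, links followed)."""
    node = head
    steps = 0
    while steps < lineno:
        node = node.next
        if node is None:
            raise IndexError("line %d is past the end of the buffer" % lineno)
        steps += 1
    return node, steps


if __name__ == "__main__":
    head = build("line %d" % i for i in range(1, 1001))
    node, steps = seek(head, 937)
    print(node.text, "reached after", steps, "links")
```

A single seek is linear in the line number, which is the trade the message accepts in exchange for constant-time deletion.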