edbrowse-dev - development list for edbrowse
 help / color / mirror / Atom feed
* Architecture
@ 2022-07-27 17:47 Karl Dahlke
  0 siblings, 0 replies; only message in thread
From: Karl Dahlke @ 2022-07-27 17:47 UTC (permalink / raw)
  To: edbrowse-dev

[-- Attachment #1: Type: text/plain, Size: 1989 bytes --]

Just a note / thought, in case I'm gone and you want to keep edbrowse going, and maybe modify it for the modern age.

Over 20 years ago I chose a design where the lines of a file are stored in an array.
Not literally of course, a line can be of arbitrary length, but the structure that defines a line.
An alternate design is a link list of lines.
If the lines are short, this adds considerable overhead, pointer to next and previous.
Computers didn't have a lot of memory, and we could hold more lines, and larger files, with my simple array -
still within the limits of memory which was about half a gig.
I still think that was a reasonable design circa 2000.

Now apply this to large files, and first, some can't even be represented, with the limits of that array.
For others, delete a line and you are moving millions of lines up in the array.
That's fine for one line, but g/stuff/d is horribly inefficient, quadratic in the size of the file, and prohibitive in some situations.
So I wrote mass delete, mass join, mass read, mass substitute, to address all this, and that's fine,
but at some point we should step back and ask if we should redesign the representation using link lists.
We don't care about the pointer overhead now, we have ram to burn.
And deleting a line - just snap it out.
We could throw away all that specialized mass-operation code.
And large files are easier to manage.
Now some things are slower, like accessing line 937241.
You have to step through 937241 links to get there.
But a computer can do that pretty fast, and it's linear in the size of the file.
There aren't any operations, any more, that become quadratic in the size of the file.
At the end of the day, it's usually the exponent that makes all the difference.

So after 3.8.3, if somebody has nothing better to do,
ask whether a link list redesign is appropriate, and how it might be accomplished.
You know, rebuilding the ship while it is still sailing.

Karl Dahlke

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2022-07-27 17:47 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-27 17:47 Architecture Karl Dahlke

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).