[Edbrowse-dev] Setters and Post Scan

From: Karl Dahlke <eklhad@comcast.net>
To: Edbrowse-dev@lists.the-brannons.com
Subject: [Edbrowse-dev] Setters and Post Scan
Date: Tue, 11 Mar 2014 16:59:10 -0400
Message-ID: <20140211165910.eklhad@comcast.net>

> > perhaps growing as n^2 for the number of html tags,

> Out of interest, how do you arrive at this figure?

Mostly a gut feeling.
As background, consider my garbage collector for lines in a buffer.
This is just edits, no browsing.
Make a change, then make another change, so that there are now some lines
that you won't need even if you undo.
How do I determine and free those lines?
By comparing two maps of pointers.
The pointers in A, not in B, are not needed and can be freed.
The first version was simple and stupid, looking for things in A not in B,
and ran in n^2 time.
This didn't matter for a few years, until I started editing some really big files.
You also have started editing some large files.
If a file has a million lines, that's a trillion operations per edit.
I enter a substitute command and wait 2 minutes for a response.
I don't think so!
So a few years ago I rewrote it using quicksort and an internal form of comm,
and it is plenty quick, even for large files.
Look for qsort() in buffers.c.
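
The sort-then-walk idea described above can be sketched roughly as follows.
This is a minimal illustration, not the code in buffers.c; the names
only_in_a and cmp_ptr are hypothetical. Two sorts cost about n log n each
and the walk is linear, so a million lines needs on the order of tens of
millions of comparisons rather than a trillion.

```c
#include <stdint.h>
#include <stdlib.h>

/* Order two pointers by address. Converting to uintptr_t avoids relational
 * comparison of pointers that point into different objects. */
static int cmp_ptr(const void *x, const void *y)
{
    uintptr_t a = (uintptr_t)*(void *const *)x;
    uintptr_t b = (uintptr_t)*(void *const *)y;
    return (a > b) - (a < b);
}

/* Sort both maps, then walk them in step, the way comm(1) walks two sorted
 * files. Pointers present in a[] but not in b[] are written to out[], which
 * must have room for na entries. Returns the number written.
 * Both input arrays are reordered in place. */
size_t only_in_a(void **a, size_t na, void **b, size_t nb, void **out)
{
    size_t i = 0, j = 0, n = 0;

    qsort(a, na, sizeof(void *), cmp_ptr);
    qsort(b, nb, sizeof(void *), cmp_ptr);

    while (i < na) {
        if (j == nb) {              /* b is exhausted: the rest of a is unique */
            out[n++] = a[i++];
            continue;
        }
        int c = cmp_ptr(&a[i], &b[j]);
        if (c < 0)
            out[n++] = a[i++];      /* a[i] sorts before b[j]: not in b */
        else if (c > 0)
            j++;                    /* b[j] is not in a: skip it */
        else
            i++;                    /* present in both: keep it */
    }
    return n;
}
```

The caller would then free whatever lands in out[].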

Now fast forward to today; I am scanning a tree of js objects
and comparing the tags that are implied by those objects with the tags we already have,
and doing, essentially, a diff, so I can report to the user what has changed,
which a sighted person would simply see as a change to something on the screen.
It is very likely that there are quick ways to run this comparison,
but if there absolutely aren't,
and if a web page rewrote itself in a nasty way, say by changing every other line,
then maybe the comparison would be as awful as n^2.
But as I say I expect there will be better algorithms
once we really dive into it,
and I'm quite sure that web pages only change small portions of themselves,
the text in various boxes, or menus, etc,
and don't completely rewrite themselves,
so that's why I don't think it will be a performance issue,
even if we implement it in a rather direct and plodding way.
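
One way to see why small changes stay cheap: strip the tags that match at
the start and at the end of the two lists, and only the middle is left to
compare. The sketch below assumes each tag has been reduced to an int key
(say a hash of its name and text); that representation, the struct, and the
function name are all hypothetical and not taken from edbrowse. Both passes
are linear, so even an n^2 comparison would then run only over the part of
the page that changed.

```c
#include <stddef.h>

/* The region left over after removing the common prefix and suffix. */
struct changed_span {
    size_t start;    /* first index that differs, the same in both arrays */
    size_t old_len;  /* number of differing tags in old[] */
    size_t new_len;  /* number of differing tags in cur[] */
};

struct changed_span find_changed_span(const int *old, size_t n_old,
                                      const int *cur, size_t n_cur)
{
    size_t pre = 0, suf = 0;

    /* Count matching tags from the front. */
    while (pre < n_old && pre < n_cur && old[pre] == cur[pre])
        pre++;

    /* Count matching tags from the back, without crossing the prefix. */
    while (suf < n_old - pre && suf < n_cur - pre &&
           old[n_old - 1 - suf] == cur[n_cur - 1 - suf])
        suf++;

    struct changed_span s = { pre, n_old - pre - suf, n_cur - pre - suf };
    return s;
}
```

If a page replaces one tag out of thousands, old_len and new_len come back
as 1 or close to it, whatever the size of the page.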

Karl Dahlke

