edbrowse-dev - development list for edbrowse
* [Edbrowse-dev]  Setters and Post Scan
@ 2014-03-11 15:26 Karl Dahlke
  2014-03-11 20:33 ` Adam Thompson
  0 siblings, 1 reply; 6+ messages in thread
From: Karl Dahlke @ 2014-03-11 15:26 UTC (permalink / raw)
  To: Edbrowse-dev

> I think so. I don't think it would really be feasible to handle a fully
> dynamic page in setters. How expensive is this going to be in terms of
> runtime?

In theory it could be bad, in a contrived webpage that I wrote myself,
perhaps growing as n^2 for the number of html tags,
but in practice I don't think it will be an issue.


* Re: [Edbrowse-dev] Setters and Post Scan
  2014-03-11 15:26 [Edbrowse-dev] Setters and Post Scan Karl Dahlke
@ 2014-03-11 20:33 ` Adam Thompson
  0 siblings, 0 replies; 6+ messages in thread
From: Adam Thompson @ 2014-03-11 20:33 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev


On Tue, Mar 11, 2014 at 11:26:57AM -0400, Karl Dahlke wrote:
> > I think so. I don't think it would really be feasible to handle a fully
> > dynamic page in setters. How expensive is this going to be in terms of
> > runtime?
> 
> In theory it could be bad, in a contrived webpage that I wrote myself,
> perhaps growing as n^2 for the number of html tags,
> but in practice I don't think it will be an issue.

Out of interest, how do you arrive at this figure?

Cheers,
Adam.



* Re: [Edbrowse-dev] Setters and Post Scan
  2014-03-11 20:59 Karl Dahlke
@ 2014-03-12  9:45 ` Adam Thompson
  0 siblings, 0 replies; 6+ messages in thread
From: Adam Thompson @ 2014-03-12  9:45 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev


On Tue, Mar 11, 2014 at 04:59:10PM -0400, Karl Dahlke wrote:
> > > perhaps growing as n^2 for the number of html tags,
> 
> > Out of interest, how do you arrive at this figure?
> 
> Mostly a gut feeling.
> As background, consider my garbage collector for lines in a buffer.
> This is just edits, no browsing.
> Make a change, then make another change, so that there are now some lines
> that you won't need even if you undo.
> How do I determine and free those lines?
> By comparing two maps of pointers.

I did wonder how this worked.

> The pointers in A, not in B, are not needed and can be freed.
> The first version was simple and stupid, looking for things in A not in B,
> and ran in n^2 time.
Ah.
> This didn't matter for a few years, until I started editing some really big files.
> You also have started editing some large files.
Yeah.
> If a file has a million lines, that's a trillion operations per edit.
> I enter a substitute command and wait 2 minutes for a response.
> I don't think so!
> So a few years ago I rewrote it using quicksort and an internal form of comm,
> and it is plenty quick, even for large files.
Yeah, I'd never really noticed this going on till I turned up the debugging 
level.
> Look for qsort() in buffers.c.

Sounds interesting.

> 
> Now fast forward to today; I am scanning a tree of js objects
> and comparing the tags that are implied by those objects with the tags we already have,
> and doing, essentially, a diff, so I can report to the user what has changed,
> which a sighted person would simply see as a change to something on the screen.
> It is very likely that there are quick ways to run this comparison,
> but if there absolutely aren't,
> and if a web page rewrote itself in a nasty way, say by changing every other line,
> then maybe the comparison would be as awful as n^2.
Perhaps, but I hope not.
> But as I say I expect there will be better algorithms
> once we really dive into it,
> and I'm quite sure that web pages only change small portions of themselves,
> the text in various boxes, or menus, etc,
> and don't completely rewrite themselves,
> so that's why I don't think it will be a performance issue,
> even if we implement it in a rather direct and plodding way.

Yeah, I doubt it'll be worse than the setter hell that full dynamic page
support would cause at any rate. As I've previously said though,
I think this is more of a reason to come up with a js-independent DOM which we
can get working efficiently, then put the js on top.
This allows much more flexibility, and hopefully provides opportunities to
avoid running the entire post scan each time js makes a change.

I'm thinking of the tag tree we previously discussed,
with a flag set at the point where a change is made.
Any post scan can then start from this place rather than the top of the tree.
The only issue with this is if someone set the inner html of something to some
broken html which affected tags outside where the inner html was set.
I'm not sure what to do in this case.
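
To make the flag idea above a bit more concrete, here is a rough sketch in
JavaScript. None of this is edbrowse code: makeTagTree, addNode, markChanged and
scanChanged are names invented for the illustration, and the tree is a bare
parent/children structure. Changed nodes are remembered in a set, and the scan
starts only at the topmost changed nodes. It does nothing about the broken
inner html case.

```javascript
// Illustrative only: all names here are hypothetical.
function makeTagTree() {
  const changed = new Set();   // nodes flagged since the last scan

  function addNode(name, parent = null) {
    const node = { name, parent, children: [] };
    if (parent) parent.children.push(node);
    return node;
  }

  // Called at the point where a change is made.
  function markChanged(node) {
    changed.add(node);
  }

  function hasChangedAncestor(node) {
    for (let p = node.parent; p; p = p.parent) {
      if (changed.has(p)) return true;
    }
    return false;
  }

  // Rescan only the topmost flagged subtrees, then clear the flags.
  // Returns the names of the nodes where a scan was started.
  function scanChanged(rescan) {
    const starts = [...changed].filter((n) => !hasChangedAncestor(n));
    changed.clear();
    for (const n of starts) rescan(n);
    return starts.map((n) => n.name);
  }

  return { addNode, markChanged, scanChanged };
}
```

If a select and the form that contains it are both flagged, only the form is
scanned, since scanning the form's subtree covers the select anyway.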

Cheers,
Adam.



* [Edbrowse-dev]   Setters and Post Scan
@ 2014-03-11 20:59 Karl Dahlke
  2014-03-12  9:45 ` Adam Thompson
  0 siblings, 1 reply; 6+ messages in thread
From: Karl Dahlke @ 2014-03-11 20:59 UTC (permalink / raw)
  To: Edbrowse-dev

> > perhaps growing as n^2 for the number of html tags,

> Out of interest, how do you arrive at this figure?

Mostly a gut feeling.
As background, consider my garbage collector for lines in a buffer.
This is just edits, no browsing.
Make a change, then make another change, so that there are now some lines
that you won't need even if you undo.
How do I determine and free those lines?
By comparing two maps of pointers.
The pointers in A, not in B, are not needed and can be freed.
The first version was simple and stupid, looking for things in A not in B,
and ran in n^2 time.
This didn't matter for a few years, until I started editing some really big files.
You also have started editing some large files.
If a file has a million lines, that's a trillion operations per edit.
I enter a substitute command and wait 2 minutes for a response.
I don't think so!
So a few years ago I rewrote it using quicksort and an internal form of comm,
and it is plenty quick, even for large files.
Look for qsort() in buffers.c.
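
For anyone who wants the shape of the sort-and-merge idea without opening
buffers.c, here is a minimal sketch. It is written in JavaScript rather than C,
and it is not the code in buffers.c: the name notInB is invented, and plain
numbers stand in for line pointers. Both lists are sorted and then walked in
step, the way comm walks two sorted files, so the two sorts dominate the cost
instead of an n^2 nested loop.

```javascript
// Illustrative only: numbers stand in for line pointers.
// Returns the entries of a that do not occur in b.
function notInB(a, b) {
  const byValue = (x, y) => x - y;
  const sa = [...a].sort(byValue);   // sort copies, leave the inputs alone
  const sb = [...b].sort(byValue);
  const onlyInA = [];
  let i = 0;
  let j = 0;
  while (i < sa.length) {
    if (j >= sb.length || sa[i] < sb[j]) {
      onlyInA.push(sa[i]);   // in A but not in B: no longer needed
      i++;
    } else if (sa[i] > sb[j]) {
      j++;                   // in B only: skip it
    } else {
      i++;                   // in both: still needed, keep it
    }
  }
  return onlyInA;
}

console.log(notInB([5, 3, 9, 1], [3, 1, 7]));   // [ 5, 9 ]
```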

Now fast forward to today; I am scanning a tree of js objects
and comparing the tags that are implied by those objects with the tags we already have,
and doing, essentially, a diff, so I can report to the user what has changed,
which a sighted person would simply see as a change to something on the screen.
It is very likely that there are quick ways to run this comparison,
but if there absolutely aren't,
and if a web page rewrote itself in a nasty way, say by changing every other line,
then maybe the comparison would be as awful as n^2.
But as I say I expect there will be better algorithms
once we really dive into it,
and I'm quite sure that web pages only change small portions of themselves,
the text in various boxes, or menus, etc,
and don't completely rewrite themselves,
so that's why I don't think it will be a performance issue,
even if we implement it in a rather direct and plodding way.

Karl Dahlke


* Re: [Edbrowse-dev] Setters and Post Scan
  2014-03-10 12:27 Karl Dahlke
@ 2014-03-11 13:47 ` Chris Brannon
  0 siblings, 0 replies; 6+ messages in thread
From: Chris Brannon @ 2014-03-11 13:47 UTC (permalink / raw)
  To: Edbrowse-dev

Karl Dahlke <eklhad@comcast.net> writes:

> The fact that js could completely rearrange the page,
> makes it likely that we will move away from setters and towards
> a post scan for these things.

I think so.  I don't think it would really be feasible to handle a fully
dynamic page in setters.  How expensive is this going to be in terms of
runtime?

-- Chris


* [Edbrowse-dev] Setters and Post Scan
@ 2014-03-10 12:27 Karl Dahlke
  2014-03-11 13:47 ` Chris Brannon
  0 siblings, 1 reply; 6+ messages in thread
From: Karl Dahlke @ 2014-03-10 12:27 UTC (permalink / raw)
  To: Edbrowse-dev

A setter is a function that is called when a variable is set,
to perform some sort of side effect.
Setters will always be needed.
Example is the URL class.
If an object o is a URL,
and you set o = "http://foobar.com:1234/whatever.html"
that has the side effect of setting

o.protocol = "http"
o.port = 1234
o.host = "foobar.com"
o.data = "whatever.html"

And setting document.cookie to a string puts a cookie in your cookie jar.
Did you know that?
http can set cookies for you in the header, in the usual way,
but js can also look at and set your cookies.
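
As a rough illustration of what a setter with side effects looks like from the
js side, here is a small sketch. It is not how edbrowse implements the URL
class. Plain JavaScript cannot intercept assignment to a variable, so the
sketch hangs the setter on a property that I have called href; makeUrl and the
regular expression are likewise assumptions made for the example.

```javascript
// Illustrative only: makeUrl and href are hypothetical names, not edbrowse code.
function makeUrl() {
  const o = { protocol: "", host: "", port: 0, data: "" };
  let raw = "";
  Object.defineProperty(o, "href", {
    get() {
      return raw;
    },
    set(s) {
      raw = String(s);
      // Expected shape: protocol://host[:port]/data
      const m = /^([a-z][a-z0-9+.-]*):\/\/([^\/:]+)(?::(\d+))?\/?(.*)$/i.exec(raw);
      if (!m) return;                      // unparsable: leave the pieces as they were
      o.protocol = m[1];
      o.host = m[2];
      o.port = m[3] ? Number(m[3]) : 0;    // 0 means no port was given
      o.data = m[4];
    },
  });
  return o;
}

const o = makeUrl();
o.href = "http://foobar.com:1234/whatever.html";
// o.protocol is "http", o.host is "foobar.com",
// o.port is 1234, o.data is "whatever.html"
```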

All this is implemented now, because lots of websites use it.
Like any spare time project, I've been driven somewhat by expediency,
what do I need to do to access my favorite sites?
It's not a great way to program but there it is.

Another setter that isn't implemented, and maybe should be,
is the selectedIndex in a pick-one select list.
This is a dropdown menu where you pick one item.
In js there is an array called options[],
an array of Option objects.
options[0] is the first option, options[1] is the second, and so on.
So when you pick an option in edbrowse I do two things:
field.selectedIndex = 3
field.options[3].checked = true
But if for some reason javascript did one, I should do the other as a side effect.
If it set field.options[4].checked = true then I should set
field.options[3].checked = false
field.selectedIndex = 4
Or if instead js set field.selectedIndex = 4 then I have to update
the checked fields.
I guess this is uncommon, because I haven't run into it,
and thus haven't implemented it.
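
Here is one way the pair of setters could be sketched, purely as an
illustration. The helper makeSelect is invented, and the design choice of one
hidden variable that both selectedIndex and checked read from is my
assumption, not something edbrowse does today. Because both views are derived
from the same variable, setting either one updates the other.

```javascript
// Illustrative only: makeSelect is a hypothetical helper, not edbrowse code.
function makeSelect(count) {
  let selected = -1;               // the one place the choice is stored
  const field = { options: [] };

  for (let i = 0; i < count; i++) {
    const opt = {};
    Object.defineProperty(opt, "checked", {
      get() {
        return selected === i;
      },
      set(v) {
        if (v) selected = i;                     // checking one unchecks the others
        else if (selected === i) selected = -1;  // unchecking the current choice
      },
    });
    field.options.push(opt);
  }

  Object.defineProperty(field, "selectedIndex", {
    get() {
      return selected;
    },
    set(n) {
      selected = n >= 0 && n < count ? n : -1;   // out of range means nothing selected
    },
  });
  return field;
}

const field = makeSelect(6);
field.selectedIndex = 3;
field.options[4].checked = true;
// field.selectedIndex is now 4 and field.options[3].checked is false
```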

The interesting thing here is this cannot be done by a post scan.
I can't scan the tree of objects after the fact and figure it out.
I see that selectedIndex and options.checked are inconsistent,
but I don't know which one is right.
Thus setters will always be needed.

But should they be used to update the edbrowse buffer?
That's an interesting question.
I do this only once so far; look for
javaSetsTagVar in jsloc.cpp.
This is for standard text fields.
Perhaps you entered $35,
and js cleans it up to $35.00 to include the cents.
The modified string has to be reflected back in your text buffer.
This is done now with the value setter, and it works.
In the last line of jsrt, the calc button, which calculates the size
of your text buffer, sets a field value in javascript,
and that is pushed back to the edbrowse buffer.
Add some text to buffer 2, switch back to buffer 1,
and push <calc> again.
But should we be doing most of this work with setters, or with a post scan?

The fact that js could completely rearrange the page, adding new paragraphs,
new links, new tables, etc.,
makes it likely that we will move away from setters and towards
a post scan for these things.
Step through the input fields, and aha, price.value is now "$35.00"
It used to be "$35", so change it in the buffer and notify the user.

Setters could in theory update line 27 five times, giving you five messages
"line 27 has been updated",
whereas a post scan would see that line 27 was different and print that message once.
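
A minimal sketch of that difference, under assumptions of my own: each input
field is a plain object with a name, a buffer line number and a value, and the
values as of the last scan are kept in a Map. The function postScan and these
shapes are invented for the example. However many times js touches a field
between scans, the scan compares only the before and after values.

```javascript
// Illustrative only: the field objects and postScan are hypothetical.
function postScan(fields, snapshot) {
  const messages = [];
  for (const f of fields) {
    if (snapshot.get(f.name) !== f.value) {
      messages.push(`line ${f.line} has been updated`);
      snapshot.set(f.name, f.value);   // remember the new value for the next scan
    }
  }
  return messages;
}

const price = { name: "price", line: 27, value: "$35" };
const snapshot = new Map([["price", "$35"]]);

// js rewrites the field more than once before the scan runs
price.value = "$35.0";
price.value = "$35.00";

console.log(postScan([price], snapshot));   // [ 'line 27 has been updated' ]
console.log(postScan([price], snapshot));   // []
```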

All this is coming to the fore as I think about rebuilding an option list
that was changed by javascript.
You live in Michigan; here are the doctors you can pick from in your state.
It happens all the time.
Doing this by setters would be a nightmare.
Very intrusive code, which we are all trying to avoid.
And it would almost certainly have bugs, or be hard to maintain.
The post scan however steps through the option tags that I have,
and the js options in field.options[], and compares them.
If they are different I rebuild the tags and notify the user
that the menu has changed.
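
Roughly, and only as a sketch of the comparison described above: assume each
option tag and each js option carries a text and a value, and that the field
holds both lists. The names optionsChanged, rescanMenu and field.tags are
invented for the example and do not correspond to existing edbrowse code.

```javascript
// Illustrative only: the tag and option shapes are assumptions.
function optionsChanged(tags, jsOptions) {
  if (tags.length !== jsOptions.length) return true;
  return tags.some(
    (t, i) => t.text !== jsOptions[i].text || t.value !== jsOptions[i].value
  );
}

// Rebuild the tags from field.options[] if they differ.
// Returns the message for the user, or null if nothing changed.
function rescanMenu(field) {
  if (!optionsChanged(field.tags, field.options)) return null;
  field.tags = field.options.map((o) => ({ text: o.text, value: o.value }));
  return `menu ${field.name} has changed`;
}

const doctor = {
  name: "doctor",
  tags: [{ text: "Dr. A", value: "a" }],
  options: [{ text: "Dr. A", value: "a" }],
};
// js replaces the list after the state is chosen
doctor.options = [
  { text: "Dr. B", value: "b" },
  { text: "Dr. C", value: "c" },
];
console.log(rescanMenu(doctor));   // menu doctor has changed
console.log(rescanMenu(doctor));   // null
```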

I'm not at the stage of writing any code yet, still designing,
still thinking out loud.
And if I post these thoughts then you can comment and we can all
move forward together.

Yes, Jean from Debian got back to us and I think he's going to check out our
latest and try to build it and see if he can determine the problem on his end.
It's a puzzle.

Karl Dahlke

