On Fri, Nov 06, 2015 at 02:51:14PM -0500, Karl Dahlke wrote:
> So I was thinking, if I were to start all over again,
> what might I do differently?
> All those native methods and their side effects, very awkward,
> and they don't do the right thing all the time,
> and they are engine specific since they are in C,
> so they have to be modified if we upgrade or switch js engines,
> and you still have to check a lot of js variables anyway because
> js doesn't always call those functions to do its thing.
> So ...
>
> What if there was a lot more js at the start, more prototypes,
> more functions, more js setters, and some specific edbrowse variables under
> window.eb$, variables that we can query after js has run.
> There would be almost no native methods,
> only one that I can think of, and I'll get to that below.

Ok, I'm going to start my response with a big *NO* (but please read on for
my reasoning).

> The real downside here is that there are two ways to render text.
> First, the one we already have, which is based on the html tree.
> We have to keep that one because the user might run without js,
> for any number of reasons.
> The second, now, is to render the possibly modified js tree.
> Start at document.body and traverse childNodes depth first.
> This could build the text buffer directly, or it could
> build a tree of nodes from which we call the first render routine above.
> It might be nice to leverage the preexisting render routine.
> Such an update would happen after every javascript call: push a button,
> modify a field with onchange code, make a different selection
> with onselect code, submit a form with onsubmit code, you get the idea.
> It looks a little painful performance-wise, but in reality
> I don't think it makes any difference.
> You wouldn't run the update all that often, web pages
> aren't that huge, and computers are pretty fast.
> I don't think performance matters, but the two different render routines,
> that's the downside.
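For what it's worth, the depth-first traversal Karl describes is simple enough to sketch. Everything here is invented for illustration — the node shape, the `renderText` name, treating paragraphs as newline-emitters — and bears no resemblance to edbrowse's real tree or render routine:

```javascript
// Hypothetical sketch of the second render pass: walk the (possibly
// js-modified) tree depth first from document.body and emit a text
// buffer. The node layout here is made up for the example.
function renderText(node) {
  if (node.nodeName === "#text") return node.data;
  let out = "";
  for (const child of node.childNodes || []) out += renderText(child);
  if (node.nodeName === "P") out += "\n"; // toy layout rule
  return out;
}

// Toy tree standing in for document.body after js has run.
const body = {
  nodeName: "BODY",
  childNodes: [
    { nodeName: "P", childNodes: [{ nodeName: "#text", data: "hello" }] },
    { nodeName: "P", childNodes: [{ nodeName: "#text", data: "world" }] },
  ],
};
console.log(renderText(body)); // "hello\nworld\n"
```

The point being: the traversal itself is trivial; the cost is in maintaining a second, full rendering path alongside the html-tree one.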
Ok, if you're talking about modern web pages, performance could start to be
a serious issue with this, particularly when we get into async js. At the
moment it's not, and there are *so many* other issues (though a decreasing
number) that performance isn't top of the list. However, there are
performance issues which could become more of a priority as we get an
increasingly complete js implementation.

> The upside is that almost nothing is done through native methods and
> side effects passed back to the html process; everything is gleaned
> after js returns, and nothing is missed.
> Nothing gets lost in translation.
> The whole js tree, whatever it is, is rerendered.

That's true, but I fear we're seeing shiny things and mistaking them for
good design (see below).

> Let's look at some other nonrendering side effects.
> document.cookie could have an inbuilt setter
> to add the new cookie to a list of cookies to be processed
> when js returns,
> window.eb$.cookieList[],
> and the setter would also fold the new cookie into the cookie string
> that is returned by the getter when document.cookie is queried.
> It's pretty easy, certainly easier than the native code we have today at
> jseng-moz.cpp line 955.
> Perhaps not less code, but more maintainable.

That is both insecure and incredibly fragile against a malicious web page.
Someone could insert all sorts of crud in there, or use some sort of
compromise to insert a magic property in place of the array to, for
example (in a world where we have ajax), capture all cookies set by a
different website, or similar (and I've not even tried to think about this
too hard).

> Here's something that seems like it still has to be native:
> document.location.href = "new web page".
> js doesn't keep going; it stops, and edbrowse has to fetch a new web page.
> So whenever edbrowse has to take action right now,
> not later, not delayed,
> not when js is finished but right now,
> that's a candidate for a native method.
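To make my cookie objection concrete, here's roughly what the in-js setter Karl describes would look like. `eb$` and `cookieList` are names from his proposal; the bare objects and the join format are invented for illustration, and this ignores expiry, domain, and path handling entirely:

```javascript
// Sketch of an all-js document.cookie, per the proposal. Note that
// window.eb$.cookieList is an ordinary, script-visible array — which
// is exactly the attack surface I'm worried about.
const window = { eb$: { cookieList: [] } };
const document = {};
let cookieJar = []; // "name=value" pairs for the getter

Object.defineProperty(document, "cookie", {
  set(str) {
    window.eb$.cookieList.push(str);      // queued for edbrowse after js returns
    const name = str.split("=")[0].trim();
    cookieJar = cookieJar.filter(c => c.split("=")[0].trim() !== name);
    cookieJar.push(str.split(";")[0]);    // fold into the readable string
  },
  get() { return cookieJar.join("; "); },
});

document.cookie = "id=42; path=/";
document.cookie = "theme=dark";
console.log(document.cookie); // "id=42; theme=dark"
```

It works, but any page script can read, rewrite, or shadow `window.eb$.cookieList` before edbrowse ever looks at it; a native setter keeps that list out of the page's reach.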
> But here again, it doesn't have to be native, because js is supposed to
> stop.
> So set document.location.href like you normally would,
> set a jump flag in window.eb$.jumpNewLocation, and then throw an
> exception so that js stops.
> Same model: edbrowse checks everything after js returns.
> It sees the jump flag and goes to a new web page.

Unless someone uses an iframe to set this (again, via a compromised site).
This is a window-level property, so an iframe etc. could do all sorts of
damage here.

> Here's another one: document.forms[0].submit().
> Run the onsubmit code first, and if that's ok then set a jump flag
> in window.eb$.jumpFormSubmit0, and throw an exception so js stops.
> Still nothing is native.

Or use a compromise to bypass form validation, or capture the url somehow,
or redirect the user to somewhere else.

> The only thing I've found so far that really must be native is that
> pesky innerHTML.
> It has to parse html and fold objects into the js tree now,
> before the next line of js runs,
> and js does not stop, so we can't just throw an exception.
> I'm not going to translate the entire tidy system into js,
> so that will remain a C routine.
> innerHTML has to be native: run the text through tidy,
> and through our html-tidy.c or some variation thereof,
> to make js nodes and paste them into the tree,
> then it returns and js marches on.
> That's how the native method works today,
> and we could pretty much keep it as is.
> But that's it.
> Is it possible that if I were starting all over again
> with a js centric design there would be only one native method,
> innerHTML?

Perhaps, but you'd have a fragile, easily broken DOM with a bunch of
designed-in security holes. I'm not claiming to be a cyber security
expert, but I do work in that industry, and this design is setting alarm
bells ringing for me.
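For reference, the jump-flag mechanism itself is easy to write down. `eb$` and `jumpNewLocation` are Karl's names; the setter, the exception message, and the standalone `location` object are hypothetical scaffolding for the example:

```javascript
// Sketch of the jump-flag idea: assigning location.href records the
// target in window.eb$ and throws, so the script stops; edbrowse
// would inspect the flag after the engine returns.
const window = { eb$: {} };
const location = {};

Object.defineProperty(location, "href", {
  set(url) {
    window.eb$.jumpNewLocation = url;
    throw new Error("eb$ jump");       // halt the script, like a navigation
  },
  get() { return window.eb$.jumpNewLocation || ""; },
});

let reached = false;
try {
  location.href = "http://example.com/";
  reached = true;                      // never executed; the setter threw
} catch (e) {
  // control returns to the caller (edbrowse) here
}
console.log(window.eb$.jumpNewLocation); // "http://example.com/"
```

Note, though, that any script can also wrap that setter in a try/catch of its own and keep running, or scribble on `window.eb$.jumpNewLocation` directly — which is the iframe/compromise problem I describe above.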
There's a reason a chunk of DOM objects are read-only; part of that is
performance, but I suspect most of it is to prevent web developers
breaking fundamental mechanisms, as is possible with this design.

> I'm not saying this is a better design, although part of me thinks it is,
> since there is less engine specific code, and each time we render the
> text buffer straight from the horse's mouth.

That is, if anything, why we need a *more* native DOM, but one decoupled
from the js engine, i.e. create object stubs which go back to the DOM to
set DOM attributes, so we don't need to make a bunch of js variable checks
to render the DOM.

> But I don't know.
> And I'm certainly not planning any changes of this magnitude any time
> soon.
> We need to march towards 3.6.0 and stability.
> I just wanted to put this idea out there,
> in case I get hit by a bus tomorrow or something.

It's an interesting idea and seems, superficially, like a good one.
However, if I've learned anything about the internet, it's that browsers
are, in general, the primary way of exploiting users' computers, and thus
it's important that the attack surface is kept as small as possible. That
means reducing the amount of internal DOM implementation which web pages
can fiddle with, and it probably means moving more of our code into native
C, or at least implementing tighter security around it.

Without wanting to sound too negative, we're currently very fortunate that
we don't have a more fully featured js implementation, since we simply
don't have the security in place to do this well. I'm not talking about
DOM stuff; that's not that much of a problem, and it is a very good idea
to sort out. The real issues are going to happen when we get AJAX and the
file system accessing functionality present in really modern browsers'
implementations (e.g. new versions of Firefox).

Cheers,

Adam.