[Edbrowse-dev] JS1

edbrowse-dev - development list for edbrowse

* [Edbrowse-dev] JS1
@ 2017-07-23  2:49 Karl Dahlke
  2017-07-24 17:57 ` Adam Thompson
  0 siblings, 1 reply; 10+ messages in thread
From: Karl Dahlke @ 2017-07-23  2:49 UTC (permalink / raw)
  To: Edbrowse-dev

This is either for fun, or for testing, or for the future. I'm not sure.

export JS1=on

Set this to any nonempty value, and edbrowse remains one process. It doesn't fork off a copy of itself to manage the js.
The default is as it was, 2 processes and pipes.
The first implementation of this feature was easy peasy, almost too easy.
Leave all the pipes up and don't do the fork.
Send messages through the pipes and back around to the same process.
After edbrowse has sent a message to js, and is waiting for a response, it does this.

		whichproc = 'j';
		processMessage1();
		whichproc = 'e';

processMessage1 is in jseng-duk.c and it reads and processes the message and writes the response back to the pipe, back to edbrowse.
So it looks like the other process ran.
whichproc j or e makes the single process believe it is running as the js process then as edbrowse again.
If you just joined us and looked at the software, and thought we routinely ran as one process, you'd think it was the stupidest design ever.
Sending messages out through pipes and back to myself instead of just calling the functions that are right there.
But it's evolution, isn't it?
It reminds me of the gene that manufactures vitamin C, the gene that works in most animals, but not us.
It's broken in us, it's still there but broken, so we have to eat fruit or die of scurvy.
But I digress.
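
The loopback described above can be sketched roughly as follows. This is a minimal illustration, not the code in jseng-duk.c: only `whichproc` and `processMessage1` are names from this message, while the two pipes, the "ok:" reply format, and the `setupPipes` and `roundTrip` helpers are assumptions made up for the example. It only works for short messages.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Only whichproc and processMessage1 are names from the message above;
 * the pipes, the reply format, and roundTrip are hypothetical. */
static char whichproc = 'e';
static int to_js[2];     /* edbrowse writes to_js[1], js reads to_js[0] */
static int from_js[2];   /* js writes from_js[1], edbrowse reads from_js[0] */

static int setupPipes(void)
{
    return (pipe(to_js) == 0 && pipe(from_js) == 0) ? 0 : -1;
}

/* js role: read one short request and answer it on the return pipe. */
static void processMessage1(void)
{
    char req[64], reply[80] = "ok:";
    ssize_t n = read(to_js[0], req, sizeof(req) - 1);
    if (n <= 0)
        return;
    req[n] = '\0';
    strcat(reply, req);
    if (write(from_js[1], reply, strlen(reply)) < 0)
        return;
}

/* edbrowse role: send, let this same process play js, then read the reply.
 * Returns the reply length, or -1 on error. */
static int roundTrip(const char *msg, char *out, size_t outsize)
{
    size_t len = strlen(msg);
    if (len == 0 || len > 63 || outsize < 2)
        return -1;
    if (write(to_js[1], msg, len) < 0)
        return -1;
    whichproc = 'j';
    processMessage1();
    whichproc = 'e';
    ssize_t n = read(from_js[0], out, outsize - 1);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return (int)n;
}
```

After `setupPipes()`, a call such as `roundTrip("hello", buf, sizeof buf)` leaves the reply in `buf` without any second process existing. The 63-byte limit is there because a message has to fit in the pipe before anyone reads it.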

This worked for jsrt and all my local tests, but not in the real world.
Websites can have large strings, and these can be part of the messages.
When you push a long message into a pipe, the write blocks once the pipe's buffer is full, until somebody reads part of it from the other side.
That's how pipes have worked since the 1970s.
With nothing asynchronous reading from the other side, everything stopped.
Oops.
So I bypassed the pipes and put the message in memory.
Then the other simulated process reads it from memory, writes the reply in memory, and so on, and that works.
It's still a modest change, and only comes into play if JS1=on.
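
To make the pipe problem concrete, here is a hedged sketch with two parts. `pipeCapacity` measures how many bytes a pipe accepts when nobody is reading (it sets the write end non-blocking so the call returns instead of hanging), and the `mailbox` functions show the general idea of handing a message over in memory instead. All names here are hypothetical; this is not how edbrowse actually stores the message.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Bytes a fresh pipe accepts with no reader draining it, or -1 on error. */
static long pipeCapacity(void)
{
    int fd[2];
    char chunk[512];
    long total = 0;
    if (pipe(fd) != 0)
        return -1;
    int flags = fcntl(fd[1], F_GETFL);
    if (flags < 0 || fcntl(fd[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    memset(chunk, 'x', sizeof(chunk));
    for (;;) {
        ssize_t n = write(fd[1], chunk, sizeof(chunk));
        if (n < 0)
            break;          /* typically EAGAIN: the pipe is full */
        total += n;
    }
    close(fd[0]);
    close(fd[1]);
    return total;
}

/* Hypothetical in-memory replacement for one direction of the pipe. */
struct mailbox {
    char *data;
    size_t len;
};

/* Store a copy of the message; any size that malloc can satisfy is fine. */
static int mailboxPut(struct mailbox *m, const char *msg, size_t len)
{
    char *copy = malloc(len ? len : 1);
    if (!copy)
        return -1;
    memcpy(copy, msg, len);
    free(m->data);
    m->data = copy;
    m->len = len;
    return 0;
}

/* Hand the stored message to the reader, who must free() it. */
static char *mailboxTake(struct mailbox *m, size_t *len)
{
    char *p = m->data;
    *len = m->len;
    m->data = NULL;
    m->len = 0;
    return p;
}
```

The capacity is system dependent, so the sketch reports it rather than assuming a number. A message larger than that would stall a blocking write in the single-process loopback, while the mailbox has no such limit.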

For nearly 2 years Adam has told me we should stay with 2 processes, and so far he's been right all along.
When js crashes, and it still crashes a lot, at least you keep your websites and the book you were reading and the email you were writing and so on.
And of course the 2 process messaging API is great encapsulation, and meant I could switch from mozilla to duktape in 3 weeks instead of 6 months.
All good, but the separate processes and spaces cause trouble.
Keeping a consistent set of cookies in the two curl spaces is not easy, and I'm still not sure we're doing it right.
You can test it with JS1=on, one curl space, and see if the website renders properly.
Or consider the code Dominique is working on right now, the domains and passwords for 401 authentication.
He wouldn't need to do any of that if it was one process.
But it's two, so he has to write code to pass all that stuff in messages through pipes and down to the js process.
Well, before any of this is written, he can test it in one process to see if the authentication works for both edbrowse http fetch and js http fetch.
It's a way to compare the two worlds.

Will one process someday be the norm?
Wouldn't change the user interface, so it's entirely up to us.
Maybe someday, and boy could we simplify the code!
But not until js is rock solid and just plain doesn't crash. We're not there yet.
For example, we call tidy to parse html, and we don't feel like we need a separate process and pipes to do that, because tidy doesn't crash.

Karl Dahlke

* Re: [Edbrowse-dev] JS1
  2017-07-23  2:49 [Edbrowse-dev] JS1 Karl Dahlke
@ 2017-07-24 17:57 ` Adam Thompson
  2017-07-28 17:39   ` Karl Dahlke
  0 siblings, 1 reply; 10+ messages in thread
From: Adam Thompson @ 2017-07-24 17:57 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Sat, Jul 22, 2017 at 10:49:40PM -0400, Karl Dahlke wrote:
> This is either for fun, or for testing, or for the future. I'm not sure.
> 
> export JS1=on

Ok, so how're we going to do all the async js stuff we'll have to start doing
properly soon?  Tbh I'm not sure how much I'm attached to the multiple process
idea other than it means js can go pop and we don't... but we *need* to sort
out async js soon.  Unfortunately websites are becoming increasingly reliant on
ajax being genuinely async and that's before we get to websockets etc.
With 2 processes it means that we can start with a simple interface with pipes
and select as now.  With 1 we'll need to do thread-safe programming to make this
fly properly.  It's not impossible but we'll have to make a bunch of changes to
make the existing code properly thread-safe to make that work, or have some way
to run duktape in a mode where it only does a certain amount of instructions and
use select/poll/some event loop to allow io, comms and js to work together...
Honestly I'm not sure what the right design is yet. I've been, unfortunately,
rather busy and unable to do much on edbrowse recently so I'm still catching up
with things.
That being said, this variable, plus the fact that we've made the switch to
duktape (well done for that, sorry I ran out of time to help) gives us a
mechanism to maybe play with some sort of different model for running js.

Cheers,
Adam.

* [Edbrowse-dev] JS1
  2017-07-24 17:57 ` Adam Thompson
@ 2017-07-28 17:39   ` Karl Dahlke
  2017-07-28 19:51     ` Adam Thompson
  0 siblings, 1 reply; 10+ messages in thread
From: Karl Dahlke @ 2017-07-28 17:39 UTC (permalink / raw)
  To: Edbrowse-dev

> Ok, so how're we going to do all the async js stuff we'll have to start doing
> properly soon? Tbh I'm not sure how much I'm attached to the multiple process
> idea other than it means js can go pop and we don't...

Right. They are orthogonal issues.
We're not going to spin off a separate process for every asynchronous thing that the browser might do,
it will either be threads or some sort of time share polling,
so the separate process was necessary due to the fragility of js,
but I think we're fixing that step by step, and when the js side is as solid as the edbrowse side I'd like to fold them back together into one process,
which the JS1 variable is a start.
I may add some more code over the next week, still under JS1, to simplify the code when it is one process.
So many times we can just call a function instead of passing messages around, and we wouldn't even need all those updates when curl data changes etc,
so something to work towards.

As for the larger question, I don't think I know enough yet to answer.
Honestly it makes me nervous to think of making all of edbrowse + duktape threadsafe.
JavaScript timers are implemented now, and well tested, e.g. http://www.eklhad.net/async
They don't spin on cpu cycles, they actually use timers, but they also signal the main input loop so that everything is serialized.
We process what you typed in, or we run the js code associated with a timer, and we're back.
There aren't any threads, and it works pretty well.
I'm not saying that will work for everything we need to do, I just don't know enough yet,
but if it would work for other situations it would be easier.
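
One common way to get this kind of serialization without threads is to let a single select() call wait for keyboard input with a timeout equal to the time left on the earliest timer. The sketch below assumes that approach; the thread does not say how edbrowse actually signals the input loop, and `struct jsTimer`, `nextTimeoutMs`, and `waitInputOrTimer` are hypothetical names.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical timer record; deadlines are milliseconds on some clock. */
struct jsTimer {
    long deadline_ms;
    int active;
};

/* Milliseconds until the earliest active timer is due, 0 if one is already
 * due, or -1 if there are no active timers (wait for input indefinitely). */
static long nextTimeoutMs(const struct jsTimer *t, int n, long now_ms)
{
    long best = -1;
    for (int i = 0; i < n; ++i) {
        if (!t[i].active)
            continue;
        long left = t[i].deadline_ms - now_ms;
        if (left < 0)
            left = 0;
        if (best < 0 || left < best)
            best = left;
    }
    return best;
}

/* Block until fd is readable or the timeout expires.
 * Returns 1 for input, 0 for "a timer is due", -1 on error. */
static int waitInputOrTimer(int fd, long timeout_ms)
{
    fd_set rd;
    struct timeval tv, *tvp = NULL;
    FD_ZERO(&rd);
    FD_SET(fd, &rd);
    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }
    int rc = select(fd + 1, &rd, NULL, NULL, tvp);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}
```

The main loop would call `waitInputOrTimer(0, nextTimeoutMs(...))` and then either process the typed line or run the due timer's js, one thing at a time.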

Cheers,

Karl Dahlke

* Re: [Edbrowse-dev] JS1
  2017-07-28 17:39   ` Karl Dahlke
@ 2017-07-28 19:51     ` Adam Thompson
  2017-07-28 22:05       ` Karl Dahlke
  0 siblings, 1 reply; 10+ messages in thread
From: Adam Thompson @ 2017-07-28 19:51 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Fri, Jul 28, 2017 at 01:39:10PM -0400, Karl Dahlke wrote:
> > Ok, so how're we going to do all the async js stuff we'll have to start doing
> > properly soon? Tbh I'm not sure how much I'm attached to the multiple process
> > idea other than it means js can go pop and we don't...
> 
> Right. They are orthogonal issues.
> We're not going to spin off a separate process for every asynchronous thing that the browser might do,
> it will either be threads or some sort of time share polling,
> so the separate process was necessary due to the fragility of js,
> but I think we're fixing that step by step, and when the js side is as solid as the edbrowse side I'd like to fold them back together into one process,
> which the JS1 variable is a start.
> I may add some more code over the next week, still under JS1, to simplify the code when it is one process.
> So many times we can just call a function instead of passing messages around, and we wouldn't even need all those updates when curl data changes etc,
> so something to work towards.

They're kind of orthogonal and kind of not.  With the separate process we can
keep spinning round the main input loop even if js is stuck in some expensive
operation... I can switch buffers and get on with browsing whilst the page in
buffer 1 does its thing.  We can do that by polling a pipe or switching to some
non-blocking IO.  With js in the same process we've got to either spin it up in
a separate thread (in which case anything which interacts with the main edbrowse
ui has to be thread-safe) or the browser grinds to a halt when someone puts an
ill-advised loop into their js.  If we can find a way to make the code
thread-safe or get duktape to do cooperative multithreading (i.e. stop after a
certain amount of instructions to allow us to spin round the main loop
again) I really don't mind, but it's just something to think about.  It'd be
nice to simplify (if not altogether remove) the code to sync the edbrowse and js
worlds however.

If I can get some time in front of a computer this weekend I'll have a look at
how much work this'll be to make happen.

> As for the larger question, I don't think I know enough yet to answer.
> Honestly it makes me nervous to think of making all of edbrowse + duktape threadsafe.

Same here... lots of ways things could go badly wrong!

> JavaScript timers are implemented now, and well tested, e.g. http://www.eklhad.net/async
> They don't spin on cpu cycles, they actually use timers, but they also signal the main input loop so that everything is serialized.
> We process what you typed in, or we run the js code associated with a timer, and we're back.
> There aren't any threads, and it works pretty well.

Yeah, that's a good example of how this stuff can work.

> I'm not saying that will work for everything we need to do, I just don't know enough yet,
> but if it would work for other situations it would be easier.

Agreed, I'm particularly worried about websites' increasing reliance on AJAX
being genuinely async and then there's websockets (which'll have to happen
sometime... unfortunately).  I know we can multiplex sockets and that'll work
well but it'll be complex.

Cheers,
Adam.

* [Edbrowse-dev] JS1
  2017-07-28 19:51     ` Adam Thompson
@ 2017-07-28 22:05       ` Karl Dahlke
  2017-08-11  7:46         ` Adam Thompson
  0 siblings, 1 reply; 10+ messages in thread
From: Karl Dahlke @ 2017-07-28 22:05 UTC (permalink / raw)
  To: Edbrowse-dev

> They're kind of orthogonal and kind of not. With the separate process we can
> keep spinning round the main input loop even if js is stuck in some expensive
> operation.

In theory we could, but today we don't.
I send a message over pipe to js process and block read waiting for a response.
Things to consider though. First, websites aren't going to write js code that computes a million digits of pi, at least not on purpose.
I don't believe there are that many js expensive operations out in the wild.
That's why I didn't mind waiting for the response.
Second, I'm calling up the web page and I want to read the page, that is next on my list of things to do,
I really don't want to go off to another buffer and do something else.
If it's really that slow I'm going to be annoyed, whether I have the freedom to switch buffers while I wait or not.
It just shouldn't be that slow.
A measure of the quality of a mainstream browser is often how quickly it can bring up complex web pages.
Third, so far the times it has been really slow have been our fault: our DOM was not correct or complete and that caused js to go into a long loop, sometimes an infinite loop,
and that wouldn't happen on another browser or on edbrowse if we had all our objects correct.
I do understand what you're saying though and we need to keep it in mind.
If the day comes when we say, "wow we just can't give up all control to js and wait for it to finish", then yes there are advantages to having it across the street.
At this moment I don't think waiting for a js routine to return is a show stopper, thus the simple design we have now. We'll see.
Obviously I'm not doing anything yet that would throw away the 2 process design.

Speaking of serialized polling and timers,
one thing I had to do was scale back the intervals, so they can't fire 20 times a second,
as they do on some sites for visual effects.
Timers were always running and edbrowse would never listen to what I was typing at the keyboard.
A blind person can't keep up with that kind of rapid fire screen refresh anyways.
So I don't let intervals run more than a couple times per second and that solves the problem,
but it illustrates some of the things we have to think about when coordinating all these "asynchronous" events.
See html.c line 2210.
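
As a rough illustration of that throttle (not the code at the html.c line cited, which is not reproduced here), a clamp could look like the sketch below. The 500 ms floor is an assumption standing in for "a couple times per second", and `effectiveDelayMs` is a hypothetical name. The sketch leaves one-shot timers alone and only slows repeating intervals.

```c
#include <assert.h>

/* Assumed floor; the message only says "a couple times per second". */
#define MIN_INTERVAL_MS 500

/* Repeating intervals are raised to the floor; one-shot timers keep
 * whatever delay the page asked for. */
static int effectiveDelayMs(int requested_ms, int is_interval)
{
    if (is_interval && requested_ms < MIN_INTERVAL_MS)
        return MIN_INTERVAL_MS;
    return requested_ms;
}
```

With this, an interval requested at 50 ms (20 firings a second) runs at 500 ms instead, which leaves the input loop time to listen to the keyboard.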

I tell you, if you're running debug 4 or higher, even an interval that ticks every 2 or 3 seconds is annoying, spewing out periodic and regular debug messages,
and getting in the way of what you're really looking for, but not much we can do about that.

Karl Dahlke

* Re: [Edbrowse-dev] JS1
  2017-07-28 22:05       ` Karl Dahlke
@ 2017-08-11  7:46         ` Adam Thompson
  2017-08-11  9:51           ` Karl Dahlke
  0 siblings, 1 reply; 10+ messages in thread
From: Adam Thompson @ 2017-08-11  7:46 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Fri, Jul 28, 2017 at 06:05:19PM -0400, Karl Dahlke wrote:
> > They're kind of orthogonal and kind of not. With the separate process we can
> > keep spinning round the main input loop even if js is stuck in some expensive
> > operation.
> 
> In theory we could, but today we don't.
> I send a message over pipe to js process and block read waiting for a response.
> Things to consider though. First, websites aren't going to write js code that computes a million digits of pi, at least not on purpose.
> I don't believe there are that many js expensive operations out in the wild.
> That's why I didn't mind waiting for the response.
> Second, I'm calling up the web page and I want to read the page, that is next on my list of things to do,
> I really don't want to go off to another buffer and do something else.
> If it's really that slow I'm going to be annoyed, whether I have the freedom to switch buffers while I wait or not.
> It just shouldn't be that slow.
> A measure of the quality of a mainstream browser is often how quickly it can bring up complex web pages.
> Third, so far the times it has been really slow have been our fault: our DOM was not correct or complete and that caused js to go into a long loop, sometimes an infinite loop,
> and that wouldn't happen on another browser or on edbrowse if we had all our objects correct.

Agreed.

> Speaking of serialized polling and timers,
> one thing I had to do was scale back the intervals, so they can't fire 20 times a second,
> as they do on some sites for visual effects.

Since some of these timers aren't for visual effects (polling, checking that
aspects of the DOM are correct etc) I had a quick look at the timers code
yesterday.  I have a version of the code with the 10 ms minimum as per the spec
and I've altered the code to run a single timer then spin round the loop again.
This appears to work in testing since what happens is that we run one of the very
fast timers, call select again, detect that we don't have input and that we have
timers waiting then run the next one.  In the case that we have input we read it
then spin back round and catch the fact that we have timers pending.  I'm not
sure if this exposes any new corners but it's available in:
https://github.com/arthompson/edbrowse.git
I can push the changes into the edbrowse main repo if people are happy.
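
For readers who have not seen the patch, the loop described above might look something like this sketch. It is a guess at the shape, not the code in that repository: `onePass` is a hypothetical name, and a simple counter stands in for the queue of due timers.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* One trip round the loop. If a timer is due we only poll for input (zero
 * timeout); otherwise we wait up to idle_ms. Input is read first; failing
 * that, exactly one due timer runs.
 * Returns 'i' (input consumed), 't' (one timer ran), 'n' (nothing to do),
 * or 'x' (error). */
static char onePass(int fd, int *timers_due, long idle_ms)
{
    fd_set rd;
    struct timeval tv;
    long wait_ms = *timers_due > 0 ? 0 : idle_ms;
    FD_ZERO(&rd);
    FD_SET(fd, &rd);
    tv.tv_sec = wait_ms / 1000;
    tv.tv_usec = (wait_ms % 1000) * 1000;
    int rc = select(fd + 1, &rd, NULL, NULL, &tv);
    if (rc < 0)
        return 'x';
    if (rc > 0) {
        char buf[256];
        if (read(fd, buf, sizeof(buf)) < 0)
            return 'x';
        return 'i';
    }
    if (*timers_due > 0) {
        --*timers_due;      /* stand-in for running that timer's js */
        return 't';
    }
    return 'n';
}
```

Because only one timer runs per pass, a burst of fast timers cannot keep the loop away from select for long, so typed input is noticed between timers.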

Cheers,
Adam.
PS: I'm now running js in 1-process mode so that's how I've tested it.  I've
spun it up in 2-proc mode but I've not extensively tested it.

* [Edbrowse-dev] JS1
  2017-08-11  7:46         ` Adam Thompson
@ 2017-08-11  9:51           ` Karl Dahlke
  2017-08-12  7:20             ` Kevin Carhart
  0 siblings, 1 reply; 10+ messages in thread
From: Karl Dahlke @ 2017-08-11  9:51 UTC (permalink / raw)
  To: Edbrowse-dev

Having not seen the patch, it sounds like you changed the "scheduler" to be more "round robin" between input and timers, and that's fine.
That part is ok but I am still concerned about rapid fire timers.

> Since some of these timers aren't for visual effects (polling, checking

Is there really a site that *needs* timers faster than 500ms for nonvisual purposes?
Kevin you've combed through the acid tests, is this a concern?

1. I suppose I have a general concern about resources, edbrowse will never be optimized the way chrome is, it's already slow, if we have a dozen such sites up with these timers, some of them pushed onto the stack so we don't even see them, and they're all running at this rapid rate, I can picture the load average climbing up above 3.

2. I live and die by db5. I couldn't have gotten this far debugging without db5.
Each timer produces a lot of debugging output. Some of this is overhead just to run the timer.
Did the user change one of the input fields, and do we have to push that change down to the js world before we run the js timer, and when we get back is there stuff we have to do?
Trust me it makes a lot of output.
If that flies every ten milliseconds it's prohibitive on the screen, and even when redirected into a debug file, the file is huge in about a second.
And it might take several seconds to set up the condition I'm trying to debug.
Then I have to slog through a 100 megabyte file to find the problem.
So that's a concern to me.
I imagine you could let timers fly at db2 or less, and cap them at 600ms for db3 or higher (my cap is 600ms across the board today), but that means db2 and db3 run differently and that has its own concerns.

3. Edbrowse doesn't garbage collect at all. html tags keep accumulating.
Look at www.eklhad.net/async. It updates the page once a second via innerHTML.
On the js side, new objects are created, and the old ones go away via gc.
On the edbrowse side, new tags are created, and the old ones are marked dead when js removes the corresponding objects, thanks to Dominique's clever trick, but the old tags don't go away, they just sit there dead.
This is something we may need to address at some point, it's just not there today.
So memory will grow very quickly at 10ms, and not just memory, because a lot of the code simply scans through those tags linearly, so edbrowse will bog down very quickly.
Knuth would frown on a lot of this code, but it is what it is.
I don't think we can or should run high timers until this issue is addressed.

Push the round robin if you like, but I wouldn't ratchet the timers up too high just yet, or maybe make the cap a global variable that we can change in various circumstances, or set in the config file, or otherwise have some control over.
Perhaps the foreground window runs fast and all other edbrowse windows are throttled, or based on debug level, or slow down after we have 5000 tags, IDK.

Thanks for looking into this.

Karl Dahlke

* Re: [Edbrowse-dev] JS1
  2017-08-11  9:51           ` Karl Dahlke
@ 2017-08-12  7:20             ` Kevin Carhart
  2017-08-12  7:30               ` Karl Dahlke
  0 siblings, 1 reply; 10+ messages in thread
From: Kevin Carhart @ 2017-08-12  7:20 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev

> Is there really a site that *needs* timers faster than 500ms for nonvisual purposes?
> Kevin you've combed through the acid tests, is this a concern?

Hmmm.... I'm in the file right now and I'm searching 'imeout' to try to
find instances of setTimeout and 'nterval' to try to find instances of 
setInterval.

setInterval doesn't appear at all.

setTimeout appears only twice.  In both cases, it's called like this:
             setTimeout(update, delay);

So what values does the variable 'delay' hold?  It's set to 10 higher up.

   var delay = 10;

However, there is also this reference to 5000ms in a comment:

           // we will give this test 500 attempts (5000ms) before aborting

So based on what value they hardcoded, maybe you could conclude from this 
that the scale along which they are mapping the range from good 
responsiveness to poor responsiveness is measured in seconds and not 
tenths of a second?  If that is right, I think the answer to your question 
is probably no and 500ms is faster than the values they used most of the 
time.  500ms is only 50 attempts, when they're willing to give the client 
ten times as long as that to finish a test.

Kevin

* [Edbrowse-dev] JS1
  2017-08-12  7:20             ` Kevin Carhart
@ 2017-08-12  7:30               ` Karl Dahlke
  2017-08-12  8:06                 ` Kevin Carhart
  0 siblings, 1 reply; 10+ messages in thread
From: Karl Dahlke @ 2017-08-12  7:30 UTC (permalink / raw)
  To: Edbrowse-dev

Interesting that you should point this out.
My throttle is only valid for intervals, which fire repeatedly.
There are no restrictions on timers, which fire only once.
So if a timer is scheduled for now + 10ms, then it fires at now + 10ms.
However, if that timer schedules another timer for now + 10 ms, then that timer fires in 10 ms, and it continues every 10 ms, and you have found a way around my restriction.
In any browser, that's more resource intensive than a 10 ms interval.
We're constantly creating objects every 10 ms, which gc must clean up before they accumulate, etc.
It's so inefficient I'm guessing nobody would do this, except maybe an acid test.
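
The loophole can be put in numbers with a small simulation. This is only arithmetic under stated assumptions: the 600 ms floor is the cap mentioned earlier in the thread, callbacks are treated as taking no time, and `intervalFirings` and `chainedFirings` are hypothetical names.

```c
#include <assert.h>

#define MIN_INTERVAL_MS 600     /* cap quoted earlier in the thread */

/* Firings within horizon_ms for a repeating interval: the requested
 * period is raised to the floor before it is used. */
static int intervalFirings(int requested_ms, int horizon_ms)
{
    int period = requested_ms < MIN_INTERVAL_MS ? MIN_INTERVAL_MS
                                                : requested_ms;
    return horizon_ms / period;
}

/* Firings within horizon_ms for a one-shot timer whose callback arms
 * another one-shot timer with the same delay. Nothing is throttled. */
static int chainedFirings(int delay_ms, int horizon_ms)
{
    int count = 0;
    if (delay_ms <= 0)
        return 0;
    for (int t = delay_ms; t <= horizon_ms; t += delay_ms)
        ++count;            /* each firing also creates a new timer object */
    return count;
}
```

Over one second, a 10 ms interval fires once under the cap, while a 10 ms self-rescheduling timer fires 100 times.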

Karl Dahlke

* Re: [Edbrowse-dev] JS1
  2017-08-12  7:30               ` Karl Dahlke
@ 2017-08-12  8:06                 ` Kevin Carhart
  0 siblings, 0 replies; 10+ messages in thread
From: Kevin Carhart @ 2017-08-12  8:06 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev

Yes.. and interval isn't even in there at all.  I'm not totally clear on 
how these instances of timeout are used.  It seems like it's more of an 
overall thing, like for the code that manages all the tests.  I do not 
know what that translates into in terms of prevalence or obscurity.  Also, 
note well that the makers of acidtests.org used variable names like

       kungFuDeathGrip = [e1, e2];

So it's possible that the acid tests have idiosyncrasies in the first 
place... there may have been a lot of LSD going around when they wrote 
them.

  On Sat, 12 Aug 2017, Karl Dahlke wrote:

> Interesting that you should point this out.
> My throttle is only valid for intervals, which fire repeatedly.
> There are no restrictions on timers, which fire only once.
> So if a timer is scheduled for now + 10ms, then it fires at now + 10ms.
> However, if that timer schedules another timer for now + 10 ms, then that timer fires in 10 ms, and it continues every 10 ms, and you have found a way around my restriction.
> In any browser, that's more resource intensive than a 10 ms interval.
> We're constantly creating objects every 10 ms, which gc must clean up before they accumulate, etc.
> It's so inefficient I'm guessing nobody would do this, except maybe an acid test.
>
> Karl Dahlke
>

--------
Kevin Carhart * 415 225 5306 * The Ten Ninety Nihilists
