edbrowse-dev - development list for edbrowse
* [Edbrowse-dev] Tasks
@ 2014-12-27  3:17 Karl Dahlke
  2014-12-27  8:38 ` Adam Thompson
  2014-12-28 15:57 ` Adam Thompson
  0 siblings, 2 replies; 5+ messages in thread
From: Karl Dahlke @ 2014-12-27  3:17 UTC
  To: Edbrowse-dev

It's kinda funny - we get busy with our lives and don't do substantial
work on edbrowse for several months, then we dive in on things,
sometimes the same things, which isn't very efficient.
I'd like to propose a few tasks, using that term loosely,
and ask who might want to claim which ones.
These are in no order, and there may be others;
I may even be missing the most important ones.

1. Research into v8 or perhaps other js engines.
Just play with it, hello world, what can it do,
is it better or worse than moz, perhaps rewrite jseng.cpp in it if you
really want to dive in.

2. Is there any open source that would help us with DOM?
I thought we might steal from Chrome, which would play better with v8,
but I don't know if any dom software can reasonably be extracted from the whole.

3. How does dom really work anyways?
Is there a book or tutorial that actually tells us what we have to implement?

4. Fork off a copy of edbrowse and download files in the background,
as described in my earlier email.

5. Implement imap. A lot of people want this.
Many more would use it for mail if it had imap,
and curl supports imap,
so I don't think this would be as hard as it first appears.

6. How are we going to approach frames and iframes.
Today I turn them into hyperlinks to the web pages,
but *every* other browser puts all the pages together
into one seamless whole.
We should probably do that too.
Then buffers and web pages don't correspond 1 for 1 any more.
Lines 237 through 451 might be this page,
and 452 to 989 that page, and so on.

7. What happens when javascript accesses variables in other documents.
This can be done through frames.
In the mozilla world, those variables are in another compartment.
Doesn't that cause js to blow up?
Or at least not to see those variables?
Or is everything in firefox in one compartment,
but that can't be right either because each compartment
has one global window object.
Maybe interwindow communication doesn't happen enough for us to worry about,
and is usually done for visual effects anyways.

8. What is ajax and jquery and all those, and how much of that
do we have to implement?

9. Find the most common websites, and trace through the js, slowly and painfully,
to see what we really need to do.
This is market driven, the 100 most used websites,
and make edbrowse work for those.
Given our limited resources, we might have to proceed this way,
rather than doing it all.
I've tried to track through js to see where edbrowse fails and why,
and it's a terribly slow and frustrating process,
especially if the js has been deliberately crapized.
I think we really need to do some of this, but I rarely have the patience
to actually do it.


Karl Dahlke

* Re: [Edbrowse-dev] Tasks
  2014-12-27  3:17 [Edbrowse-dev] Tasks Karl Dahlke
@ 2014-12-27  8:38 ` Adam Thompson
  2014-12-28 15:57 ` Adam Thompson
  1 sibling, 0 replies; 5+ messages in thread
From: Adam Thompson @ 2014-12-27  8:38 UTC
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Fri, Dec 26, 2014 at 10:17:39PM -0500, Karl Dahlke wrote:
> It's kinda funny - we get busy with our lives and don't do substantial
> work on edbrowse for several months, then we dive in on things,
> sometimes the same things, which isn't very efficient.

Yeah true. I'm currently working on why we're back to segfaulting against
debian's mozjs24 (on i386, so maybe it was just fixed on amd64).

This is annoying, since I'd like to take the new edbrowse for a spin.
Still, it doesn't blow up totally, just says that it can't communicate with js
and continues with js disabled which is cool.

Cheers,
Adam.

* Re: [Edbrowse-dev] Tasks
  2014-12-27  3:17 [Edbrowse-dev] Tasks Karl Dahlke
  2014-12-27  8:38 ` Adam Thompson
@ 2014-12-28 15:57 ` Adam Thompson
  1 sibling, 0 replies; 5+ messages in thread
From: Adam Thompson @ 2014-12-28 15:57 UTC
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Fri, Dec 26, 2014 at 10:17:39PM -0500, Karl Dahlke wrote:
> 1. Research into v8 or perhaps other js engines.
> Just play with it, hello world, what can it do,
> is it better or worse than moz, perhaps rewrite jseng.cpp in it if you
> really want to dive in.

Ok, this seems like something everyone can look at,
though perhaps we want to be careful with duplicated effort.
Has anyone successfully compiled anything against v8 yet?

> 2. Is there any open source that would help us with DOM?
> I thought we might steal from Chrome, which would play better with v8,
> but I don't know if any dom software can reasonably be extracted from the whole.

I think we've been down the route of "playing nicely"
with a js implementation already. I'd really like to get a proper DOM
independent of js, then plug js into such a DOM.
That changes the research perspective (at least in my mind)
to an open source html parsing library,
then writing code to convert that representation into a dynamic DOM 
alterable via js.
That also allows us to separate the parsing and rendering code,
which imho would be a good thing.

> 3. How does dom really work anyways?
> Is there a book or tutorial that actually tells us what we have to implement?

Yeah, the w3c have a bunch of DOM specs,
most of which assume we've already parsed the html into a tree (which people
also refer to as a DOM), hence why we need to separate the parsing and
rendering code (we've thankfully already separated the js).
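
To make the "parse into a tree first, plug js in afterwards" idea a bit
more concrete, here is a minimal sketch in C of what such a js-independent
tree could look like. Everything in it (struct dom_node, dom_new,
dom_append, dom_free) is a hypothetical name invented for illustration;
none of it is taken from the edbrowse sources, and a real DOM would also
need attributes, text nodes and so on.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type; edbrowse's real structures differ. */
struct dom_node {
    char *tag;                      /* element name, e.g. "div" */
    struct dom_node *parent;
    struct dom_node *first_child;
    struct dom_node *last_child;
    struct dom_node *next_sibling;
};

/* Allocate a detached node; returns NULL if out of memory. */
static struct dom_node *dom_new(const char *tag)
{
    struct dom_node *n = calloc(1, sizeof(*n));
    if (!n)
        return NULL;
    n->tag = malloc(strlen(tag) + 1);
    if (!n->tag) {
        free(n);
        return NULL;
    }
    strcpy(n->tag, tag);
    return n;
}

/* Link a detached child in as the last child of parent. */
static void dom_append(struct dom_node *parent, struct dom_node *child)
{
    assert(parent && child && !child->parent);
    child->parent = parent;
    if (parent->last_child)
        parent->last_child->next_sibling = child;
    else
        parent->first_child = child;
    parent->last_child = child;
}

/* Free a node and everything below it. */
static void dom_free(struct dom_node *n)
{
    struct dom_node *c, *next;
    if (!n)
        return;
    for (c = n->first_child; c; c = next) {
        next = c->next_sibling;
        dom_free(c);
    }
    free(n->tag);
    free(n);
}
```

The point of keeping this in plain C with no js types is that the html
parser can build the tree on its own, and the js side only needs a thin
layer that maps its objects onto these nodes.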

> 4. Fork off a copy of edbrowse and download files in the background,
> as described in my earlier email.

Ok, I think this's already being handled.

> 5. Implement imap. A lot of people want this.
> Many more would use it for mail if it had imap,
> and curl supports imap,
> so I don't think this would be as hard as it first appears.

Unfortunately I don't use edbrowse's mail support, so I can't really
help with this.

> 6. How are we going to approach frames and iframes.
> Today I turn them into hyperlinks to the web pages,
> but *every* other browser puts all the pages together
> into one seamless whole.
> We should probably do that too.
> Then buffers and web pages don't correspond 1 for 1 any more.
> Lines 237 through 451 might be this page,
> and 452 to 989 that page, and so on.

Hmmm, not sure how to handle this either.
Perhaps we could load the pages, but keep them in another buffer,
with cross-buffer js support, or some sort of frame delimiters... I don't know.
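
One possible shape for the "frame delimiters" variant, as a rough sketch
only: keep a sorted table that records which lines of the merged buffer
came from which page, and look a line number up in it. The names
struct frame_range and frame_for_line are made up for this example and
do not exist in edbrowse; the table is assumed to be sorted by line
number with no overlaps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: one frame's slice of the merged buffer. */
struct frame_range {
    int first_line;     /* inclusive */
    int last_line;      /* inclusive */
    const char *url;    /* page that supplied these lines */
};

/*
 * Binary search for the frame that owns line ln.
 * tab must be sorted by first_line and must not overlap.
 * Returns NULL when no frame covers that line.
 */
static const struct frame_range *frame_for_line(const struct frame_range *tab,
                                                size_t n, int ln)
{
    size_t lo = 0, hi = n;

    assert(tab != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (ln < tab[mid].first_line)
            hi = mid;
        else if (ln > tab[mid].last_line)
            lo = mid + 1;
        else
            return &tab[mid];
    }
    return NULL;
}
```

With the numbers from the quoted example, a table holding 237-451 and
452-989 would map line 300 to the first entry and line 452 to the second.
The hard part this does not address is keeping the table up to date when
the user edits the buffer or js rewrites a frame.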

> 7. What happens when javascript accesses variables in other documents.
> This can be done through frames.
> In the mozilla world, those variables are in another compartment.
> Doesn't that cause js to blow up?
> Or at least not to see those variables?
> Or is everything in firefox in one compartment,
> but that can't be right either because each compartment
> has one global window object.
> Maybe interwindow communication doesn't happen enough for us to worry about,
> and is usually done for visual effects anyways.

Yes we *do* need to worry about this.
Basically you can do all sorts of cross-compartment calls in mozjs,
I don't really know how to implement it, but I know it can be done.
Also, they have a splitwindow object to handle global and non-global window
components, but even the mozilla devs tell people this is intended as a
firefox-only feature. Again, I think we've got a very different design from
these browsers and need to think slightly differently (taking lessons from them
where we can of course).

> 8. What is ajax and jquery and all those, and how much of that
> do we have to implement?

Ajax has a w3c spec and tutorials, jquery is a library,
albeit a popular one. If we fix our DOM we'll get a bunch of jquery for
free, and if we do ajax we'll get most of the rest (we possibly need json as well).
As for how much do we need to implement;
if we want to have a browser that functions properly on the internet in a few
years then all of it.

> 9. Find the most common websites, and trace through the js, slowly and painfully,
> to see what we really need to do.
> This is market driven, the 100 most used websites,
> and make edbrowse work for those.
> Given our limited resources, we might have to proceed this way,
> rather than doing it all.

I'm not too sure about this approach.
I think at this stage we'd do better to get a cleaner,
more maintainable design and work from there.
If we make sure we keep up with bug tracking and releases then I think the top
100 websites'll appear enough on our bugs list to eventually end up fixed.
This also allows us to fix the more fundamental problems, like our DOM,
without getting distracted by trying to hack on an existing design.

> I've tried to track through js to see where edbrowse fails and why,
> and it's a terribly slow and frustrating process,
> especially if the js has been deliberately crapized.
> I think we really need to do some of this, but I rarely have the patience
> to actually do it.

Agreed, tracking through js is important,
and will become more so if we decide to revise the parsing,
rendering and js dom logic.
What we also need to get better at is turning our findings into jsrt tests.

Cheers,
Adam.

* Re: [Edbrowse-dev] Tasks
  2014-12-28 18:42 Karl Dahlke
@ 2014-12-28 19:17 ` Adam Thompson
  0 siblings, 0 replies; 5+ messages in thread
From: Adam Thompson @ 2014-12-28 19:17 UTC
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Sun, Dec 28, 2014 at 01:42:26PM -0500, Karl Dahlke wrote:
> > though perhaps we want to be careful with duplicated effort.
>
> Right, especially since there's only three of us.
> Would you like to look at v8?
> So far none of us has a hello world program working.
> There is one in src, that I got straight from google,
> but it doesn't compile.
> 
> make js_hello_v8

Ok, I'll have a look at that. I'm also looking at any other js engines I can
find (preferably not entangled with big browsers)
but so far I'm not sure how mature any such engines are.

> > I'd really like to get a proper DOM independent of js,
> > then plug js into such a DOM.
> 
> Yes, yes, and yes.
> I'm hoping that ebjs.c holds the api that will make this possible.
> That was reason enough to move js into its own process,
> though there are other advantages as well.
> So ... are the prototypes in eb.p, from sourcefile ebjs.c,
> enough to build a dom on top of?
> If not then what else do we need?

I think so from the js side of things.
The problem is the fact that we need to have better support for parsing html
into the initial structure before we load into js.
I think we may also need some more tree traversal stuff inside the js side of
the DOM, and I'm not entirely clear how we handle js making widespread DOM
changes (like massive amounts of page elements created directly as DOM objects).
I think that support probably needs expanding and linking back to the browser's
representation.
We also need to load in style attributes, as otherwise there's
a good chance some js will blow up trying to perform some (to us unnecessary)
visual effect and thus not get to some critical part of the code
(e.g. onsubmit handlers).

> > As for how much do we need to implement;
> > if we want to have a browser that functions properly on the internet in a few
> > years then all of it.
> 
> Yes I suppose so.
> It doesn't scare me, I just wish we had more resources,
> like maybe an NSF grant to pay us to do it.

More devs are usually helpful, certainly in this case.

> > tracking through js is important,
> 
> I tried in particular to track through the js in the jquery library,
> to see whether edbrowse is a mile away or ten thousand miles away
> from supporting it, but I got rather overwhelmed.
> It's still sitting in my to-do directory.

If we sort the child node and parent node thing (not sure how far that got
done), and probably get all the array types implemented (not just objects)
then we'll be much closer. Throw in support for not just dynamically altering
selectors but also everything else in the DOM,
and a function for searching through the DOM in various ways,
and we'll have the page creation side sorted I think.
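
As a rough illustration of "a function for searching through the DOM in
various ways", here is a small C sketch: one depth-first walk that takes
a predicate, so that lookups by tag, id, name and so on are just different
predicates. The node type is a pared-down hypothetical one, and dom_find,
dom_match_fn and match_tag are invented names, not existing edbrowse code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, pared-down node: just enough to walk a tree. */
struct dom_node {
    const char *tag;
    struct dom_node *first_child;
    struct dom_node *next_sibling;
};

/* Predicate: return nonzero if node n is the one we want. */
typedef int (*dom_match_fn)(const struct dom_node *n, const void *arg);

/*
 * Depth-first search in document order, starting at root itself.
 * Returns the first node accepted by match, or NULL if there is none.
 */
static struct dom_node *dom_find(struct dom_node *root, dom_match_fn match,
                                 const void *arg)
{
    struct dom_node *c, *hit;

    assert(match != NULL);
    if (!root)
        return NULL;
    if (match(root, arg))
        return root;
    for (c = root->first_child; c; c = c->next_sibling) {
        hit = dom_find(c, match, arg);
        if (hit)
            return hit;
    }
    return NULL;
}

/* Example predicate: exact match on the tag name passed in arg. */
static int match_tag(const struct dom_node *n, const void *arg)
{
    return strcmp(n->tag, (const char *)arg) == 0;
}
```

Something like dom_find(body, match_tag, "form") would then return the
first form under body. A second predicate comparing an id attribute would
cover the getElementById style of lookup without another tree walker.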

> > make the callback given to curl support both operations,
> 
> Well of course. That's obviously a better design. Thank you.
> I just put the data in memory or on disk,
> according to some variables.

Hopefully, though I'm sure I've missed something important in that logic.
Are threads or something involved? Gdb always says that curl spawns a new
thread when it downloads, but I assume the curl_easy_perform call blocks,
so we should be fine.

> > Yeah, I'm wondering if we want to make that string file more c++ anyway
>
> Anything that becomes common to edbrowse and edbrowse-js has to be C,
> because edbrowse is entirely in C.
> But making it common might not be worth the bother.
> The routines are different in subtle ways.
> The edbrowse copy calls setError() and such to report error conditions to the user,
> but in edbrowse-js those errors aren't as likely, and would be treated another way,
> so it will probably be 400 lines of code that look similar in 2 places,
> but aren't identical enough to put in common.

Yeah... that should've read "the string handling in that file".
I certainly didn't mean to suggest injecting c++ back into Edbrowse (we just
spent a while removing it). However the js process has to be in c++
(unfortunately) and so we may as well use c++ stuff there.

Cheers,
Adam.

* [Edbrowse-dev] Tasks
@ 2014-12-28 18:42 Karl Dahlke
  2014-12-28 19:17 ` Adam Thompson
  0 siblings, 1 reply; 5+ messages in thread
From: Karl Dahlke @ 2014-12-28 18:42 UTC
  To: Edbrowse-dev

> though perhaps we want to be careful with duplicated effort.

Right, especially since there's only three of us.
Would you like to look at v8?
So far none of us has a hello world program working.
There is one in src, that I got straight from google,
but it doesn't compile.

make js_hello_v8

> I'd really like to get a proper DOM independent of js,
> then plug js into such a DOM.

Yes, yes, and yes.
I'm hoping that ebjs.c holds the api that will make this possible.
That was reason enough to move js into its own process,
though there are other advantages as well.
So ... are the prototypes in eb.p, from sourcefile ebjs.c,
enough to build a dom on top of?
If not then what else do we need?

> As for how much do we need to implement;
> if we want to have a browser that functions properly on the internet in a few
> years then all of it.

Yes I suppose so.
It doesn't scare me, I just wish we had more resources,
like maybe an NSF grant to pay us to do it.

> tracking through js is important,

I tried in particular to track through the js in the jquery library,
to see whether edbrowse is a mile away or ten thousand miles away
from supporting it, but I got rather overwhelmed.
It's still sitting in my to-do directory.

> make the callback given to curl support both operations,

Well of course. That's obviously a better design. Thank you.
I just put the data in memory or on disk,
according to some variables.
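
A minimal sketch of what such a dual-purpose callback might look like,
not the actual edbrowse code. The function has the shape libcurl expects
for CURLOPT_WRITEFUNCTION (char *ptr, size_t size, size_t nmemb,
void *userdata), with the sink passed through CURLOPT_WRITEDATA. The
struct download_sink and the name sink_write are assumptions made up for
this example; the block deliberately avoids curl headers so it compiles
on its own.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sink: if fp is set we stream to disk, else to memory. */
struct download_sink {
    FILE *fp;       /* non-NULL: write each chunk to this file */
    char *buf;      /* otherwise: grow this buffer */
    size_t len;     /* bytes held in buf, not counting the '\0' */
};

/*
 * Same signature as a libcurl write callback.
 * Returns the number of bytes consumed; returning a different
 * count than size * nmemb makes curl abort the transfer.
 */
static size_t sink_write(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct download_sink *s = userdata;
    size_t n = size * nmemb;
    char *grown;

    assert(s != NULL);
    if (s->fp)
        return fwrite(ptr, 1, n, s->fp);

    /* One spare byte keeps the buffer usable as a C string. */
    grown = realloc(s->buf, s->len + n + 1);
    if (!grown)
        return 0;
    memcpy(grown + s->len, ptr, n);
    s->len += n;
    grown[s->len] = '\0';
    s->buf = grown;
    return n;
}
```

The caller decides the mode once, by opening a file into fp or leaving it
NULL, so the "some variables" live in one struct instead of globals, and
the same callback serves both foreground and background downloads.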

> Yeah, I'm wondering if we want to make that string file more c++ anyway

Anything that becomes common to edbrowse and edbrowse-js has to be C,
because edbrowse is entirely in C.
But making it common might not be worth the bother.
The routines are different in subtle ways.
The edbrowse copy calls setError() and such to report error conditions to the user,
but in edbrowse-js those errors aren't as likely, and would be treated another way,
so it will probably be 400 lines of code that look similar in 2 places,
but aren't identical enough to put in common.


Karl Dahlke
