[Edbrowse-dev] parser separation

edbrowse-dev - development list for edbrowse

* [Edbrowse-dev] parser separation
@ 2015-08-30 11:31 Karl Dahlke
  2015-08-30 11:43 ` Adam Thompson
  0 siblings, 1 reply; 5+ messages in thread
From: Karl Dahlke @ 2015-08-30 11:31 UTC (permalink / raw)
  To: Edbrowse-dev

I was thinking about the history of our js interactions.
We felt it was wise to separate and encapsulate,
writing jseng-moz.cpp as a separate source file running a separate process.
Among other benefits,
this makes it easier to switch engines, or conform to Mozilla upgrades
and changing APIs.
We've talked about v8 and duktape for instance.
Those are still possibilities.
So ... if there is any uncertainty at all about the html parser,
or if we just want to keep the door open, should we do some encapsulation,
and should we start now?

The connection is far simpler than js.
Pass html text, get back a tree of nodes.
It's conceptually a function call,
and doesn't need to be a separate process or thread.
Nothing asynchronous etc, and no ongoing dialog with states etc.
So it's very simple, but still might be worth putting in another sourcefile.
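
A minimal sketch of that function-call boundary, in C. Everything named here (struct htmlNode, nodeNew, nodeAppend, nodeFree, htmlParse) is hypothetical and not taken from the edbrowse source. The htmlParse body is only a placeholder that wraps the input in a single text node; a real backend would call tidy or some other parser at that point.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser-neutral node; field names are illustrative only. */
struct htmlNode {
    char *tag;                    /* element name, NULL for a text node */
    char *text;                   /* text content, NULL for an element */
    struct htmlNode *parent;
    struct htmlNode *firstChild;
    struct htmlNode *lastChild;
    struct htmlNode *next;        /* next sibling */
};

/* Copy a string with malloc; returns NULL for NULL input or on failure. */
static char *dupString(const char *s)
{
    size_t n;
    char *t;
    if (!s)
        return NULL;
    n = strlen(s) + 1;
    t = malloc(n);
    if (t)
        memcpy(t, s, n);
    return t;
}

/* Allocate a detached node; either argument may be NULL. */
struct htmlNode *nodeNew(const char *tag, const char *text)
{
    struct htmlNode *n = calloc(1, sizeof(*n));
    if (!n)
        return NULL;
    n->tag = dupString(tag);
    n->text = dupString(text);
    if ((tag && !n->tag) || (text && !n->text)) {
        free(n->tag);
        free(n->text);
        free(n);
        return NULL;
    }
    return n;
}

/* Attach child as the last child of parent. */
void nodeAppend(struct htmlNode *parent, struct htmlNode *child)
{
    assert(parent && child);
    child->parent = parent;
    if (parent->lastChild)
        parent->lastChild->next = child;
    else
        parent->firstChild = child;
    parent->lastChild = child;
}

/* Free a node and everything below it. */
void nodeFree(struct htmlNode *n)
{
    struct htmlNode *c, *nextChild;
    if (!n)
        return;
    for (c = n->firstChild; c; c = nextChild) {
        nextChild = c->next;
        nodeFree(c);
    }
    free(n->tag);
    free(n->text);
    free(n);
}

/*
 * The whole contract: html text in, tree of nodes out.
 * Placeholder backend: does no real parsing, just hangs the raw
 * input as one text node under a "document" root.
 */
struct htmlNode *htmlParse(const char *htmlText)
{
    struct htmlNode *root = nodeNew("document", NULL);
    struct htmlNode *t;
    if (!root)
        return NULL;
    t = nodeNew(NULL, htmlText);
    if (!t) {
        nodeFree(root);
        return NULL;
    }
    nodeAppend(root, t);
    return root;
}
```

The caller only ever sees htmlParse and struct htmlNode, so the body of htmlParse can change without touching anything else.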

html-tidy.c - use tidy to parse html
html-hub.c - use hubbub to parse html
...

Let the makefile link in whichever one we want,
just as the makefile determines mozilla or v8 or duktape etc.
Then html-tidy.c is the only file that needs tidy.h and its
structures and API, and it returns to us
a tree of our nodes, as converted from the tidy nodes,
which we then use to build our DOM.
It's a small bit of administrative overhead that might pay dividends.
What do you think?
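
The conversion step described above might look roughly like this. It is a sketch under assumptions: struct foreignNode is a stand-in for whatever node type the parser library hands back (it is not tidy's type), and struct ourNode, convertNode and ourFree are hypothetical names. In a tidy backend the child and next links would presumably come from libtidy's node-walking calls such as tidyGetChild and tidyGetNext rather than from struct fields.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the parser library's node type (hypothetical). */
struct foreignNode {
    const char *name;
    const struct foreignNode *child;   /* first child */
    const struct foreignNode *next;    /* next sibling */
};

/* Stand-in for "our" node, owned by the caller after conversion. */
struct ourNode {
    char *name;
    struct ourNode *firstChild;
    struct ourNode *next;
};

/* Free a node, its following siblings, and all their descendants. */
void ourFree(struct ourNode *n)
{
    while (n) {
        struct ourNode *next = n->next;
        ourFree(n->firstChild);
        free(n->name);
        free(n);
        n = next;
    }
}

/*
 * Convert one foreign node and everything below it.
 * Siblings of f itself are not followed.
 * Returns NULL for NULL input or on allocation failure.
 */
struct ourNode *convertNode(const struct foreignNode *f)
{
    struct ourNode *n, **tail;
    const struct foreignNode *c;
    size_t len;

    if (!f)
        return NULL;
    assert(f->name != NULL);    /* this sketch assumes every node is named */

    n = calloc(1, sizeof(*n));
    if (!n)
        return NULL;
    len = strlen(f->name) + 1;
    n->name = malloc(len);
    if (!n->name) {
        free(n);
        return NULL;
    }
    memcpy(n->name, f->name, len);

    /* Copy the children in document order, appending at the tail. */
    tail = &n->firstChild;
    for (c = f->child; c; c = c->next) {
        *tail = convertNode(c);
        if (!*tail) {
            ourFree(n);
            return NULL;
        }
        tail = &(*tail)->next;
    }
    return n;
}
```

Because the names are copied, the library's own tree can be released as soon as convertNode returns, and nothing outside this one file needs to know which parser produced it.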

Karl Dahlke


* Re: [Edbrowse-dev] parser separation
  2015-08-30 11:31 [Edbrowse-dev] parser separation Karl Dahlke
@ 2015-08-30 11:43 ` Adam Thompson
  2015-08-30 12:21   ` Karl Dahlke
  0 siblings, 1 reply; 5+ messages in thread
From: Adam Thompson @ 2015-08-30 11:43 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Sun, Aug 30, 2015 at 07:31:07AM -0400, Karl Dahlke wrote:
> I was thinking about the history of our js interactions.
> We felt it was wise to separate and encapsulate,
> writing jseng-moz.cpp as a separate source file running a separate process.
> Among other benefits,
> this makes it easier to switch engines, or conform to Mozilla upgrades
> and changing APIs.
> We've talked about v8 and duktape for instance.
> Those are still possibilities.
> So ... if there is any uncertainty at all about the html parser,
> or if we just want to keep the door open, should we do some encapsulation,
> and should we start now?

Yes.

> The connection is far simpler than js.
> Pass html text, get back a tree of nodes.
> It's conceptually a function call,
> and doesn't need to be a separate process or thread.
> Nothing asynchronous etc, and no ongoing dialog with states etc.
> So it's very simple, but still might be worth putting in another sourcefile.
> 
> html-tidy.c - use tidy to parse html
> html-hub.c - use hubbub to parse html
> ...

Still not sold on hubbub. Can you actually get it to build without building
netsurf, or find any documentation on it for that matter?
If we can then yeah we can look at it also, that'd be potentially better,
but it seems to have disappeared as a separate project and I don't want
edbrowse to be bound to netsurf's development.
I know when I looked I couldn't extract libhubbub any more and the distro
package has gone.

> Let the makefile link in whichever one we want,
> just as the makefile determines mozilla or v8 or duktape etc.
> Then html-tidy.c is the only file that needs tidy.h and its
> structures and API, and it returns to us
> a tree of our nodes, as converted from the tidy nodes,
> which we then use to build our DOM.
> It's a small bit of administrative overhead that might pay dividends.
> What do you think?

I think that we probably want to also create a library of common functions for
working with the node tree in that case.
That'd essentially be a DOM, with the only work then being to create the rest
of the objects it requires, and expose it to js...
oh and alter the rendering...
and bug squash.
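
As a rough idea of what such a library of common node-tree functions could contain, here is a sketch in C. All names (struct node, treeWalk, treeFindTag, treeCount) are hypothetical, and the node layout is just the minimum needed for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal hypothetical node: a tag plus first-child / next-sibling links. */
struct node {
    const char *tag;
    struct node *firstChild;
    struct node *next;
};

/* Visitor callback: return nonzero to stop the walk at this node. */
typedef int (*nodeVisitor)(struct node *n, void *ctx);

/*
 * Depth-first walk in document order, starting at n.
 * Returns the node at which the visitor stopped, or NULL if it never did.
 */
struct node *treeWalk(struct node *n, nodeVisitor visit, void *ctx)
{
    struct node *c, *hit;
    assert(visit != NULL);
    if (!n)
        return NULL;
    if (visit(n, ctx))
        return n;
    for (c = n->firstChild; c; c = c->next) {
        hit = treeWalk(c, visit, ctx);
        if (hit)
            return hit;
    }
    return NULL;
}

/* Visitor: stop at the first node whose tag equals the string in ctx. */
static int tagEquals(struct node *n, void *ctx)
{
    return n->tag && strcmp(n->tag, (const char *)ctx) == 0;
}

/* First node in document order with the given tag, or NULL. */
struct node *treeFindTag(struct node *root, const char *tag)
{
    return treeWalk(root, tagEquals, (void *)tag);
}

/* Visitor: bump the counter in ctx and keep going. */
static int countOne(struct node *n, void *ctx)
{
    (void)n;
    ++*(size_t *)ctx;
    return 0;
}

/* Number of nodes in the tree rooted at root (0 for NULL). */
size_t treeCount(struct node *root)
{
    size_t total = 0;
    treeWalk(root, countOne, &total);
    return total;
}
```

Building the finders and counters on one walker keeps the traversal order in a single place, which is the sort of thing both the renderer and the js side would presumably want to share.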

Cheers,
Adam.


* [Edbrowse-dev] parser separation
  2015-08-30 11:43 ` Adam Thompson
@ 2015-08-30 12:21   ` Karl Dahlke
  2015-08-30 19:06     ` Adam Thompson
  0 siblings, 1 reply; 5+ messages in thread
From: Karl Dahlke @ 2015-08-30 12:21 UTC (permalink / raw)
  To: Edbrowse-dev

> Still not sold on hubbub.

Oh I'm not either, not at all,
just like I'm not sold on v8 -
but encapsulating things in separate sourcefiles gives us options,
and flexibility, and sometimes leads to a better design.
I'll get started on this today.

And I agree we don't have to build a fully functional node right now,
but there are, I suspect, a few things we'll need to add to take advantage of tidy.
More on this later.

Karl Dahlke


* Re: [Edbrowse-dev] parser separation
  2015-08-30 12:21   ` Karl Dahlke
@ 2015-08-30 19:06     ` Adam Thompson
  2015-08-30 19:25       ` Karl Dahlke
  0 siblings, 1 reply; 5+ messages in thread
From: Adam Thompson @ 2015-08-30 19:06 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Sun, Aug 30, 2015 at 08:21:07AM -0400, Karl Dahlke wrote:
> > Still not sold on hubbub.
> 
> Oh I'm not either, not at all,
> just like I'm not sold on v8 -
> but encapsulating things in separate sourcefiles gives us options,
> and flexibility, and sometimes leads to a better design.
> I'll get started on this today.

Agreed, one of the things I'd like to do is decouple the parsing logic from js
value creation and property setting.
The two are semantically different operations,
and the js stuff shouldn't be done if js is disabled anyway.
The DOM, when created, needs to know about js stuff,
though ideally I'm aiming for an API which exposes our DOM to js without
creating js objects for everything and then syncing things around.
I'm not sure exactly how this would work,
but I don't want to have to reparse everything multiple times for the sake of
js object creation, or have the parser creating js objects in its node tree.
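
One possible shape for this, purely as a sketch: the parser builds plain nodes, and a js object is attached to a node only when script first asks for it, and never when js is disabled. This is an assumption about how it could work, not a description of edbrowse. The names struct domNode, jsWrapFn, domJsHandle, demoWrap and demoWrapCalls are all hypothetical, and demoWrap stands in for whatever the real engine would do to create an object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical DOM node: the parser fills in tag and never touches jsHandle. */
struct domNode {
    const char *tag;
    void *jsHandle;    /* opaque engine object, NULL until first requested */
};

/* Engine-side hook that creates the js object for a node (hypothetical). */
typedef void *(*jsWrapFn)(struct domNode *n);

/*
 * Return the js object for n, creating it on first use.
 * With js disabled nothing is created and NULL is returned.
 */
void *domJsHandle(struct domNode *n, int jsEnabled, jsWrapFn wrap)
{
    assert(n != NULL);
    if (!jsEnabled)
        return NULL;
    if (!n->jsHandle && wrap)
        n->jsHandle = wrap(n);
    return n->jsHandle;
}

/* Demonstration hook: counts calls and uses the node itself as a fake handle. */
int demoWrapCalls;

void *demoWrap(struct domNode *n)
{
    demoWrapCalls++;
    return n;
}
```

Calling domJsHandle twice on the same node with js enabled invokes the hook once; with js disabled the hook is never invoked, so the parse never pays for js object creation.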

> And I agree we don't have to build a fully functional node right now,
> but there are, I suspect, a few things we'll need to add to take advantage of tidy.
> More on this later.

Indeed, that makes sense.

Cheers,
Adam.


* [Edbrowse-dev] parser separation
  2015-08-30 19:06     ` Adam Thompson
@ 2015-08-30 19:25       ` Karl Dahlke
  0 siblings, 0 replies; 5+ messages in thread
From: Karl Dahlke @ 2015-08-30 19:25 UTC (permalink / raw)
  To: Edbrowse-dev

> one of the things I'd like to do is decouple the parsing logic from js

Yes yes absolutely.
That's what we're going to do.

Karl Dahlke
