edbrowse-dev - development list for edbrowse
* [Edbrowse-dev] V 3.5.4
@ 2015-05-10 23:37 Karl Dahlke
  2015-05-11  0:37 ` chris
  0 siblings, 1 reply; 12+ messages in thread
From: Karl Dahlke @ 2015-05-10 23:37 UTC (permalink / raw)
  To: Edbrowse-dev

If Chris and I bat this back and forth a bit,
and decide to take the tidy5 plunge, which I think we should,
it really seems like the next logical step,
well anyways if we do, should we cut a version first?
Here's what we've done so far, as documented in CHANGES.

Messages in German, thanks to Sebastian Humenda.
Autoplay of audio files found on websites, using content-type,
and autoplay of audio files from directory mode.
Use a plugin to convert pdf to html, or any other conversion you wish.
Autoconvert such files as you encounter them via the g command.
Directory listing sorted by locale, like /bin/ls.
Automatically include references when replying to an email, re or rea commands,
so it threads properly.

Not a lot of changes but still worth marking I think,
before we rewrite the whole html parser.
Chris remember the protocol:
change version from 3.5.3+ to 3.5.4
mark the version in github
change version to 3.5.4+ to proceed with our work.

I have nothing in progress or pending, and would be ok with marking a version.
Thoughts?

Karl Dahlke

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Edbrowse-dev] V 3.5.4
  2015-05-10 23:37 [Edbrowse-dev] V 3.5.4 Karl Dahlke
@ 2015-05-11  0:37 ` chris
  2015-05-11  0:54   ` Karl Dahlke
  2015-05-11 22:41   ` Adam Thompson
  0 siblings, 2 replies; 12+ messages in thread
From: chris @ 2015-05-11  0:37 UTC (permalink / raw)
  To: Edbrowse-dev

Karl Dahlke <eklhad@comcast.net> writes:

> and decide to take the tidy5 plunge, which I think we should,

Yes, I'm thinking so.  The only troubling bit is document.write /
innerHTML, which I completely forgot about.  Will tidy5 handle document
fragments created by those calls, or does this mean that we won't be
able to use it?  I still haven't done that research.

> Not a lot of changes but still worth marking I think,

Yes, I agree.
I also have nothing ready to go, so if Adam is ok with it, I'll go ahead
and post 3.5.4.

-- Chris

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Edbrowse-dev]  V 3.5.4
  2015-05-11  0:37 ` chris
@ 2015-05-11  0:54   ` Karl Dahlke
  2015-05-11 22:41   ` Adam Thompson
  1 sibling, 0 replies; 12+ messages in thread
From: Karl Dahlke @ 2015-05-11  0:54 UTC (permalink / raw)
  To: Edbrowse-dev

> The only troubling bit is document.write /

I don't think this will be a problem.
We pass the web page html string to tidy, get the node tree back.
Run javascript for each <script> node.
When document.write is called, we have another html string.
Call tidy again, get a node tree back,
attach it to the <script> node that ran the javascript.
And continue.

Not much different from what I do today.
I call htmlParse() whenever I have html source to parse,
no matter where that html comes from.
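
The flow described above can be sketched in C as follows. This is only an illustration under stated assumptions: struct node, parse_html(), attach() and on_document_write() are hypothetical names, and parse_html() is a toy stand-in for the real parser (tidy in the proposal) that only recognises bare opening tags without attributes. The point is the shape of the loop: every html string, whatever its origin, goes through the same parse call, and the resulting subtree is attached to the <script> node that produced it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical DOM node; edbrowse's real structures differ. */
struct node {
	char name[16];
	struct node *first_child;
	struct node *next_sibling;
};

static struct node *new_node(const char *name)
{
	struct node *n = calloc(1, sizeof *n);
	assert(n != NULL);
	strncpy(n->name, name, sizeof n->name - 1);
	return n;
}

/* Append child as the last child of parent. */
static void attach(struct node *parent, struct node *child)
{
	struct node **p = &parent->first_child;
	while (*p)
		p = &(*p)->next_sibling;
	*p = child;
}

/*
 * Toy stand-in for the html parser. It returns a synthetic root whose
 * children are the opening tags found in the string, in order.
 * Closing tags and text are ignored; attributes are not handled.
 */
static struct node *parse_html(const char *html)
{
	struct node *root = new_node("#fragment");
	const char *s = html;
	while ((s = strchr(s, '<')) != NULL) {
		const char *e = strchr(s, '>');
		if (!e)
			break;
		if (s[1] != '/') {
			char name[16] = "";
			size_t len = (size_t)(e - s - 1);
			if (len >= sizeof name)
				len = sizeof name - 1;
			memcpy(name, s + 1, len);
			attach(root, new_node(name));
		}
		s = e + 1;
	}
	return root;
}

/* What a document.write call would trigger: parse, then attach. */
static void on_document_write(struct node *script, const char *html)
{
	attach(script, parse_html(html));
}
```

Nodes are never freed here; a real implementation would need to release the tree when the page is discarded.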

Karl Dahlke

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Edbrowse-dev] V 3.5.4
  2015-05-11  0:37 ` chris
  2015-05-11  0:54   ` Karl Dahlke
@ 2015-05-11 22:41   ` Adam Thompson
  2015-05-11 23:23     ` Karl Dahlke
  1 sibling, 1 reply; 12+ messages in thread
From: Adam Thompson @ 2015-05-11 22:41 UTC (permalink / raw)
  To: chris; +Cc: Edbrowse-dev

On Sun, May 10, 2015 at 05:37:14PM -0700, chris@the-brannons.com wrote:
> Karl Dahlke <eklhad@comcast.net> writes:
> 
> > and decide to take the tidy5 plunge, which I think we should,
> 
> Yes, I'm thinking so.  The only troubling bit is document.write /
> innerHTML, which I completely forgot about.  Will tidy5 handle document
> fragments created by those calls, or does this mean that we won't be
> able to use it?  I still haven't done that research.

From what I've seen tidy should be able to handle document fragments.
Since we're converting from tidy's parse tree to our own DOM,
the inserting shouldn't be a problem either.

> > Not a lot of changes but still worth marking I think,
> 
> Yes, I agree.
> I also have nothing ready to go, so if Adam is ok with it, I'll go ahead
> and post 3.5.4.

Yeah go ahead. I think we've made enough changes (particularly in terms of the
plugin system) to do this.

In terms of future release planning,
I guess with tidy5 we'll be wanting to go for 3.6 as the next release since
it's going to include a new library dependency?

I'm also thinking we should get a stabilised tidy5 based html parser before we
start playing with pulling the DOM into a separate process?
That seems to make the most sense in terms of avoiding breaking things.
It should also allow us to release this code sooner which is a good thing I think.

Next on my edbrowse todo list is to evaluate how ready the duktape js engine
(http://www.duktape.org) is to take over from Mozilla's spidermonkey for edbrowse-js.
It strikes me that we really don't use most of Spidermonkey and that if it'll
do everything we want, duktape's much more suited to our use-case.
In addition, this raises the possibility of moving the js and html stuff back
into an edbrowse-dom process since duktape's in C like the rest of edbrowse.
Plus we get a js engine which is developed to be a js engine and not massively
tied to one browser and usage model.

Cheers,
Adam.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Edbrowse-dev]  V 3.5.4
  2015-05-11 22:41   ` Adam Thompson
@ 2015-05-11 23:23     ` Karl Dahlke
  2015-05-12  7:32       ` Adam Thompson
  2015-05-12  8:07       ` chris
  0 siblings, 2 replies; 12+ messages in thread
From: Karl Dahlke @ 2015-05-11 23:23 UTC (permalink / raw)
  To: Edbrowse-dev

> I guess with tidy5 we'll be wanting to go for 3.6 as the next release

I don't have strong feelings about this; whatever you and Chris think.

> I'm also thinking we should get a stabilised tidy5 based html parser before
> we start playing with pulling the DOM into a separate process?

Yes, and definitely yes.
Don't move all the chess pieces at once.
And it really will bring benefit: more web pages parsed properly,
all the nodes building js objects not just some of them,
all the html attributes becoming members in the corresponding js nodes
not just some of them, etc etc.

> Next on my edbrowse todo list is to evaluate how ready the duktape js engine is

A good trial is to
cp jseng-moz.cpp jseng-duk.c
and then modify the latter to use the duktape engine calls
and ideally we could just plug either one into edbrowse and they should
both work the same.
Good side by side comparisons.
I wanted to do the same with v8 but never got round to it,
and the v8 interface isn't as easy as I had hoped.
I never even got hello world to run
or even compile: js_hello_v8.cpp
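
One way to picture "plug either one in" is a small table of entry points that each engine file fills in. The sketch below is an assumption for illustration only: struct js_engine, run_with() and the stub engine are hypothetical names, not edbrowse's actual interface, and the thread itself proposes selecting the engine at build time through the makefile rather than at run time. The stub stands in for what a jseng-moz or jseng-duk implementation would provide.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical set of entry points every engine would implement. */
struct js_engine {
	const char *name;
	int (*start)(void);
	int (*run)(const char *script, char *result, size_t result_len);
	void (*stop)(void);
};

/* A stub engine: it does not interpret anything, it only reports size. */
static int stub_running;

static int stub_start(void)
{
	stub_running = 1;
	return 0;
}

static int stub_run(const char *script, char *result, size_t result_len)
{
	if (!stub_running)
		return -1;
	snprintf(result, result_len, "ran %zu bytes", strlen(script));
	return 0;
}

static void stub_stop(void)
{
	stub_running = 0;
}

static const struct js_engine stub_engine = {
	"stub", stub_start, stub_run, stub_stop
};

/* The caller only sees the table, so engines can be compared side by side. */
static int run_with(const struct js_engine *eng, const char *script,
		    char *result, size_t result_len)
{
	int rc;
	if (eng->start() != 0)
		return -1;
	rc = eng->run(script, result, result_len);
	eng->stop();
	return rc;
}
```

With two real tables, the same script could be fed to both engines and the results compared directly.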

Karl Dahlke

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Edbrowse-dev] V 3.5.4
  2015-05-11 23:23     ` Karl Dahlke
@ 2015-05-12  7:32       ` Adam Thompson
  2015-05-12  9:15         ` Karl Dahlke
  2015-05-12  8:07       ` chris
  1 sibling, 1 reply; 12+ messages in thread
From: Adam Thompson @ 2015-05-12  7:32 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Mon, May 11, 2015 at 07:23:21PM -0400, Karl Dahlke wrote:
> > I guess with tidy5 we'll be wanting to go for 3.6 as the next release
> 
> I don't have strong feelings about this; whatever you and Chris think.

I'd go for 3.6 if given the choice, making 3.5.4 (the release currently being
worked on) the last release in the 3.5 series.

> > I'm also thinking we should get a stabilised tidy5 based html parser before
> > we start playing with pulling the DOM into a separate process?
> 
> Yes, and definitely yes.
> Don't move all the chess pieces at once.
> And it really will bring benefit: more web pages parsed properly,
> all the nodes building js objects not just some of them,
> all the html attributes becoming members in the corresponding js nodes
> not just some of them, etc etc.

Yes definitely.

> > Next on my edbrowse todo list is to evaluate how ready the duktape js engine is
> 
> A good trial is to
> cp jseng-moz.cpp jseng-duk.c
> and then modify the latter to use the duktape engine calls
> and ideally we could just plug either one into edbrowse and they should
> both work the same.
> Good side by side comparisons.

Yeah that's the plan. It'll also be a good test of how adaptable the protocol
between edbrowse and edbrowse-js is in terms of passing around memory 
addresses etc.
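
As an illustration of what passing a memory address over a text protocol can look like, here is a minimal sketch. It assumes the address is sent as an opaque hexadecimal token that only the process owning the object ever dereferences; encode_handle() and decode_handle() are hypothetical names, and the actual edbrowse to edbrowse-js wire format may differ.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Write the address of obj into buf as hex text; 0 on success, -1 if buf is too small. */
static int encode_handle(const void *obj, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%" PRIxPTR, (uintptr_t)obj);
	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Turn the hex text back into a pointer. Only valid in the process that encoded it. */
static void *decode_handle(const char *text)
{
	return (void *)(uintptr_t)strtoumax(text, NULL, 16);
}
```

The other process treats the token purely as a name for the object and echoes it back unchanged, which is why the choice of engine should not matter to it.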

> I wanted to do the same with v8 but never got round to it,
> and the v8 interface isn't as easy as I had hoped.
> I never even got hello world to run
> or even compile: js_hello_v8.cpp

Indeed, I'm thinking that v8 really isn't an option here.

Cheers,
Adam.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Edbrowse-dev] V 3.5.4
  2015-05-11 23:23     ` Karl Dahlke
  2015-05-12  7:32       ` Adam Thompson
@ 2015-05-12  8:07       ` chris
  1 sibling, 0 replies; 12+ messages in thread
From: chris @ 2015-05-12  8:07 UTC (permalink / raw)
  To: Edbrowse-dev

Karl Dahlke <eklhad@comcast.net> writes:

>> I guess with tidy5 we'll be wanting to go for 3.6 as the next release
>
> I don't have strong feelings about this; whatever you and Chris think.

Ok, 3.5.4 is pushed.
I bumped to 3.5.4+ for now.
Our next release will hopefully be 3.6, with parsing by tidy5, but you
never know.  I'm hedging my bets.

-- Chris

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Edbrowse-dev]   V 3.5.4
  2015-05-12  7:32       ` Adam Thompson
@ 2015-05-12  9:15         ` Karl Dahlke
  2015-05-13  7:17           ` Adam Thompson
  0 siblings, 1 reply; 12+ messages in thread
From: Karl Dahlke @ 2015-05-12  9:15 UTC (permalink / raw)
  To: Edbrowse-dev

> Writing the jseng-duk.c process ...

And the cool part is, such work can take place independently
of other work on edbrowse.
To use the new process, if we decide to do so, is just a change of makefile.

This is a switch back from c++ to c, but I (wisely) used almost none
of the c++ features, save those needed for the mozilla api,
and yes a few strings, cause I was being lazy.
String effects; for example,
to gather side effects of running js and pass them back to edbrowse.
I think the easiest path is to copy the stringAndString etc routines over,
for growing strings dynamically.
They're not perfect but we're all used to them.
stringfile.c lines 150 to 227 would probably suffice.
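
For readers unfamiliar with this style of routine, here is a minimal sketch of a dynamically growing string, written in the spirit of the stringAndString family but not copied from stringfile.c. The names str_init() and str_append() and their signatures are assumptions made for this example; the real routines may differ in naming, growth strategy and error handling.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Start an empty, NUL-terminated heap string; *len tracks its length. */
static char *str_init(int *len)
{
	char *s = malloc(1);
	assert(s != NULL);
	s[0] = '\0';
	*len = 0;
	return s;
}

/* Append t to *s, reallocating as needed and updating *len. */
static void str_append(char **s, int *len, const char *t)
{
	size_t add = strlen(t);
	char *p = realloc(*s, (size_t)*len + add + 1);
	assert(p != NULL);
	memcpy(p + *len, t, add + 1);	/* copies the trailing NUL too */
	*s = p;
	*len += (int)add;
}
```

Side effects could then be collected one line at a time with repeated str_append() calls and sent back in a single write; the caller frees the buffer when done.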

If we continue down this path we might want a string + url sourcefile to be
shared between the two processes.
Nobody likes duplicated code.
Common is currently jseng-moz.cpp lines 54 through 548,
plus the aforementioned string management routines, so over 600 lines,
and that's just too much code to leave duplicated in the long run.


Karl Dahlke

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Edbrowse-dev] V 3.5.4
  2015-05-12  9:15         ` Karl Dahlke
@ 2015-05-13  7:17           ` Adam Thompson
  2015-05-13  8:16             ` Karl Dahlke
  2015-05-13 13:15             ` Karl Dahlke
  0 siblings, 2 replies; 12+ messages in thread
From: Adam Thompson @ 2015-05-13  7:17 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Tue, May 12, 2015 at 05:15:32AM -0400, Karl Dahlke wrote:
> > Writing the jseng-duk.c process ...
> 
> And the cool part is, such work can take place independently
> of other work on edbrowse.
> To use the new process, if we decide to do so, is just a change of makefile.

Indeed.

> This is a switch back from c++ to c, but I (wisely) used almost none
> of the c++ features, save those needed for the mozilla api,
> and yes a few strings, cause I was being lazy.
> String effects; for example,
> to gather side effects of running js and pass them back to edbrowse.
> I think the easiest path is to copy the stringAndString etc routines over,
> for growing strings dynamically.
> They're not perfect but we're all used to them.
> stringfile.c lines 150 to 227 would probably suffice.
> 
> If we continue down this path we might want a string + url sourcefile to be
> shared between the two processes.
> Nobody likes duplicated code.
> Common is currently jseng-moz.cpp lines 54 through 548,
> plus the aforementioned string management routines, so over 600 lines,
> and that's just too much code to leave duplicated in the long run.

Tbh, since this is in C, I was just going to use the already built edbrowse modules.
There's no real reason not to in this code.
The reason before was because the jseng-moz.cpp engine is in c++,
but this is in C like the rest of the project so I see no reason to duplicate
any of this stuff.

Cheers,
Adam.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Edbrowse-dev]    V 3.5.4
  2015-05-13  7:17           ` Adam Thompson
@ 2015-05-13  8:16             ` Karl Dahlke
  2015-05-13 13:15             ` Karl Dahlke
  1 sibling, 0 replies; 12+ messages in thread
From: Karl Dahlke @ 2015-05-13  8:16 UTC (permalink / raw)
  To: Edbrowse-dev

> Tbh, since this is in C,
> I was just going to use the already built edbrowse modules.

Do you plan to link stringfile.o and url.o into jseng-duk executable?
Sounds good, but I'm sure through external references you'll get
the whole damn thing in your lap, including main.o,
which you can't accommodate.
I'm sure we would need to do some rearranging of routines into sourcefiles,
differently than is done today.
Like the way I moved related functions into plugin.c recently.
I'm sure there's a C graphing tool out there showing functions calling
other functions that would confirm or deny this.
Or just try
cc stringfile.o url.o
and note the zillions of undefines, besides main as expected.

Second, there are a couple of routines that act a little differently in the two processes.
I don't remember which ones or how or why they act differently,
I only remember when I wrote jseng-moz.cpp that there were a couple instances like that.
Would it help if I looked through these functions side by side
and determined, and reported, which ones act differently?

Meantime I'd say just carry the 500 lines of code along in your new process,
and don't try to share any modules yet, unless and until we're pretty sure
we want to go down that path, and then it will be well worth
the effort to rearrange functions in sourcefiles and share modules.

Karl Dahlke

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Edbrowse-dev]    V 3.5.4
  2015-05-13  7:17           ` Adam Thompson
  2015-05-13  8:16             ` Karl Dahlke
@ 2015-05-13 13:15             ` Karl Dahlke
  2015-05-14  7:03               ` Adam Thompson
  1 sibling, 1 reply; 12+ messages in thread
From: Karl Dahlke @ 2015-05-13 13:15 UTC (permalink / raw)
  To: Edbrowse-dev

> since this is in C, I was just going to use the already built edbrowse modules.

Reviewing the various undefines, most are global variables,
which can of course be moved, and then perhaps just a couple
functions to move or stub.
Here is what I propose, and I would be happy to lay this groundwork if you like.
Move a few functions and variables around,
and perhaps stub one or two in jseng-moz.cpp,
so that edbrowse and edbrowse-js share stringfile.o url.o messages.o.
This seems to be the best place to draw the circle around.
Of course I'd need separate make rules to compile these three sources
stringfile++.o url++.o messages++.o
when folding them into the c++ program jseng-moz.
I haven't used make in this way before but I'm sure it can be done.
I can work on this in preparation for your work on different js engines,
or I can just wait to see what comes of your duktape research.
As you wish.

Karl Dahlke

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Edbrowse-dev] V 3.5.4
  2015-05-13 13:15             ` Karl Dahlke
@ 2015-05-14  7:03               ` Adam Thompson
  0 siblings, 0 replies; 12+ messages in thread
From: Adam Thompson @ 2015-05-14  7:03 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Wed, May 13, 2015 at 09:15:57AM -0400, Karl Dahlke wrote:
> > since this is in C, I was just going to use the already built edbrowse modules.
> 
> Reviewing the various undefines, most are global variables,
> which can of course be moved, and then perhaps just a couple
> functions to move or stub.
> Here is what I propose, and I would be happy to lay this groundwork if you like.
> Move a few functions and variables around,
> and perhaps stub one or two in jseng-moz.cpp,
> so that edbrowse and edbrowse-js share stringfile.o url.o messages.o.
> This seems to be the best place to draw the circle around.
> Of course I'd need separate make rules to compile these three sources
> stringfile++.o url++.o messages++.o
> when folding them into the c++ program jseng-moz.
> I haven't used make in this way before but I'm sure it can be done.
> I can work on this in preparation for your work on different js engines,
> or I can just wait to see what comes of your duktape research.
> As you wish.

I'd say go for it, maybe make a separate header file for these (included in
eb.h of course). For the c++ versions I'd be tempted to leave jseng-moz.cpp as
is at the moment. If duktape fails we can revisit this.

Regards,
Adam.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-05-14  7:03 UTC | newest]

Thread overview: 12+ messages
2015-05-10 23:37 [Edbrowse-dev] V 3.5.4 Karl Dahlke
2015-05-11  0:37 ` chris
2015-05-11  0:54   ` Karl Dahlke
2015-05-11 22:41   ` Adam Thompson
2015-05-11 23:23     ` Karl Dahlke
2015-05-12  7:32       ` Adam Thompson
2015-05-12  9:15         ` Karl Dahlke
2015-05-13  7:17           ` Adam Thompson
2015-05-13  8:16             ` Karl Dahlke
2015-05-13 13:15             ` Karl Dahlke
2015-05-14  7:03               ` Adam Thompson
2015-05-12  8:07       ` chris
