edbrowse-dev - development list for edbrowse
* [Edbrowse-dev] Rebundle
@ 2015-12-22 23:13 Karl Dahlke
  2015-12-23 23:17 ` Kevin Carhart
  0 siblings, 1 reply; 13+ messages in thread
From: Karl Dahlke @ 2015-12-22 23:13 UTC (permalink / raw)
  To: Edbrowse-dev

Ok - the latest commit rebundles the files so that
http.c and auth.c are now part of the common library,
shared between the two processes.
We can now use our existing machinery to make http requests
from inside javascript objects - at least I think we can.

Kevin, please have a look and see what it would take to call httpConnect.
In other words, you should not have to use curl directly any more.
(Though we do have to initialize curl, and the http curl handle,
which I forgot to do.)

This is one of those commits where a diff won't help you.
It looks like I added 1000 lines and deleted 900,
but really I just moved some functions and variables around
to make both processes load correctly.
It's not as monstrous a change as it looks.

Both processes read the config file in exactly the same way.
With this working, I removed one of the arguments passed into edbrowse-js.
arg3 was the size of the js pool, which edbrowse used to pass to edbrowse-js,
but there is no need, because edbrowse-js now reads the same information.
And the same will be true of all your proxy servers, and the novs sites where you don't
want certificate verification, and the CA file, and other such things.

But it's not quite that simple.
There are a couple of dynamic changes made in edbrowse
which will have to pass to edbrowse-js,
probably by some new messages.

vs: disable certificate verification

ua: change the user agent

sr: send referrer

401 authorization: establish a password for a url.
You don't want to have to type in your password twice,
once for each process.

These are the only ones I can think of right now.
It would be sweet if there was a shared memory block for this stuff,
and no need to pass messages when these change,
but shared memory is not available in windows,
so I don't want to go there.
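
A rough sketch of what such messages could look like, assuming a small tagged record written down the existing pipe between the two processes. The enum values, the 3-byte header, and the names encode_setting() and decode_setting() are all hypothetical; nothing here is edbrowse's actual protocol.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical message kinds, one per dynamic setting listed above. */
enum setting_kind { SET_VS = 1, SET_UA, SET_SR, SET_AUTH };

/* Hypothetical wire layout: 1 byte kind, 2 bytes length, then the value. */
#define SETTING_HDR 3

/* Encode a setting change into buf.
 * Returns the number of bytes written, or 0 if it does not fit. */
size_t encode_setting(enum setting_kind kind, const char *value,
		      unsigned char *buf, size_t bufsize)
{
	size_t len = strlen(value);
	if (len > 0xffff || SETTING_HDR + len > bufsize)
		return 0;
	buf[0] = (unsigned char)kind;
	buf[1] = (unsigned char)(len >> 8);
	buf[2] = (unsigned char)(len & 0xff);
	memcpy(buf + SETTING_HDR, value, len);
	return SETTING_HDR + len;
}

/* Decode one message and copy the value, null terminated, into out.
 * Returns the kind, or 0 if the message is malformed or out is too small. */
int decode_setting(const unsigned char *buf, size_t n,
		   char *out, size_t outsize)
{
	size_t len;
	if (n < SETTING_HDR)
		return 0;
	len = ((size_t)buf[1] << 8) | buf[2];
	if (SETTING_HDR + len > n || len + 1 > outsize)
		return 0;
	memcpy(out, buf + SETTING_HDR, len);
	out[len] = '\0';
	return buf[0];
}
```

With something like this, edbrowse would send one record each time vs, ua, sr, or a stored password changes, and edbrowse-js would apply it before its next http request.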

I changed CMakeLists.txt but didn't test that.
I added -lcurl to the js process in the makefile,
but it looks like I didn't need to do that in cmake.
It looks like one list of libraries is made with everything
and then applied to both processes, if I'm reading it right.
Well I'm sure someone will let me know.

Enjoy.

Karl Dahlke


* Re: [Edbrowse-dev] Rebundle
  2015-12-22 23:13 [Edbrowse-dev] Rebundle Karl Dahlke
@ 2015-12-23 23:17 ` Kevin Carhart
  2015-12-24  0:01   ` Kevin Carhart
  0 siblings, 1 reply; 13+ messages in thread
From: Kevin Carhart @ 2015-12-23 23:17 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev



> Kevin, please have a look and see what it would take to call httpConnect.

This is great - I'm in the thick of it now and should find something
out today.

> discussion about going back to one process

I'm reading all of this with interest though I probably won't
get involved in architectural fundamentals.

Kevin



* Re: [Edbrowse-dev] Rebundle
  2015-12-23 23:17 ` Kevin Carhart
@ 2015-12-24  0:01   ` Kevin Carhart
  2015-12-24  0:04     ` Kevin Carhart
  0 siblings, 1 reply; 13+ messages in thread
From: Kevin Carhart @ 2015-12-24  0:01 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev




curl_easy_init() at the top of httpConnect.  So then would I also need to 
add a call to the cleanup at the bottom of httpConnect?

Here's what I have for the native implementation of a javascript function. 
It currently succeeds in returning the response to javascript, but then 
edbrowse segfaults on quit.

static JSBool fetchAsynchronous(JSContext * cx, unsigned int argc, jsval * vp)
{
	JS::RootedString str(cx);
	JS::RootedString headers(cx);
	JS::RootedString payload(cx);
	JS::CallArgs args = JS::CallArgsFromVp(argc, vp);
	str = JS_ValueToString(cx, args[0]);
	char *curl_url = JS_c_str(str);
	headers = JS_ValueToString(cx, args[1]);
	char *curl_headers = JS_c_str(headers);
	// url, headers, payload
	payload = JS_ValueToString(cx, args[2]);
	char *post = JS_c_str(payload);
	char *javatext;
	if (httpConnect(curl_url, false, false)) {
		if (hcode == 200) {
			javatext = serverData;
			prepareForBrowse(javatext, serverDataLen);
		}
	}
	args.rval().set(STRING_TO_JSVAL(JS_NewStringCopyZ(cx, javatext)));
}




* Re: [Edbrowse-dev] Rebundle
  2015-12-24  0:01   ` Kevin Carhart
@ 2015-12-24  0:04     ` Kevin Carhart
  2015-12-24  0:17       ` Karl Dahlke
  2015-12-25 15:10       ` Karl Dahlke
  0 siblings, 2 replies; 13+ messages in thread
From: Kevin Carhart @ 2015-12-24  0:04 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev



Oops, sorry, I clobbered my first line in that message.

You mentioned that the curl handle isn't initialized, so I added a 
curl_easy_init() at the top of httpConnect.  So then would I also need to 
add a call to the cleanup at the bottom of httpConnect?

Here's what I have for the native implementation of a javascript function. 
It currently succeeds in returning the response to javascript, but then 
edbrowse segfaults on quit.

static JSBool fetchAsynchronous(JSContext * cx, unsigned int argc, jsval * vp)
{
	JS::RootedString str(cx);
	JS::RootedString headers(cx);
	JS::RootedString payload(cx);
	JS::CallArgs args = JS::CallArgsFromVp(argc, vp);
	str = JS_ValueToString(cx, args[0]);
	char *curl_url = JS_c_str(str);
	headers = JS_ValueToString(cx, args[1]);
	char *curl_headers = JS_c_str(headers);
	// url, headers, payload
	payload = JS_ValueToString(cx, args[2]);
	char *post = JS_c_str(payload);
	char *javatext;
	if (httpConnect(curl_url, false, false)) {
		if (hcode == 200) {
			javatext = serverData;
			prepareForBrowse(javatext, serverDataLen);
		}
	}
	args.rval().set(STRING_TO_JSVAL(JS_NewStringCopyZ(cx, javatext)));
}




* [Edbrowse-dev]  Rebundle
  2015-12-24  0:04     ` Kevin Carhart
@ 2015-12-24  0:17       ` Karl Dahlke
  2015-12-24  0:46         ` Kevin Carhart
  2015-12-25 15:10       ` Karl Dahlke
  1 sibling, 1 reply; 13+ messages in thread
From: Karl Dahlke @ 2015-12-24  0:17 UTC (permalink / raw)
  To: Edbrowse-dev

Kevin -
You don't want to change httpConnect() in any way, if you can help it.

I was noticing that main, down the line, calls

	cookiesFromJar();
	http_curl_init();

This is after you jump away in the js process,
so these two lines have to move up before js_main() is called,
or you'd have to replicate them in js_main().
I'd be happy to push a small change that moves them up.
If it still seg faults after that
then we'll have to dig and delve.
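
The reordering described above can be sketched as follows. This is only an illustration with stub functions: the real cookiesFromJar(), http_curl_init(), and js_main() have their own signatures and bodies, and startup() and startup_log() are hypothetical names standing in for the relevant part of main().

```c
#include <stdbool.h>
#include <string.h>

/* Stand-ins for the real edbrowse routines; these stubs only record call order. */
static char call_log[64];

static void note(const char *name)
{
	strcat(call_log, name);
	strcat(call_log, ";");
}

void cookiesFromJar(void) { note("cookies"); }
void http_curl_init(void) { note("curl"); }
int js_main(void) { note("js"); return 0; }

/* Lets a caller inspect the order in which the stubs ran. */
const char *startup_log(void) { return call_log; }

/* Hypothetical startup: shared setup runs before the js process branches off. */
int startup(bool is_js_process)
{
	call_log[0] = '\0';
	cookiesFromJar();
	http_curl_init();
	if (is_js_process)
		return js_main();	/* the js process never reaches the code below */
	note("edbrowse");
	return 0;
}
```

The point is only that both setup calls sit above the branch, so either process has cookies and a curl handle before it does any http work.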

Karl Dahlke


* Re: [Edbrowse-dev] Rebundle
  2015-12-24  0:17       ` Karl Dahlke
@ 2015-12-24  0:46         ` Kevin Carhart
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Carhart @ 2015-12-24  0:46 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev



Oh yes, that seems to fix the seg fault.

> I'd be happy to push a small change that moves them up.

Great.  Thank you.
Kevin


* [Edbrowse-dev]  Rebundle
  2015-12-24  0:04     ` Kevin Carhart
  2015-12-24  0:17       ` Karl Dahlke
@ 2015-12-25 15:10       ` Karl Dahlke
  2015-12-26  8:44         ` [Edbrowse-dev] Rebundle / One program # processes Kevin Carhart
  1 sibling, 1 reply; 13+ messages in thread
From: Karl Dahlke @ 2015-12-25 15:10 UTC (permalink / raw)
  To: Edbrowse-dev

> Here's what I have for the native implementation of a javascript function.

Kevin I finally got round to looking at this.

You should free the strings created by JS_c_str() (memory leak).
You may also have to free serverData after it is converted to a js string.
It too is allocated.
javatext should be initialized to null else
the last line will seg fault when httpConnect fails.
Not clear what happens for other error codes besides 200.
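
A minimal sketch of the ownership pattern these review points describe, written in plain C without the SpiderMonkey calls. The globals hcode, serverData, and serverDataLen and the function httpConnect() are replaced here by stand-ins so the fragment compiles on its own; the parameter names arg2 and arg3 and the helper fetch_body() are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the edbrowse globals; the real ones live in http.c. */
int hcode;
char *serverData;
int serverDataLen;

/* Stub for httpConnect(): "succeeds" only for urls that start with http,
 * and returns an allocated body in serverData, as the thread says the
 * real function does. */
bool httpConnect(const char *url, bool arg2, bool arg3)
{
	static const char page[] = "hello";
	(void)arg2;
	(void)arg3;
	hcode = 0;
	serverData = NULL;
	serverDataLen = 0;
	if (strncmp(url, "http", 4) != 0)
		return false;
	hcode = 200;
	serverDataLen = (int)strlen(page);
	serverData = malloc((size_t)serverDataLen + 1);
	if (!serverData)
		return false;
	memcpy(serverData, page, (size_t)serverDataLen + 1);
	return true;
}

/* Returns an allocated copy of the body, or NULL on failure; caller frees. */
char *fetch_body(const char *url)
{
	char *body = NULL;	/* initialized, so the failure path is well defined */
	if (httpConnect(url, false, false)) {
		if (hcode == 200 && serverData) {
			body = malloc((size_t)serverDataLen + 1);
			if (body) {
				memcpy(body, serverData, (size_t)serverDataLen);
				body[serverDataLen] = '\0';
			}
		}
		free(serverData);	/* allocated by httpConnect, ours to release */
		serverData = NULL;
	}
	return body;
}
```

In the js binding, the copy step would be the JS_NewStringCopyZ() call, and a NULL result would need its own handling before that call. The C strings obtained from JS_c_str() would be freed the same way once the request is done.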

That said, it's a great and simple implementation of
a synchronous http call.
And with my latest commits,
it will use the proxies and agent and cookie jar in the config file.
But it will not use transient or new cookies, like session cookies,
that might be produced by the initial web page fetch from edbrowse,
and it won't until we merge things together and there is but one instance of curl running.

Next question is could we spin off the http in the background,
which edbrowse indeed can do,
and how that would fit together,
and I'll think about that one,
but meantime I do think some is better than none,
and it might be worth implementing this synchronous routine,
in 30 lines of code, so more websites work,
and then work on the asynchronous model later.
But that's just my opinion / philosophy,
which is not always right.
A website indeed could work, that didn't work before,
but you may have to wait a few seconds for the http to return serially -
well I'd make that tradeoff for now.

anyways if you want to send me a revised function, or patch, or whatever...

Karl Dahlke


* [Edbrowse-dev] Rebundle / One program # processes
  2015-12-25 15:10       ` Karl Dahlke
@ 2015-12-26  8:44         ` Kevin Carhart
  2015-12-26 15:56           ` Karl Dahlke
  0 siblings, 1 reply; 13+ messages in thread
From: Kevin Carhart @ 2015-12-26  8:44 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev



Hi group,
To those who celebrate it, merry christmas,

Okay, here is my reply to all the messages from today.
Thank you for reviewing my code, Karl, and catching my memory oversights.
I could revise this now, but I'm inclined to wait a while, because
it sounds like you might be about to supersede it with increments
in the right direction, on a solid foundation.  And this sounded
like a good reason not to use S-JAX:

> Also, there are several well established design patterns which will
> completely kill edbrowse with the new synchronous http.

So maybe the zip of S-JAX code was predominantly a proof of
concept to get the wheels going around, and it doesn't
bother me if it goes away.  It sounds as though,
if time permits, you're about to supersede it by addressing
some of the deeper questions.  We're now in one of the deadest
weeks of the year, so are you guys likely to do a lot over
the next several days?   And if so, I'll hold off for a bit
on the S-JAX, because I think it may become a moot point.
Which is very exciting by the way!

Kevin







--------
Kevin Carhart * 415 225 5306 * The Ten Ninety Nihilists


* [Edbrowse-dev] Rebundle / One program # processes
  2015-12-26  8:44         ` [Edbrowse-dev] Rebundle / One program # processes Kevin Carhart
@ 2015-12-26 15:56           ` Karl Dahlke
  2015-12-27  7:54             ` Adam Thompson
  0 siblings, 1 reply; 13+ messages in thread
From: Karl Dahlke @ 2015-12-26 15:56 UTC (permalink / raw)
  To: Edbrowse-dev

> I could revise this now, but I'm inclined to wait a while, because
> it sounds like you might be about to supersede it with increments

Oh I don't know, your submissions are always helpful.
And our revisions may well be transparent to your work.
In other words, you call httpConnect() either way,
and it might call curl locally, as it does now,
or it may send a request to a curl server and give you the response,
but the function call is likely to be the same anyways.
So I'd say press on; the only question is whether we paste it in
or wait, whether it adds value as it is in its synchronous mode,
whether it accesses more websites or causes unexpected trouble, or both.
It's a judgment call.
Usually I say "put it in", but that's me.
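
To illustrate why the caller need not care which way the request travels, here is a small sketch that hides the choice behind one entry point. Every name in it (fetch_backend, fetch_local, fetch_via_server, use_backend, http_fetch) is hypothetical, and the two backends are stubs that only count calls.

```c
#include <stdbool.h>

/* Hypothetical backend signature: fetch a url and report success. */
typedef bool (*fetch_backend)(const char *url);

/* Call counters, so the stubs have an observable effect. */
int local_calls;
int server_calls;

/* Stub: stands in for calling curl inside this process. */
bool fetch_local(const char *url)
{
	(void)url;
	local_calls++;
	return true;
}

/* Stub: stands in for sending the request to a separate curl server. */
bool fetch_via_server(const char *url)
{
	(void)url;
	server_calls++;
	return true;
}

static fetch_backend backend = fetch_local;

/* Switch the transport without touching any caller. */
void use_backend(fetch_backend b)
{
	backend = b;
}

/* Callers use this one entry point either way. */
bool http_fetch(const char *url)
{
	return backend(url);
}
```

Code written against http_fetch() today keeps working if the backend is later swapped, which is the sense in which the revisions would be transparent.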

Karl Dahlke


* Re: [Edbrowse-dev] Rebundle / One program # processes
  2015-12-26 15:56           ` Karl Dahlke
@ 2015-12-27  7:54             ` Adam Thompson
  2015-12-27  9:18               ` [Edbrowse-dev] response headers and body? Kevin Carhart
  0 siblings, 1 reply; 13+ messages in thread
From: Adam Thompson @ 2015-12-27  7:54 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev


On Sat, Dec 26, 2015 at 10:56:03AM -0500, Karl Dahlke wrote:
> > I could revise this now, but I'm inclined to wait a while, because
> > it sounds like you might be about to supersede it with increments
> 
> Oh I don't know, your submissions are always helpful.
> And our revisions may well be transparent to your work.
> In other words, you call httpConnect() either way,
> and it might call curl locally, as it does now,
> or it may send a request to a curl server and give you the response,
> but the function call is likely to be the same anyways.

Agreed.

> So I'd say press on; the only question is whether we paste it in
> or wait, whether it adds value as it is in its synchronous mode,
> whether it accesses more websites or causes unexpected trouble, or both.
> It's a judgment call.
> Usually I say "put it in", but that's me.

How about a toggle? Karl could that work?
Like an xhr command to toggle whether xhr actually made connections or not.
If off it'd print a warning informing the user it's disabled and do...
something... preferably to not break js but to tell it that we don't have fully
working xhr support. I'd suggest we design such a toggle to be a permanent
feature since I for one would like to be able to switch this off in certain cases.

Cheers,
Adam.



* [Edbrowse-dev] response headers and body?
  2015-12-27  7:54             ` Adam Thompson
@ 2015-12-27  9:18               ` Kevin Carhart
  2015-12-27 13:38                 ` Karl Dahlke
  0 siblings, 1 reply; 13+ messages in thread
From: Kevin Carhart @ 2015-12-27  9:18 UTC (permalink / raw)
  To: Adam Thompson; +Cc: Karl Dahlke, Edbrowse-dev



>> Oh I don't know, your submissions are always helpful.
> Agreed.

Great.  Thank you, in that case I'm working on this now
but I noticed that
when prepareForBrowse populates the string 'javatext' after
a successful '200 OK' response, this is just the page body.
Is there a way to return the entire 'raw' response with
headers on the beginning?  I think the idiom is that the
xhr object in JS will have
blah.responseHeaders
blah.responseText
and maybe
blah.responseXML
etc

And some calling site's code might then switch on the
response header values, or grab the Set-Cookie:, or something.

I'll remove my hardcoded test for hcode == 200 and assuming
that both headers & body can be returned to JS, just deal
with carving it up there.
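
For reference, the carving itself is small wherever it ends up being done. Below is a sketch in C, assuming the raw response arrives as one null-terminated string with headers first; split_response() is a hypothetical helper, not an edbrowse function. A binary body with embedded null bytes would need a length-based search instead of strstr().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find where the body starts in a raw http response.
 * Returns a pointer to the first body byte and stores the header length
 * (excluding the blank line) in *hdr_len.
 * Returns NULL if there is no blank line separating headers from body. */
const char *split_response(const char *raw, size_t *hdr_len)
{
	const char *sep = strstr(raw, "\r\n\r\n");
	if (!sep)
		return NULL;
	*hdr_len = (size_t)(sep - raw);
	return sep + 4;
}
```

The header block found this way could then be handed to javascript alongside the body, so that calling code can inspect Set-Cookie or other header values.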

Adam said,
> How about a toggle? Karl could that work?
> Like an xhr command to toggle whether xhr actually made connections or not.

Sounds like a good idea- this would be a boolean similar to
other configurables, right?  So if I carry on with this routine
I'm doing, then later it could be wrapped in:
if (xhr)
{
}

Is that OK?

thanks
Kevin


* [Edbrowse-dev] response headers and body?
  2015-12-27  9:18               ` [Edbrowse-dev] response headers and body? Kevin Carhart
@ 2015-12-27 13:38                 ` Karl Dahlke
  2015-12-27 21:03                   ` Kevin Carhart
  0 siblings, 1 reply; 13+ messages in thread
From: Karl Dahlke @ 2015-12-27 13:38 UTC (permalink / raw)
  To: Edbrowse-dev

> Is there a way to return the entire 'raw' response with
> headers on the beginning?

No but there could be easily, or better,
return both the headers and the body.
Headers are captured in the string http_headers in http.c,
we just need to return it to the caller.
I'll look at this later; after Chris finishes his work in http.c.
I don't want to collide.
Let's just say we can easily do this.

> it could be wrapped in: if (xhr)

sure, and probably in the else clause simulate not being able
to connect to the website and let js march on and do whatever it would
have done in that case.
I mean, sometimes you can't get to a site, especially a different one
from the site you just fetched.
Google analytics or some such.


Karl Dahlke


* Re: [Edbrowse-dev] response headers and body?
  2015-12-27 13:38                 ` Karl Dahlke
@ 2015-12-27 21:03                   ` Kevin Carhart
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Carhart @ 2015-12-27 21:03 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev



> Karl mentions allowXHR


Very cool!  Thank you.


>> Is there a way to return the entire 'raw' response with
>> headers on the beginning?
>
> No but there could be easily, or better,
> return both the headers and the body.

Sounds great.


>> it could be wrapped in: if (xhr)
>
> sure, and probably in the else clause simulate not being able
> to connect to the website and let js march on and do whatever it would
> have done in that case.

Yeah, like a try-catch so it won't bail out altogether.



Kevin
