edbrowse-dev - development list for edbrowse
* [Edbrowse-dev] gc
@ 2014-01-04 22:14 Karl Dahlke
  2014-01-04 22:42 ` Chris Brannon
  2014-01-04 22:52 ` Adam Thompson
  0 siblings, 2 replies; 6+ messages in thread
From: Karl Dahlke @ 2014-01-04 22:14 UTC
  To: Edbrowse-dev

I have always wondered about gc in c++.
It cannot be easy and straightforward like it is in java.
(One reason I was always afraid of c++)
So possibly void * won't work, like you have to tell the compiler
that it's a pointer to a certain object of a certain class,
for c++ to keep it around.
Or - maybe you have to explicitly set void * x = object * o
so that when it crosses the equals sign it tells gc
that it is off somewhere else and should not be deleted.
Unfortunately this is advanced stuff that won't be in my first line tutorial.
And then moz js may have its own internal gc.
That would be stupid, to reinvent what c++ has already done, but who knows.
We need to become not just competent in this stuff, but near experts.

Karl Dahlke

* Re: [Edbrowse-dev] gc
  2014-01-04 22:14 [Edbrowse-dev] gc Karl Dahlke
@ 2014-01-04 22:42 ` Chris Brannon
  2014-01-04 22:52 ` Adam Thompson
  1 sibling, 0 replies; 6+ messages in thread
From: Chris Brannon @ 2014-01-04 22:42 UTC
  To: Edbrowse-dev

Karl Dahlke <eklhad@comcast.net> writes:

> I have always wondered about gc in c++.

C++ doesn't have true GC.  What it has is deterministic destruction of
objects.  You can use objects to manage resources (such as heap), that
will be reclaimed when the object goes out of scope or is destroyed via
the "delete" operator.
In C++, it's called RAII, short for "resource acquisition is
initialization".  Turns out that you can do some really fancy stuff like
ref-counted smart pointers that free the associated memory when the last
referencing object goes away.  This isn't true GC, but it gets you some
of the benefits.
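
To make that concrete, here is a minimal sketch using only the standard
library. The Tracked type and its live counter are made up for illustration;
std::shared_ptr is the standard ref-counted smart pointer. It is only meant to
show when destructors run, not how edbrowse should manage memory.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource type: counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { assert(live > 0); --live; }
    Tracked(const Tracked &) = delete;
    Tracked &operator=(const Tracked &) = delete;
};
int Tracked::live = 0;

// Scope-based cleanup: the destructor runs when the block ends.
bool raii_demo() {
    int seen_inside = 0;
    {
        Tracked t;                   // constructed on entry to the block
        seen_inside = Tracked::live;
    }                                // destroyed here, deterministically
    return seen_inside == 1 && Tracked::live == 0;
}

// Ref counting: the object is freed when the last owner goes away.
bool shared_demo() {
    auto a = std::make_shared<Tracked>();
    long owners_with_two = 0;
    {
        std::shared_ptr<Tracked> b = a;   // second owner
        owners_with_two = a.use_count();
    }                                     // b is gone, a still owns the object
    int alive_after_b = Tracked::live;
    a.reset();                            // last owner released, object freed
    return owners_with_two == 2 && alive_after_b == 1 && Tracked::live == 0;
}
```

Both functions return true when the cleanup happened at the points marked in
the comments.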

-- Chris

* Re: [Edbrowse-dev] gc
  2014-01-04 22:14 [Edbrowse-dev] gc Karl Dahlke
  2014-01-04 22:42 ` Chris Brannon
@ 2014-01-04 22:52 ` Adam Thompson
  1 sibling, 0 replies; 6+ messages in thread
From: Adam Thompson @ 2014-01-04 22:52 UTC
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Sat, Jan 04, 2014 at 05:14:14PM -0500, Karl Dahlke wrote:
> I have always wondered about gc in c++.
> It cannot be easy and straightforward like it is in java.
> (One reason I was always afraid of c++)
> So possibly void * won't work, like you have to tell the compiler
> that it's a pointer to a certain object of a certain class,
> for c++ to keep it around.

No, c++ doesn't have its own GC, but it does have the concept of object
constructors and destructors. If I understand the mozilla api correctly,
they basically use these to hook into their javascript GC system such that when a
RootedObject (I think I've got the type name correct)
is constructed it tells the javascript GC not to collect the object pointed to
by the RootedObject instance, and when the aforementioned instance goes out of
scope or is destroyed, its destructor tells the javascript GC that the
RootedObject instance is no longer alive and thus,
if all references to the javascript object are destroyed,
the javascript object should be collected.
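
A minimal sketch of that constructor/destructor hook, assuming a toy collector
that frees anything not rooted. ToyGC and ToyRooted are made-up names; this is
not the SpiderMonkey API, only the shape of the idea as I understand it.

```cpp
#include <cassert>
#include <map>
#include <set>

// Toy stand-in for a script engine's collector; NOT the SpiderMonkey API.
using ObjId = int;

class ToyGC {
public:
    ObjId create(int payload) { heap_[next_] = payload; return next_++; }
    void add_root(ObjId id) { roots_.insert(id); }
    void remove_root(ObjId id) {
        auto it = roots_.find(id);
        if (it != roots_.end()) roots_.erase(it);   // drop one rooting only
    }
    // Frees every object that no root refers to.
    void collect() {
        for (auto it = heap_.begin(); it != heap_.end();) {
            if (roots_.count(it->first) == 0) it = heap_.erase(it);
            else ++it;
        }
    }
    bool alive(ObjId id) const { return heap_.count(id) != 0; }
private:
    std::map<ObjId, int> heap_;      // object id -> payload
    std::multiset<ObjId> roots_;     // ids the host has asked to keep
    ObjId next_ = 1;
};

// RAII handle: roots on construction, unroots on destruction.
class ToyRooted {
public:
    ToyRooted(ToyGC &gc, ObjId id) : gc_(gc), id_(id) {
        assert(gc_.alive(id_));      // rooting a freed object would be a bug
        gc_.add_root(id_);
    }
    ~ToyRooted() { gc_.remove_root(id_); }
    ToyRooted(const ToyRooted &) = delete;
    ToyRooted &operator=(const ToyRooted &) = delete;
    ObjId get() const { return id_; }
private:
    ToyGC &gc_;
    ObjId id_;
};

// True if the object survived a collection while rooted and was freed after.
bool rooted_demo() {
    ToyGC gc;
    ObjId id = gc.create(42);
    bool survived = false;
    {
        ToyRooted keep(gc, id);
        gc.collect();
        survived = gc.alive(id);     // rooted, so still there
    }
    gc.collect();                    // handle gone, nothing roots the object
    return survived && !gc.alive(id);
}
```

An object created without any ToyRooted around it is gone after the next
collect(), which is the toy version of the problem described below.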

The problem for us is that we currently don't construct any of these
RootedObject instances, which means that the SpiderMonkey internal GC doesn't
know we want to keep anything we create.
This basically means (I think) that our objects are collected very soon after they're created, causing the segfault problem.
Another important note is that this also applies to javascript values (strings etc) and a couple of other things.

As I see it, either we start keeping pointers to these GC constructs (in void *
should be fine) or we add a further layer of abstraction to basically make a
more c-like api. Something like an explicit registration and unregistration
setup, with *all* the javascript stuff being in a single void * (runtime,
context list etc).
I think the first option may be the easier short term fix, though it's going to lead to a lot of typecasting.
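
A rough sketch of what the second option could look like, with everything
hidden behind one opaque void *. All names here (EngineState, ebjs_open,
ebjs_register and so on) are hypothetical, and plain strings stand in for
whatever engine objects would really be kept alive.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical engine state; in a real layer this would hold the runtime,
// the contexts and the rooted engine objects. Strings stand in for those here.
struct EngineState {
    std::map<int, std::string> kept;   // handle -> what is being kept alive
    int next_handle = 1;
};

extern "C" {

// Create the state and hand it out as an opaque pointer.
void *ebjs_open(void) { return new EngineState; }

// Destroy the state; anything still registered goes with it.
void ebjs_close(void *state) { delete static_cast<EngineState *>(state); }

// Register something to keep; returns a handle for later unregistration.
int ebjs_register(void *state, const char *what) {
    assert(state != nullptr && what != nullptr);
    EngineState *s = static_cast<EngineState *>(state);
    s->kept[s->next_handle] = what;
    return s->next_handle++;
}

// Returns 1 if the handle was registered and is now released, 0 otherwise.
int ebjs_unregister(void *state, int handle) {
    EngineState *s = static_cast<EngineState *>(state);
    return s->kept.erase(handle) ? 1 : 0;
}

// Number of things currently registered.
int ebjs_count(void *state) {
    return static_cast<int>(static_cast<EngineState *>(state)->kept.size());
}

}  // extern "C"
```

The C side would only ever see the void * and the integer handles, so the
typecasting stays inside this one file.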

Cheers,
Adam.

* Re: [Edbrowse-dev] gc
  2014-01-05 13:46 ` Adam Thompson
@ 2014-01-05 19:52   ` Adam Thompson
  0 siblings, 0 replies; 6+ messages in thread
From: Adam Thompson @ 2014-01-05 19:52 UTC
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Sun, Jan 05, 2014 at 01:46:22PM +0000, Adam Thompson wrote:
> On Sun, Jan 05, 2014 at 02:30:44AM -0500, Karl Dahlke wrote:
> > > The problem for us is that we currently don't construct any of these
> > > RootedObject instances, which means that the SpiderMonkey internal GC
> > 
> > Really?
> > My layer calls, for example, JS_ConstructObjectWithArguments()
> > to make a new object,
> > I would suppose you replace that with some kind of js new call,
> > which implicitly or explicitly creates a new object in c++,
> > which calls the constructor you described,
> > and I would figure that's good enough to keep it around,
> > > until we remove it, which used to be some kind of js_free,
> > now some kind of js_destroy,
> > and then the gc can clean up the loose ends,
> > and I'm sorry in advance if I'm oversimplifying it,
> > because I haven't looked at any of your code or how it works;
> > I just didn't expect a problem here.
> > Allocate becomes new construct, and all should be well.
> 
> I'm guessing this wasn't an issue before since you had to explicitly call the
> GC so you had time to insert the references into the environment (i.e.
> make jwin the global object), however now the GC seems to be run as part of most operations.
> 
> To allow this to work they now have the rooting api,
> which as far as I can work out, provides objects which are like the "smart"
> pointers Chris was talking about. Instead of just freeing the object however,
> these ones hook into the GC for the javascript environment allowing objects to
> stay around in the environment until all references,
> including those in the host app, are gone.

Ok, on closer inspection it looks like we've been living dangerously for a
while as JS_Add*Root functions are present in smjs 185 as well.
The good news is I discovered this whilst looking for a non-c++ alternative to
the smart pointers approach, and the fact that it still exists and is still
supported (though is only encouraged when absolutely necessary)
means that I can save myself a whole bunch of work.
The bad news is that there's been a set of GC-related bugs there for, well,
however long edbrowse's been using SpiderMonkey I guess.
This supports my suspicions that the GC is probably behind many of the js
related segfaults since we really should've been telling it about most of the
js stuff we do.

This is more of an issue now because they've apparently improved their GC to
make it much more memory efficient.

Cheers,
Adam.

* Re: [Edbrowse-dev] gc
  2014-01-05  7:30 Karl Dahlke
@ 2014-01-05 13:46 ` Adam Thompson
  2014-01-05 19:52   ` Adam Thompson
  0 siblings, 1 reply; 6+ messages in thread
From: Adam Thompson @ 2014-01-05 13:46 UTC
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Sun, Jan 05, 2014 at 02:30:44AM -0500, Karl Dahlke wrote:
> > The problem for us is that we currently don't construct any of these
> > RootedObject instances, which means that the SpiderMonkey internal GC
> 
> Really?
> My layer calls, for example, JS_ConstructObjectWithArguments()
> to make a new object,
> I would suppose you replace that with some kind of js new call,
> which implicitly or explicitly creates a new object in c++,
> which calls the constructor you described,
> and I would figure that's good enough to keep it around,
> until we remove it, which used to be some kind of js_free,
> now some kind of js_destroy,
> and then the gc can clean up the loose ends,
> and I'm sorry in advance if I'm oversimplifying it,
> because I haven't looked at any of your code or how it works;
> I just didn't expect a problem here.
> Allocate becomes new construct, and all should be well.

Not quite.  My understanding of the problem is this:
Garbage collectors work by removing values (objects, strings, numbers etc) which,
within the world of the language they're garbage collecting (javascript in our
case) are no longer reachable (no references exist to them).
Depending on the kind of GC used, they will do this every time a language
operation is performed or periodically.
When embedding a language into an application (like we are with javascript)
the garbage collector only knows about the world inside the language environment (i.e.
what objects have been created by javascript etc),
and not that of the embedding (host) application.
To allow the host app to create custom objects within the embedded language
environment, an api needs to exist to tell the garbage collector that the object
which has just appeared within its environment is actually referenced in the
outside world and is not left over from, for example (excuse my rusty js syntax):
var x = new Object();
x = new Object(); /* the previous value of x is now garbage */

What we're currently doing is constructing objects within the javascript
environment but failing to tell the garbage collector that they're actually
referenced in the outside world, thus they seem to get collected almost immediately.
We then try to use these objects (either by defining references to them within the
javascript environment or initialising standard classes within them or whatever),
however they've already been freed, hence the segfaults.

The key thing to note from the above is that the GC knows nothing about
pointers etc within the host application (edbrowse)
unless explicitly told about them. Calling object construction methods (JS_New
etc) only creates an object within the javascript environment and returns a
pointer to it so that the host app can act on it.
It does not create a reference within the javascript environment.
It's sort of like the line:
new Object();

Rather than:
var ref = new Object();
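
The same point as a small sketch: a toy mark-and-sweep heap where only things
reachable from the global object survive a collection. ToyHeap and its methods
are invented for illustration and say nothing about how SpiderMonkey is
actually implemented.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Toy mark-and-sweep heap; an illustration only.
class ToyHeap {
public:
    ToyHeap() : global_(alloc()) {}

    // Create an object. The caller gets an id, but the heap itself
    // records no reference to the new object.
    int alloc() { refs_[next_]; return next_++; }

    // Store a reference to `to` inside `from`.
    void link(int from, int to) {
        assert(alive(from) && alive(to));
        refs_.at(from).push_back(to);
    }

    int global() const { return global_; }
    bool alive(int id) const { return refs_.count(id) != 0; }

    void collect() {
        std::set<int> marked;
        std::vector<int> todo{global_};
        while (!todo.empty()) {              // mark: walk from the global
            int id = todo.back();
            todo.pop_back();
            if (!marked.insert(id).second) continue;   // already visited
            for (int child : refs_.at(id)) todo.push_back(child);
        }
        for (auto it = refs_.begin(); it != refs_.end();) {   // sweep the rest
            if (marked.count(it->first)) ++it;
            else it = refs_.erase(it);
        }
    }

private:
    std::map<int, std::vector<int>> refs_;   // object id -> ids it refers to
    int next_ = 1;
    int global_;                             // declared last: needs refs_ and next_
};

// True if the unreferenced object was freed and the linked one was kept.
bool reachability_demo() {
    ToyHeap heap;
    int lost = heap.alloc();             // host holds the id, heap has no reference
    int kept = heap.alloc();
    heap.link(heap.global(), kept);      // now reachable from the global object
    heap.collect();
    return !heap.alive(lost) && heap.alive(kept);
}
```

The id held in `lost` plays the role of the pointer edbrowse gets back from an
object construction call: the host still has it, but the collector never knew.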

I'm guessing this wasn't an issue before since you had to explicitly call the
GC so you had time to insert the references into the environment (i.e.
make jwin the global object), however now the GC seems to be run as part of most operations.

To allow this to work they now have the rooting api,
which as far as I can work out, provides objects which are like the "smart"
pointers Chris was talking about. Instead of just freeing the object however,
these ones hook into the GC for the javascript environment allowing objects to
stay around in the environment until all references,
including those in the host app, are gone.
Hope this helps somewhat, and apologies if it's confusing or you know it
already (or both).

Cheers,
Adam.

* [Edbrowse-dev]  gc
@ 2014-01-05  7:30 Karl Dahlke
  2014-01-05 13:46 ` Adam Thompson
  0 siblings, 1 reply; 6+ messages in thread
From: Karl Dahlke @ 2014-01-05  7:30 UTC
  To: Edbrowse-dev

> The problem for us is that we currently don't construct any of these
> RootedObject instances, which means that the SpiderMonkey internal GC

Really?
My layer calls, for example, JS_ConstructObjectWithArguments()
to make a new object,
I would suppose you replace that with some kind of js new call,
which implicitly or explicitly creates a new object in c++,
which calls the constructor you described,
and I would figure that's good enough to keep it around,
until we remove it, which used to be some kind of js_free,
now some kind of js_destroy,
and then the gc can clean up the loose ends,
and I'm sorry in advance if I'm oversimplifying it,
because I haven't looked at any of your code or how it works;
I just didn't expect a problem here.
Allocate becomes new construct, and all should be well.

Karl Dahlke
