GC refactor ahead

chromatic chromatic at wgz.org
Thu Nov 26 19:40:26 UTC 2009


On Thursday 26 November 2009 10:29:00 Geoffrey Broadwell wrote:

> On Thu, 2009-11-26 at 10:01 -0800, chromatic wrote:

> > ... we have to trace the entire live set on every GC run,
> > even if only 10% of the GCables created are live and 90% are short-lived.
> I am somewhat surprised by this.  My expectation (and it sounds like
> Patrick's as well) was that GC froth (items both created and destroyed
> between GC runs) would be removed from the "live set" before the GC run
> sees them -- and thus would not affect the execution time of the GC run.

How do you identify items that were once but now aren't live?

There are a couple of options.

1) refcounting
2) create all new GCables from young generation pools and ...
2a) use pervasive write barriers to mark the pool as containing GCables 
pointed to from an older generation, so you know they need marking
2b) use pervasive write barriers to promote GCables pointed to from an older 
generation to a generation not immediately collectable
3) trace the whole live set from the roots to all reachable objects

#1 is out.
We do #3.

We should move to a #2 system.

(I've left out copying and compacting because that's an implementation detail 
for reclaiming, not marking live/dead.)
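
To make option 2a concrete, here is a minimal sketch in Python (not Parrot 
code; Obj, store, remembered, and minor_collect are all hypothetical names 
invented for illustration). It assumes every reference store goes through 
store(), and it tracks individual young objects rather than whole pools to 
keep the example short.

```python
class Obj:
    def __init__(self, name, young=True):
        self.name = name
        self.young = young   # True while the object lives in the young generation
        self.refs = []       # outgoing references

# Young objects that something in the older generation points to.
remembered = set()

def store(owner, target):
    """Write barrier (hypothetical): record old-to-young references."""
    owner.refs.append(target)
    if not owner.young and target.young:
        remembered.add(target)

def minor_collect(young_pool, roots):
    """Mark only young objects, starting from young roots and the
    remembered set; the older generation is never traced."""
    live = set()
    stack = [o for o in roots if o.young] + list(remembered)
    while stack:
        obj = stack.pop()
        if obj in live or not obj.young:
            continue
        live.add(obj)
        stack.extend(obj.refs)
    return young_pool - live   # dead young objects

old = Obj("old", young=False)
kept, temp1, temp2 = Obj("kept"), Obj("temp1"), Obj("temp2")
store(old, kept)      # old -> young: remembered by the barrier
store(temp1, temp2)   # froth pointing at froth: nothing to remember
dead = minor_collect({kept, temp1, temp2}, roots=[old])
dead_names = sorted(o.name for o in dead)
```

The point of the sketch is that the collection visits only the young pool and 
the remembered set, so the cost no longer grows with the size of the older 
generation.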

> It sounds like instead you're saying that creating an item adds it to
> the live set, but dying does *not* remove it from the live set -- and
> that only a full GC run can do this removal.

Yes; that's how non-refcounting GC works.  You can always identify what's 
live.  You can identify what's dead by a set difference: everything you've 
allocated but haven't marked as live is dead.
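
As a rough illustration of that, here is a toy mark phase in Python. It is a 
sketch only: Obj, mark, and sweep are hypothetical names, and the heap is 
modelled as a plain set of objects, which is an assumption and not how Parrot 
stores GCables.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []   # outgoing references to other Obj instances

def mark(roots):
    """Return the set of objects reachable from the roots."""
    live = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in live:
            continue
        live.add(obj)
        stack.extend(obj.refs)
    return live

def sweep(heap, live):
    """Everything allocated but not marked live is dead."""
    return heap - live

a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)
c.refs.append(d)          # c and d are unreachable from the roots
heap = {a, b, c, d}
live = mark([a])          # work here is proportional to the live set
dead = sweep(heap, live)
live_names = sorted(o.name for o in live)
dead_names = sorted(o.name for o in dead)
```

Nothing in this scheme learns that c and d died at the moment they became 
unreachable; they only show up as dead once a full mark has finished.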

> That sounds less than awesome, performance wise.
>
> If the problem is that we don't know if the GCable has references to or
> from it, and need to do a full GC run to resolve this, then it would be
> nice if we could address this either through hinting or algorithmically.

We do need write barriers, but it's going to be a pain to add them to the 
entire system.  *Every place* that stores a STRING or a PMC has to use the 
appropriate macro or function call.

That includes all dynops, extension code, and custom PMCs.
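
Here is a small Python sketch of why a single missed store matters, using the 
promoting barrier from option 2b. All names here (Obj, promote, 
store_with_barrier, store_raw, minor_collect) are hypothetical and are not 
Parrot's macros or functions; store_raw stands in for any code that has not 
been converted to use the barrier.

```python
class Obj:
    def __init__(self, name, generation=0):
        self.name = name
        self.generation = generation   # 0 = young, 1 = older
        self.refs = []

def promote(obj):
    """Move obj and everything it references out of the young generation."""
    stack = [obj]
    while stack:
        o = stack.pop()
        if o.generation == 0:
            o.generation = 1
            stack.extend(o.refs)

def store_with_barrier(owner, target):
    """Converted code: the store goes through the write barrier."""
    owner.refs.append(target)
    if owner.generation > 0:
        promote(target)

def store_raw(owner, target):
    """Unconverted code: a bare store that skips the barrier."""
    owner.refs.append(target)

def minor_collect(pool, roots):
    """Reclaim young objects not reachable through young objects from the
    roots. Older objects are not traced, which is the whole saving."""
    live = set()
    stack = [r for r in roots if r.generation == 0]
    while stack:
        o = stack.pop()
        if o in live or o.generation != 0:
            continue
        live.add(o)
        stack.extend(o.refs)
    return {o for o in pool if o.generation == 0 and o not in live}

old = Obj("old", generation=1)
safe, lost = Obj("safe"), Obj("lost")
store_with_barrier(old, safe)   # safe is promoted, so it survives
store_raw(old, lost)            # barrier skipped: the collector never hears of it
reclaimed = minor_collect({safe, lost}, roots=[old])
reclaimed_names = sorted(o.name for o in reclaimed)
```

In this sketch "lost" is reclaimed even though "old" still points to it, so 
one unconverted store is enough to leave a dangling reference.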

-- c

