GC refactor ahead
tewk at tewk.com
Sat Nov 28 05:57:53 UTC 2009
Many generational GCs allocate from mmaped "pages" of say 4k-16k in
size. mprotect and handling SIGSEGV can then be used as a write barrier
without having to modify the VM code, the write barriers are of course
Andrew Whitworth wrote:
> On Thu, Nov 26, 2009 at 2:40 PM, chromatic <chromatic at wgz.org> wrote:
>> How do you identify items that were once but now aren't live?
>> There are a couple of options.
>> 1) refcounting
>> 2) create all new GCables from young generation pools and ...
>> 2a) use pervasive write barriers to mark the pool as containing GCables
>> pointed to from an older generation, so you know they need marking
>> 2b) use pervasive write barriers to promote GCables pointed to from an older
>> generation to a generation not immediately collectable
>> 3) trace the whole live set from the roots to all reachable objects
>> #1 is out.
>> We do #3.
>> We should move to a #2 system.
> Another option, similar in idea but different in implementation would
> be to use escape analysis to allocate certain groups of short-lived
> PMCs on the stack instead of the heap. This would exclude those items
> from GC entirely, likely reducing a lot of pressure from certain
> internal-only operations.
> Of the options chromatic mentioned, I think something along the idea
> of 2b is best. PMCs which are attached to other PMCs in a
> semi-permanent way could all be treated together by the GC as a single
> "super-object". This is similar to how we currently handle PMCs and
> their associated attributes structures. In this case, the write
> barriers don't need to be pervasive throughout the system, they would
> only need to be implemented inside aggregate PMC types (any PMC which
> contains a reference to another PMC).
> Once we switch to Lorito for most things, I think we can avoid messy
> operations like stack-walking and move to a more precise (as opposed
> to a conservative) approach.
> Here are some of the high-priority things that we need to do if we
> want to improve GC performance:
> 1) Decrease the total number of PMCs created in the flow of normal
> operations (various optimizations, precise GC instead of conservative
> GC, etc)
> 2) Decrease the number of times a long-lived PMC needs to be marked
> (Generational GC and write barriers)
> 3) Decrease the number of items we have to examine during a sweep
> Once we have these improvements in place, we can talk about improving
> the efficiency of each operation by tweaking the implementation
> details (compacting GC, etc).
>> We do need write barriers, but it's going to be a pain to add them to the
>> entire system. *Every place* that stores a STRING or a PMC has to use the
>> appropriate macro or function call.
>> That includes all dynops, extension code, and custom PMCs.
> I think we can get away with a more limited approach by only demanding
> write barriers inside aggregate PMCs.
> --Andrew Whitworth
More information about the parrot-dev