GC refactor ahead

Kevin Tew tewk at tewk.com
Sat Nov 28 05:57:53 UTC 2009


Many generational GCs allocate from mmaped "pages" of say 4k-16k in 
size.  mprotect and handling SIGSEGV can then be used as a write barrier 
without having to modify the VM code,   the write barriers are of course 
per page.

Kevin


Andrew Whitworth wrote:
> On Thu, Nov 26, 2009 at 2:40 PM, chromatic <chromatic at wgz.org> wrote:
>   
>> How do you identify items that were once but now aren't live?
>>
>> There are a couple of options.
>>
>> 1) refcounting
>> 2) create all new GCables from young generation pools and ...
>> 2a) use pervasive write barriers to mark the pool as containing GCables
>> pointed to from an older generation, so you know they need marking
>> 2b) use pervasive write barriers to promote GCables pointed to from an older
>> generation to a generation not immediately collectable
>> 3) trace the whole live set from the roots to all reachable objects
>>
>> #1 is out.
>> We do #3.
>>
>> We should move to a #2 system.
>>     
>
> Another option, similar in idea but different in implementation would
> be to use escape analysis to allocate certain groups of short-lived
> PMCs on the stack instead of the heap. This would exclude those items
> from GC entirely, likely reducing a lot of pressure from certain
> internal-only operations.
>
> Of the options chromatic mentioned, I think something along the idea
> of 2b is best. PMCs which are attached to other PMCs in a
> semi-permanent way could all be treated together by the GC as a single
> "super-object". This is similar to how we currently handle PMCs and
> their associated attributes structures. In this case, the write
> barriers don't need to be pervasive throughout the system, they would
> only need to be implemented inside aggregate PMC types (any PMC which
> contains a reference to another PMC).
>
> Once we switch to Lorito for most things, I think we can avoid messy
> operations like stack-walking and move to a more precise (as opposed
> to a conservative) approach.
>
> Here are some of the high-priority things that we need to do if we
> want to improve GC performance:
>
> 1) Decrease the total number of PMCs created in the flow of normal
> operations (various optimizations, precise GC instead of conservative
> GC, etc)
> 2) Decrease the number of times a long-lived PMC needs to be marked
> (Generational GC and write barriers)
> 3) Decrease the number of items we have to examine during a sweep
>
> Once we have these improvements in place, we can talk about improving
> the efficiency of each operation by tweaking the implementation
> details (compacting GC, etc).
>
>   
>> We do need write barriers, but it's going to be a pain to add them to the
>> entire system.  *Every place* that stores a STRING or a PMC has to use the
>> appropriate macro or function call.
>>
>> That includes all dynops, extension code, and custom PMCs.
>>     
>
> I think we can get away with a more limited approach by only demanding
> write barriers inside aggregate PMCs.
>
> --Andrew Whitworth
> _______________________________________________
> http://lists.parrot.org/mailman/listinfo/parrot-dev
>   



More information about the parrot-dev mailing list